VLM2Vec & MMEB: Benchmarking multimodal embeddings and adapting state-of-the-art multimodal large language models into embedding models.
Website
https://tiger-ai-lab.github.io/VLM2Vec/
GitHub
https://github.com/TIGER-AI-Lab/VLM2Vec
List of Our Papers
Main VLM2Vec / MMEB Series
VLM2Vec / MMEB – Image embedding benchmarking and models.
VLM2Vec-V2 / MMEB-V2 – Extension of our previous work to video and visual document tasks.
Other Papers from Our Team
GAE-Retriever – Benchmark and model for trajectory modeling in GUI environments.
B3 – A novel batch mining strategy for contrastive learning.