VLM2Vec & MMEB: Benchmarking multimodal embeddings and adapting state-of-the-art multimodal large language models into embedding models.

Website: https://tiger-ai-lab.github.io/VLM2Vec/
GitHub: https://github.com/TIGER-AI-Lab/VLM2Vec

List of Our Papers

Main VLM2Vec / MMEB Series

VLM2Vec / MMEB – Benchmark and models for image embedding.
VLM2Vec-V2 / MMEB-V2 – Extension of our previous work to video and visual document tasks.

Other Papers from Our Team

GAE-Retriever – Benchmark and model for trajectory modeling in GUI environments.
B3 – A novel batch mining strategy for contrastive learning.