Bootstrap
My HomePage

Selected Publication


MixEval-X: Any-to-Any Evaluations from Real-World Data Mixtures

Jinjie Ni, Yifan Song, Deepanway Ghosal, Bo Li, David Junhao Zhang, Xiang Yue, Fuzhao Xue, Zian Zheng, Kaichen Zhang, Mahir Shah, Kabir Jain, Yang You, Michael Shieh

project page / paper / code / Demo

ArXiv:2410.13754


LLaVA-OneVision: Easy Visual Task Transfer

Bo Li, Yuanhan Zhang, Dong Guo, Renrui Zhang, Feng Li, Hao Zhang, Kaichen Zhang, Yanwei Li, Ziwei Liu, Chunyuan Li

project page / paper / code / Demo

ArXiv:2408.03326


Long Context Transfer from Language to Vision

Peiyuan Zhang*, Kaichen Zhang*, Bo Li*, Guangtao Zeng, Jingkang Yang, Yuanhan Zhang, Ziyue Wang, Haoran Tan, Chunyuan Li, Ziwei Liu

project page / paper / code / Demo

ArXiv:2406.16852


LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models

Kaichen Zhang*, Bo Li*, Peiyuan Zhang*, Fanyi Pu*, Joshua Adrian Cahyono, Kairui Hu, Shuai Liu, Yuanhan Zhang, Jingkang Yang, Chunyuan Li, Ziwei Liu

project page / paper / code /

ArXiv:2407.12772


Llava-next: Stronger llms supercharge multimodal capabilities in the wild

Bo Li, Kaichen Zhang, Hao Zhang, Dong Guo, Renrui Zhang, Feng Li, Yuanhan Zhang, Ziwei Liu, Chunyuan Li

project page / code /

Technical Blog


WorldQA: Multimodal World Knowledge in Videos through Long-Chain Reasoning

Yuanhan Zhang, Kaichen Zhang, Bo Li, Fanyi Pu, Christopher Arif Setiadharma, Jingkang Yang, Ziwei Liu

project page / paper / code /

ArXiv:2405.03272