Selected Publication
MixEval-X: Any-to-Any Evaluations from Real-World Data Mixtures
Jinjie Ni, Yifan Song, Deepanway Ghosal, Bo Li, David Junhao Zhang, Xiang Yue, Fuzhao Xue, Zian Zheng, Kaichen Zhang, Mahir Shah, Kabir Jain, Yang You, Michael Shieh
project page / paper / code / Demo
ArXiv:2410.13754
LLaVA-OneVision: Easy Visual Task Transfer
Bo Li, Yuanhan Zhang, Dong Guo, Renrui Zhang, Feng Li, Hao Zhang, Kaichen Zhang, Yanwei Li, Ziwei Liu, Chunyuan Li
project page / paper / code / Demo
ArXiv:2408.03326
Long Context Transfer from Language to Vision
Peiyuan Zhang*, Kaichen Zhang*, Bo Li*, Guangtao Zeng, Jingkang Yang, Yuanhan Zhang, Ziyue Wang, Haoran Tan, Chunyuan Li, Ziwei Liu
project page / paper / code / Demo
ArXiv:2406.16852
LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models
Kaichen Zhang*, Bo Li*, Peiyuan Zhang*, Fanyi Pu*, Joshua Adrian Cahyono, Kairui Hu, Shuai Liu, Yuanhan Zhang, Jingkang Yang, Chunyuan Li, Ziwei Liu
project page / paper / code /
ArXiv:2407.12772
Llava-next: Stronger llms supercharge multimodal capabilities in the wild
Bo Li, Kaichen Zhang, Hao Zhang, Dong Guo, Renrui Zhang, Feng Li, Yuanhan Zhang, Ziwei Liu, Chunyuan Li
project page / code /
Technical Blog
WorldQA: Multimodal World Knowledge in Videos through Long-Chain Reasoning
Yuanhan Zhang, Kaichen Zhang, Bo Li, Fanyi Pu, Christopher Arif Setiadharma, Jingkang Yang, Ziwei Liu
project page / paper / code /
ArXiv:2405.03272