Selected Publications
OpenMMReasoner: Pushing the Frontiers for Multimodal Reasoning with an Open and General Recipe
Kaichen Zhang*, Keming Wu*, Zuhao Yang*, Kairui Hu, Bin Wang, Ziwei Liu, Xingxuan Li, Lidong Bing
paper / code / Collections
ArXiv:2511.16334
LongVT: Incentivizing "Thinking with Long Videos" via Native Tool Calling
Zuhao Yang*, Sudong Wang*, Kaichen Zhang*, Keming Wu, Sicong Leng, Yifan Zhang, Chengwei Qin, Shijian Lu, Xingxuan Li, Lidong Bing
paper / code / Collections
ArXiv:2511.20785
Large Multi-modal Models Can Interpret Features in Large Multi-modal Models
Kaichen Zhang, Yifei Shen, Bo Li, Ziwei Liu
paper / code / Collections
ArXiv:2411.14982, International Conference on Computer Vision 2025
MixEval-X: Any-to-Any Evaluations from Real-World Data Mixtures
Jinjie Ni, Yifan Song, Deepanway Ghosal, Bo Li, David Junhao Zhang, Xiang Yue, Fuzhao Xue, Zian Zheng, Kaichen Zhang, Mahir Shah, Kabir Jain, Yang You, Michael Shieh
project page / paper / code / Demo
ArXiv:2410.13754, International Conference on Learning Representations 2025
LLaVA-OneVision: Easy Visual Task Transfer
Bo Li, Yuanhan Zhang, Dong Guo, Renrui Zhang, Feng Li, Hao Zhang, Kaichen Zhang, Yanwei Li, Ziwei Liu, Chunyuan Li
project page / paper / code / Demo
ArXiv:2408.03326, Transactions on Machine Learning Research
Long Context Transfer from Language to Vision
Peiyuan Zhang*, Kaichen Zhang*, Bo Li*, Guangtao Zeng, Jingkang Yang, Yuanhan Zhang, Ziyue Wang, Haoran Tan, Chunyuan Li, Ziwei Liu
project page / paper / code / Demo
ArXiv:2406.16852, Transactions on Machine Learning Research
LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models
Kaichen Zhang*, Bo Li*, Peiyuan Zhang*, Fanyi Pu*, Joshua Adrian Cahyono, Kairui Hu, Shuai Liu, Yuanhan Zhang, Jingkang Yang, Chunyuan Li, Ziwei Liu
project page / paper / code
ArXiv:2407.12772, Findings of the North American Chapter of the Association for Computational Linguistics 2025
LLaVA-NeXT: Stronger LLMs Supercharge Multimodal Capabilities in the Wild
Bo Li, Kaichen Zhang, Hao Zhang, Dong Guo, Renrui Zhang, Feng Li, Yuanhan Zhang, Ziwei Liu, Chunyuan Li
project page / code
Technical Blog
WorldQA: Multimodal World Knowledge in Videos through Long-Chain Reasoning
Yuanhan Zhang, Kaichen Zhang, Bo Li, Fanyi Pu, Christopher Arif Setiadharma, Jingkang Yang, Ziwei Liu
project page / paper / code
ArXiv:2405.03272