OpenAlex Citation Counts

OpenAlex is an open-access bibliographic catalogue of scientific papers, authors, and institutions, named after the ancient Library of Alexandria. Its citation coverage is excellent, and I hope you will find this listing of citing articles useful!

If you click an article title, you'll navigate to the article as listed in CrossRef. If you click an Open Access link, you'll navigate to the work's "best Open Access location". Clicking a citation count will open this listing for that article. Lastly, at the bottom of the page you'll find basic pagination options.
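For readers who want the raw data behind a page like this, the listing can be reproduced from the public OpenAlex API: filtering works by `cites:<work-id>` returns the citing articles, and each work record carries a `cited_by_count` and a `best_oa_location`. Below is a minimal Python sketch; the work ID is a placeholder (look up the real record via the API's search), and `citing_articles` is an illustrative helper, not part of any library.

```python
import requests

# Placeholder OpenAlex work ID -- substitute the real one, e.g. found via
# https://api.openalex.org/works?search=MMMU
WORK_ID = "W4389999999"

def citing_articles(work_id: str, page: int = 1, per_page: int = 25):
    """Fetch one page of works that cite `work_id` from the OpenAlex API."""
    resp = requests.get(
        "https://api.openalex.org/works",
        params={"filter": f"cites:{work_id}", "page": page, "per-page": per_page},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()

data = citing_articles(WORK_ID)
print(f"Showing 1-{len(data['results'])} of {data['meta']['count']} citing articles")
for work in data["results"]:
    oa = work.get("best_oa_location") or {}
    print(work["display_name"])
    print("  DOI:", work.get("doi"))                 # CrossRef-resolvable link
    print("  Best OA:", oa.get("landing_page_url"))  # "best Open Access location"
    print("  Times Cited:", work.get("cited_by_count"))
```

The per-page value of 25 matches the "Showing 1-25" slice in the listing below.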

Requested Article:

MMMU: A Massive Multi-Discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI
Xiang Yue, Yuansheng Ni, Tianyu Zheng, et al.
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024) Vol. 32, pp. 9556-9567
Closed Access | Times Cited: 31

Showing 1-25 of 31 citing articles:

Generative Multimodal Models are In-Context Learners
Quan Sun, Yufeng Cui, Xiaosong Zhang, et al.
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024) Vol. 35, pp. 14398-14409
Closed Access | Times Cited: 13

LVLM-EHub: A Comprehensive Evaluation Benchmark for Large Vision-Language Models
Peng Xu, Wenqi Shao, Kaipeng Zhang, et al.
IEEE Transactions on Pattern Analysis and Machine Intelligence (2024) Vol. 47, Iss. 3, pp. 1877-1893
Open Access | Times Cited: 12

Honeybee: Locality-Enhanced Projector for Multimodal LLM
Junbum Cha, Wooyoung Kang, Jonghwan Mun, et al.
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024) Vol. 3361, pp. 13817-13827
Closed Access | Times Cited: 10

Large language models for life cycle assessments: Opportunities, challenges, and risks
Nathan Preuss, Abdulelah S. Alshehri, Fengqi You
Journal of Cleaner Production (2024) Vol. 466, pp. 142824-142824
Closed Access | Times Cited: 8

MM1: Methods, Analysis and Insights from Multimodal LLM Pre-training
Brandon McKinzie, Zhe Gan, Jean-Philippe Fauconnier, et al.
Lecture notes in computer science (2024), pp. 304-323
Closed Access | Times Cited: 6

Are Vision-Language Models Truly Understanding Multi-vision Sensor?
Sangyun Chung, Youngjoon Yu, Youngchae Chee, et al.
(2025)
Open Access

Multimodal generative AI for medical image interpretation
Vishwanatha M. Rao, Michael Hla, Michael Moor, et al.
Nature (2025) Vol. 639, Iss. 8056, pp. 888-896
Closed Access

An Approach to Complex Visual Data Interpretation with Vision-Language Models
Thanh–Son Nguyen, Viet-Tham Huynh, Van-Loc Nguyen, et al.
Lecture notes in computer science (2025), pp. 338-354
Closed Access

SensorBench: Benchmarking LLMs in Coding-Based Sensor Processing
Pengrui Quan, Xiaomin Ouyang, Jeya Vikranth Jeyakumar, et al.
(2025), pp. 25-30
Closed Access

Foundation models for materials discovery – current state and future directions
Edward O. Pyzer‐Knapp, Matteo Manica, Peter Staar, et al.
npj Computational Materials (2025) Vol. 11, Iss. 1
Open Access

Disambiguating Ambiguous Questions using Eye-Gaze in Visual Question Answering
Shun Inadumi, Seiya Kawano, Akishige Yuguchi, et al.
Journal of Natural Language Processing (2025) Vol. 32, Iss. 1, pp. 3-35
Open Access

An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models
Liang Chen, Haozhe Zhao, Tianyu Liu, et al.
Lecture notes in computer science (2024), pp. 19-35
Closed Access | Times Cited: 3

ImageNet-D: Benchmarking Neural Network Robustness on Diffusion Synthetic Object
Chenshuang Zhang, Fei Pan, Junmo Kim, et al.
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024) Vol. 36, pp. 21752-21762
Closed Access | Times Cited: 2

Vision Language Models are blind
Pooyan Rahmanzadehgervi, Logan Bolton, Mohammad Reza Taesiri, et al.
Lecture notes in computer science (2024), pp. 293-309
Closed Access | Times Cited: 2

Dynamic Intelligence Assessment: Benchmarking LLMs on the Road to AGI with a Focus on Model Confidence
Norbert Tihanyi, Tamás Bisztray, Richard A. Dubniczky, et al.
IEEE International Conference on Big Data (Big Data) (2024), pp. 3313-3321
Open Access | Times Cited: 2

New frontiers in AI for biodiversity research and conservation with multimodal language models
Zhongqi Miao, Yuanhan Zhang, Zalan Fabian, et al.
(2024)
Open Access | Times Cited: 1

What Makes Multimodal In-Context Learning Work?
Folco Bertini Baldassini, Mustafa Shukor, Matthieu Cord, et al.
IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (2024), pp. 1539-1550
Open Access | Times Cited: 1

Dissecting Dissonance: Benchmarking Large Multimodal Models Against Self-Contradictory Instructions
Jin Gao, Lei Gan, Yuankai Li, et al.
Lecture notes in computer science (2024), pp. 404-420
Closed Access | Times Cited: 1

How Many Unicorns Are in This Image? A Safety Evaluation Benchmark for Vision LLMs
Haoqin Tu, Chenhang Cui, Zijun Wang, et al.
Lecture notes in computer science (2024), pp. 37-55
Closed Access | Times Cited: 1

A Benchmark and Chain-of-Thought Prompting Strategy for Large Multimodal Models with Multiple Image Inputs
Daoan Zhang, Junming Yang, Hanjia Lyu, et al.
Lecture notes in computer science (2024), pp. 226-241
Closed Access | Times Cited: 1

Retrospective Analysis of Google Gemini
Nagendra Singh Yadav, Vishal Goar, Pallavi Singh Yadav, et al.
Lecture notes in networks and systems (2024), pp. 555-580
Closed Access

Attention Prompting on Image for Large Vision-Language Models
Runpeng Yu, Weihao Yu, Xinchao Wang
Lecture notes in computer science (2024), pp. 251-268
Closed Access

PathMMU: A Massive Multimodal Expert-Level Benchmark for Understanding and Reasoning in Pathology
Yuxuan Sun, H. Wu, Chenglu Zhu, et al.
Lecture notes in computer science (2024), pp. 56-73
Closed Access

Page 1 - Next Page
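
Under the same assumptions as the sketch above, the "Next Page" control corresponds to incrementing the API's page parameter:

```python
# Fetch articles 26-31 of 31 (the second and final page at 25 results per page).
page2 = citing_articles(WORK_ID, page=2)
for work in page2["results"]:
    print(work["display_name"])
```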
