
OpenAlex is an open-access bibliographic catalogue of scientific papers, authors, and institutions, named after the Library of Alexandria. Its citation coverage is excellent, and I hope you will find this listing of citing articles useful!
If you click an article title, you'll navigate to that article as listed in CrossRef. If you click an Open Access link, you'll navigate to the work's "best Open Access location". Clicking a citation count opens this same listing for that article. Lastly, at the bottom of the page you'll find basic pagination options.
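Incidentally, you can reproduce this listing programmatically: OpenAlex exposes a public REST API, and filtering works with `cites:<work-id>` returns the articles that cite a given work. Below is a minimal Python sketch assuming the `requests` library; the work ID and contact e-mail are placeholders, not the real values for this article.

```python
import requests

# Placeholder OpenAlex work ID for the requested article; look up the real ID
# first (e.g. https://api.openalex.org/works?search=MultiInstruct).
WORK_ID = "W0000000000"

resp = requests.get(
    "https://api.openalex.org/works",
    params={
        "filter": f"cites:{WORK_ID}",  # works that cite the requested article
        "per-page": 25,                # matches the 25-per-page listing here
        "page": 1,
        "mailto": "you@example.com",   # illustrative contact for the polite pool
    },
    timeout=30,
)
resp.raise_for_status()

# Print each citing work in roughly the same format as this page.
for work in resp.json()["results"]:
    authors = ", ".join(a["author"]["display_name"] for a in work["authorships"][:3])
    access = "Open Access" if work["open_access"]["is_oa"] else "Closed Access"
    print(f'{work["display_name"]} ({work["publication_year"]})')
    print(f'  {authors} | {access} | Times Cited: {work["cited_by_count"]}')
```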
Requested Article:
MultiInstruct: Improving Multi-Modal Zero-Shot Learning via Instruction Tuning
Zhiyang Xu, Ying Shen, Lifu Huang
(2023)
Open Access | Times Cited: 28
Showing 1-25 of 28 citing articles:
A Survey on Multimodal Large Language Models
Shukang Yin, Chaoyou Fu, Sirui Zhao, et al.
National Science Review (2024) Vol. 11, Iss. 12
Open Access | Times Cited: 61
BLIVA: A Simple Multimodal LLM for Better Handling of Text-Rich Visual Questions
Wenbo Hu, Yifan Xu, Yi Li, et al.
Proceedings of the AAAI Conference on Artificial Intelligence (2024) Vol. 38, Iss. 3, pp. 2256-2264
Open Access | Times Cited: 21
TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding
Shuhuai Ren, Linli Yao, Shicheng Li, et al.
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024), pp. 14313-14323
Closed Access | Times Cited: 11
Robust Visual Question Answering: Datasets, Methods, and Future Challenges
Jie Ma, Pinghui Wang, Dechen Kong, et al.
IEEE Transactions on Pattern Analysis and Machine Intelligence (2024) Vol. 46, Iss. 8, pp. 5575-5594
Open Access | Times Cited: 5
Latent Challenges of Multimodal Deep Learning Models: Taxonomy and Survey
Sachin Kumar, Olga Ivanova, Olga Vorfolomeeva, et al.
Lecture Notes in Networks and Systems (2025), pp. 39-53
Closed Access
GenCeption: Evaluate vision LLMs with unlabeled unimodal data
Lele Cao, Valentin Buchner, Zineb Senane, et al.
Computer Speech & Language (2025) Vol. 93, Article 101785
Closed Access
Can Pre-trained Vision and Language Models Answer Visual Information-Seeking Questions?
Yang Chen, Hexiang Hu, Yi Luan, et al.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (2023)
Open Access | Times Cited: 12
Mitigating Fine-Grained Hallucination by Fine-Tuning Large Vision-Language Models with Caption Rewrites
Lei Wang, Jiabang He, Shenshen Li, et al.
Lecture Notes in Computer Science (2024), pp. 32-45
Closed Access | Times Cited: 4
Exploring the Frontier of Vision-Language Models: A Survey of Current Methodologies and Future Directions
Akash Ghosh, A. Seetharama Acharya, Sriparna Saha, et al.
(2024)
Open Access | Times Cited: 4
DRESS: Instructing Large Vision-Language Models to Align and Interact with Humans via Natural Language Feedback
Yangyi Chen, Karan Sikka, Michael Cogswell, et al.
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024), pp. 14239-14250
Closed Access | Times Cited: 4
Artificial general intelligence for radiation oncology
Chenbin Liu, Zhengliang Liu, Jason Holmes, et al.
Meta-Radiology (2023) Vol. 1, Iss. 3, Article 100045
Open Access | Times Cited: 11
Large Language Model Instruction Following: A Survey of Progresses and Challenges
Renze Lou, Kai Zhang, Wenpeng Yin
Computational Linguistics (2024), pp. 1-43
Open Access | Times Cited: 3
ShapeLLM: Universal 3D Object Understanding for Embodied Interaction
Zekun Qi, Runpei Dong, Shaochen Zhang, et al.
Lecture Notes in Computer Science (2024), pp. 214-238
Closed Access | Times Cited: 3
The Art of SOCRATIC QUESTIONING: Recursive Thinking with Large Language Models
Jingyuan Qi, Zhiyang Xu, Ying Shen, et al.
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (2023)
Open Access | Times Cited: 4
Real-GPT: Efficiently Tailoring LLMs for Informed Decision-Making in the Real Estate Industry
Benedikt Gloria, Johannes Melsbach, Sven Bienert, et al.
Journal of Real Estate Portfolio Management (2024), pp. 1-17
Closed Access | Times Cited: 1
A Survey on Stability of Learning with Limited Labelled Data and its Sensitivity to the Effects of Randomness
Branislav Pecher, Ivan Srba, Mária Bieliková
ACM Computing Surveys (2024) Vol. 57, Iss. 1, pp. 1-40
Open Access | Times Cited: 1
Automatic Estimation for Visual Quality Changes of Street Space via Street-View Images and Multimodal Large Language Models
Hao Liang, Jiaxin Zhang, Yunqin Li, et al.
(2023)
Open Access | Times Cited: 2
UniIR: Training and Benchmarking Universal Multimodal Information Retrievers
Cong Wei, Yang Chen, Haonan Chen, et al.
Lecture Notes in Computer Science (2024), pp. 387-404
Closed Access
A Proposal for a Language Model Based Cognitive Architecture
K. Knowles, Michael Witbrock, Gillian Dobbie, et al.
Proceedings of the AAAI Symposium Series (2024) Vol. 2, Iss. 1, pp. 295-301
Open Access
MULTISCRIPT: Multimodal Script Learning for Supporting Open Domain Everyday Tasks
Jingyuan Qi, Min‐Qian Liu, Ying Shen, et al.
Proceedings of the AAAI Conference on Artificial Intelligence (2024) Vol. 38, Iss. 17, pp. 18888-18896
Open Access
Automatic Estimation for Visual Quality Changes of Street Space via Street-View Images and Multimodal Large Language Models
Hao Liang, Jiaxin Zhang, Yunqin Li, et al.
IEEE Access (2024) Vol. 12, pp. 87713-87727
Open Access
DIEM: Decomposition-Integration Enhancing Multimodal Insights
Xinyi Jiang, Guoming Wang, Junhao Guo, et al.
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024), pp. 27294-27303
Closed Access
EmoVIT: Revolutionizing Emotion Insights with Visual Instruction Tuning
Hongxia Xie, Chu-Jun Peng, Yu‐Wen Tseng, et al.
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024), pp. 26586-26595
Closed Access
What Makes Multimodal In-Context Learning Work?
Folco Bertini Baldassini, Mustafa Shukor, Matthieu Cord, et al.
IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (2024), pp. 1539-1550
Open Access
UniCode: Learning a Unified Codebook for Multimodal Large Language Models
Sipeng Zheng, Bohan Zhou, Yicheng Feng, et al.
Lecture Notes in Computer Science (2024), pp. 426-443
Closed Access