OpenAlex Citation Counts

OpenAlex is an open-access bibliographic catalogue of scientific papers, authors and institutions, named after the Library of Alexandria. Its citation coverage is excellent, and I hope you will find this listing of citing articles useful!

If you click an article title, you'll navigate to the article as listed in CrossRef. If you click an Open Access link, you'll navigate to the "best Open Access location". Clicking a citation count will open this same listing for that article. Lastly, at the bottom of the page, you'll find basic pagination options.
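For readers who would rather query the catalogue directly, here is a minimal sketch of how a listing like this one could be requested from the public OpenAlex API. It only builds the request URL and does the page arithmetic; it makes no network call. The helper names and the work ID "W0000000000" are hypothetical placeholders, and it is an assumption that this page uses the same filter, sort order and page size.

```python
from math import ceil
from urllib.parse import urlencode

API_ROOT = "https://api.openalex.org/works"


def citing_works_url(work_id: str, page: int = 1, per_page: int = 25) -> str:
    """Build a URL for one page of works citing `work_id`, most cited first."""
    params = {
        "filter": f"cites:{work_id}",    # works whose references include work_id
        "sort": "cited_by_count:desc",   # assumed to match the order shown here
        "per-page": per_page,
        "page": page,
    }
    # Keep ":" unescaped so the URL stays readable.
    return f"{API_ROOT}?{urlencode(params, safe=':')}"


def page_count(total: int, per_page: int = 25) -> int:
    """Number of pages needed to show `total` citing articles."""
    return ceil(total / per_page)


def page_range(page: int, total: int, per_page: int = 25) -> tuple[int, int]:
    """First and last item numbers shown on `page` (1-based)."""
    first = (page - 1) * per_page + 1
    last = min(page * per_page, total)
    return first, last


if __name__ == "__main__":
    # "W0000000000" is a placeholder, not the real ID of any article.
    print(citing_works_url("W0000000000"))
    print(page_count(300), page_range(1, 300))
```

In each result returned by the API, the `cited_by_count` field corresponds to "Times Cited", and `best_oa_location` to the "best Open Access location" mentioned above.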

Requested Article:

SimVLM: Simple Visual Language Model Pretraining with Weak Supervision
Zirui Wang, Jiahui Yu, Adams Wei Yu, et al.
arXiv (Cornell University) (2021)
Open Access | Times Cited: 300

Showing 1-25 of 300 citing articles:

InstructPix2Pix: Learning to Follow Image Editing Instructions
Tim Brooks, Aleksander Holynski, Alexei A. Efros
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023), pp. 18392-18402
Open Access | Times Cited: 534

Multimodal Learning With Transformers: A Survey
Peng Xu, Xiatian Zhu, David A. Clifton
IEEE Transactions on Pattern Analysis and Machine Intelligence (2023) Vol. 45, Iss. 10, pp. 12113-12132
Open Access | Times Cited: 338

Florence: A New Foundation Model for Computer Vision
Lu Yuan, Dongdong Chen, Yi‐Ling Chen, et al.
arXiv (Cornell University) (2021)
Open Access | Times Cited: 316

FLAVA: A Foundational Language And Vision Alignment Model
Amanpreet Singh, Ronghang Hu, Vedanuj Goswami, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2022), pp. 15617-15629
Open Access | Times Cited: 295

Image as a Foreign Language: BEIT Pretraining for Vision and Vision-Language Tasks
Wenhui Wang, Hangbo Bao, Dong Li, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023), pp. 19175-19186
Closed Access | Times Cited: 256

An Empirical Study of Training End-to-End Vision-and-Language Transformers
Zi-Yi Dou, Yichong Xu, Zhe Gan, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2022)
Open Access | Times Cited: 218

From Show to Tell: A Survey on Deep Learning-Based Image Captioning
Matteo Stefanini, Marcella Cornia, Lorenzo Baraldi, et al.
IEEE Transactions on Pattern Analysis and Machine Intelligence (2022) Vol. 45, Iss. 1, pp. 539-559
Open Access | Times Cited: 213

CRIS: CLIP-Driven Referring Image Segmentation
Zhaoqing Wang, Yu Lu, Qiang Li, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2022), pp. 11676-11685
Open Access | Times Cited: 190

VLMo: Unified Vision-Language Pre-Training with Mixture-of-Modality-Experts
Wenhui Wang, Hangbo Bao, Dong Li, et al.
arXiv (Cornell University) (2021)
Open Access | Times Cited: 174

FILIP: Fine-grained Interactive Language-Image Pre-Training
Lewei Yao, Runhui Huang, Lu Hou, et al.
arXiv (Cornell University) (2021)
Open Access | Times Cited: 160

Scaling Up Vision-Language Pretraining for Image Captioning
Xiaowei Hu, Zhe Gan, Jianfeng Wang, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2022)
Open Access | Times Cited: 146

Scaling Language-Image Pre-Training via Masking
Yanghao Li, Haoqi Fan, Ronghang Hu, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023)
Open Access | Times Cited: 125

Unified Contrastive Learning in Image-Text-Label Space
Jianwei Yang, Chunyuan Li, Pengchuan Zhang, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2022), pp. 19141-19151
Open Access | Times Cited: 124

Large AI Models in Health Informatics: Applications, Challenges, and the Future
Jianing Qiu, Lin Li, Jiankai Sun, et al.
IEEE Journal of Biomedical and Health Informatics (2023) Vol. 27, Iss. 12, pp. 6074-6087
Open Access | Times Cited: 113

MERLOT RESERVE: Neural Script Knowledge through Vision and Language and Sound
Rowan Zellers, Jiasen Lu, Ximing Lu, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2022), pp. 16354-16366
Open Access | Times Cited: 98

Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning
Antoine Yang, Arsha Nagrani, Paul Hongsuck Seo, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023)
Open Access | Times Cited: 97

Generalized Decoding for Pixel, Image, and Language
Xueyan Zou, Zi-Yi Dou, Jianwei Yang, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023), pp. 15116-15127
Open Access | Times Cited: 95

mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections
Chenliang Li, Haiyang Xu, Junfeng Tian, et al.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (2022)
Open Access | Times Cited: 93

Injecting Semantic Concepts into End-to-End Image Captioning
Zhiyuan Fang, Jianfeng Wang, Xiaowei Hu, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2022), pp. 17988-17998
Open Access | Times Cited: 84

A Survey on Multimodal Large Language Models for Autonomous Driving
Can Cui, Yunsheng Ma, Xu Cao, et al.
(2024), pp. 958-979
Open Access | Times Cited: 81

CLIP-Event: Connecting Text and Images with Event Structures
Manling Li, Ruochen Xu, Shuohang Wang, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2022), pp. 16399-16408
Open Access | Times Cited: 80

A Survey of Vision-Language Pre-Trained Models
Yifan Du, Zikang Liu, Junyi Li, et al.
Proceedings of the Thirty-First International Joint Conference on Artificial Intelligence (2022), pp. 5436-5443
Open Access | Times Cited: 73

CLIP-ReID: Exploiting Vision-Language Model for Image Re-identification without Concrete Text Labels
Siyuan Li, Sun Li, Qingli Li
Proceedings of the AAAI Conference on Artificial Intelligence (2023) Vol. 37, Iss. 1, pp. 1405-1413
Open Access | Times Cited: 68

Visual-Language Prompt Tuning with Knowledge-Guided Context Optimization
Hantao Yao, Rui Zhang, Changsheng Xu
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023)
Open Access | Times Cited: 64

From Images to Textual Prompts: Zero-shot Visual Question Answering with Frozen Large Language Models
Jiaxian Guo, Junnan Li, Dongxu Li, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023), pp. 10867-10877
Closed Access | Times Cited: 59

Page 1 - Next Page
