OpenAlex Citation Counts

OpenAlex is an open-access bibliographic catalogue of scientific papers, authors and institutions, named after the Library of Alexandria. Its citation coverage is excellent, and I hope you will find this listing of citing articles useful!

Clicking an article title takes you to the article as listed in CrossRef. Clicking an Open Access link takes you to the "best Open Access location". Clicking a citation count opens this same listing for that article. Lastly, at the bottom of the page, you'll find basic pagination options.

Requested Article:

Scaling Up Vision-Language Pretraining for Image Captioning
Xiaowei Hu, Zhe Gan, Jianfeng Wang, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2022)
Open Access | Times Cited: 146

Showing 1-25 of 146 citing articles:

Multimodal Learning With Transformers: A Survey
Peng Xu, Xiatian Zhu, David A. Clifton
IEEE Transactions on Pattern Analysis and Machine Intelligence (2023) Vol. 45, Iss. 10, pp. 12113-12132
Open Access | Times Cited: 338

From Show to Tell: A Survey on Deep Learning-Based Image Captioning
Matteo Stefanini, Marcella Cornia, Lorenzo Baraldi, et al.
IEEE Transactions on Pattern Analysis and Machine Intelligence (2022) Vol. 45, Iss. 1, pp. 539-559
Open Access | Times Cited: 213

Reproducible Scaling Laws for Contrastive Language-Image Learning
Mehdi Cherti, Romain Beaumont, Ross Wightman, et al.
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023), pp. 2818-2829
Closed Access | Times Cited: 198

A comprehensive survey on applications of transformers for deep learning tasks
Saidul Islam, Hanae Elmekki, Ahmed Elsebai, et al.
Expert Systems with Applications (2023) Vol. 241, pp. 122666-122666
Open Access | Times Cited: 106

Vid2Seq: Large-Scale Pretraining of a Visual Language Model for Dense Video Captioning
Antoine Yang, Arsha Nagrani, Paul Hongsuck Seo, et al.
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023)
Open Access | Times Cited: 97

mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections
Chenliang Li, Haiyang Xu, Junfeng Tian, et al.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (2022)
Open Access | Times Cited: 93

Sigmoid Loss for Language Image Pre-Training
Xiaohua Zhai, Basil Mustafa, Alexander Kolesnikov, et al.
2023 IEEE/CVF International Conference on Computer Vision (ICCV) (2023)
Open Access | Times Cited: 76

Deep Learning Approaches on Image Captioning: A Review
Taraneh Ghandi, Hamid Reza Pourreza, Hamidreza Mahyar
ACM Computing Surveys (2023) Vol. 56, Iss. 3, pp. 1-39
Open Access | Times Cited: 67

Learning Video Representations from Large Language Models
Yue Zhao, Ishan Misra, Philipp Krähenbühl, et al.
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023)
Open Access | Times Cited: 64

Smallcap: Lightweight Image Captioning Prompted with Retrieval Augmentation
Rita Ramos, Bruno Martins, Desmond Elliott, et al.
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023)
Open Access | Times Cited: 47

GLaMM: Pixel Grounding Large Multimodal Model
Hanoona Rasheed, Muhammad Maaz, Sahal Shaji, et al.
2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024), pp. 13009-13018
Closed Access | Times Cited: 24

FuseCap: Leveraging Large Language Models for Enriched Fused Image Captions
Noam Rotstein, David Bensaïd, Shaked Brody, et al.
2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (2024), pp. 5677-5688
Open Access | Times Cited: 15

The Unreasonable Effectiveness of CLIP Features for Image Captioning: An Experimental Analysis
Manuele Barraco, Marcella Cornia, Silvia Cascianelli, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (2022)
Open Access | Times Cited: 54

Translation between Molecules and Natural Language
Carl K. Edwards, Tuan Lai, Kevin Ros, et al.
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing (2022)
Open Access | Times Cited: 54

UniTAB: Unifying Text and Box Outputs for Grounded Vision-Language Modeling
Zhengyuan Yang, Zhe Gan, Jianfeng Wang, et al.
Lecture notes in computer science (2022), pp. 521-539
Closed Access | Times Cited: 53

Text-Only Training for Image Captioning using Noise-Injected CLIP
David Nukrai, Ron Mokady, Amir Globerson
(2022)
Open Access | Times Cited: 42

Enabling Multimodal Generation on CLIP via Vision-Language Knowledge Distillation
Wenliang Dai, Lu Hou, Lifeng Shang, et al.
Findings of the Association for Computational Linguistics: ACL 2022 (2022)
Open Access | Times Cited: 40

VindLU: A Recipe for Effective Video-and-Language Pretraining
Feng Cheng, Xizi Wang, Jie Lei, et al.
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023)
Open Access | Times Cited: 38

Deep image captioning: A review of methods, trends and future challenges
Liming Xu, Quan Tang, Jiancheng Lv, et al.
Neurocomputing (2023) Vol. 546, pp. 126287-126287
Closed Access | Times Cited: 29

ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing
Zequn Zeng, Hao Zhang, Ruiying Lu, et al.
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023), pp. 23465-23476
Open Access | Times Cited: 26

MMNet: Multi-Collaboration and Multi-Supervision Network for Sequential Deepfake Detection
Ruiyang Xia, Decheng Liu, Jie Li, et al.
IEEE Transactions on Information Forensics and Security (2024) Vol. 19, pp. 3409-3422
Open Access | Times Cited: 8

Universal and extensible language-vision models for organ segmentation and tumor detection from abdominal computed tomography
Jie Liu, Yixiao Zhang, Kang Wang, et al.
Medical Image Analysis (2024) Vol. 97, pp. 103226-103226
Open Access | Times Cited: 8

ViTs as backbones: Leveraging vision transformers for feature extraction
Omar Elharrouss, Yassine Himeur, Yasir Mahmood, et al.
Information Fusion (2025), pp. 102951-102951
Closed Access | Times Cited: 1

Controllable Image Captioning via Prompting
Ning Wang, Jiahao Xie, Jihao Wu, et al.
Proceedings of the AAAI Conference on Artificial Intelligence (2023) Vol. 37, Iss. 2, pp. 2617-2625
Open Access | Times Cited: 21

Context-aware Alignment and Mutual Masking for 3D-Language Pre-training
Jin Zhao, Munawar Hayat, Yuwei Yang, et al.
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023), pp. 10984-10994
Closed Access | Times Cited: 21

Page 1 - Next Page
