OpenAlex Citation Counts

OpenAlex is an open-access bibliographic catalogue of scientific papers, authors, and institutions, named after the ancient Library of Alexandria. Its citation coverage is excellent, and I hope you will find this listing of citing articles useful!

If you click an article title, you'll navigate to the article as listed in CrossRef. If you click an Open Access link, you'll navigate to its "best Open Access location". Clicking a citation count opens this same listing for that article. Lastly, at the bottom of the page you'll find basic pagination options.
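If you prefer to pull this listing programmatically, the same data is available from the OpenAlex API. The sketch below is a minimal Python example: the work ID is a placeholder to be replaced with the OpenAlex ID of the requested article, while the works endpoint, the cites filter, the sort option, and the paging parameters are standard OpenAlex query features.

    import requests

    WORK_ID = "W0000000000"  # placeholder: substitute the OpenAlex ID of the requested article

    resp = requests.get(
        "https://api.openalex.org/works",
        params={
            "filter": f"cites:{WORK_ID}",   # works whose reference lists include WORK_ID
            "sort": "cited_by_count:desc",  # most-cited citing articles first
            "per-page": 25,                 # mirrors the 25-per-page listing shown here
            "page": 1,
        },
        timeout=30,
    )
    resp.raise_for_status()

    for work in resp.json()["results"]:
        title = work.get("display_name")
        year = work.get("publication_year")
        cited = work.get("cited_by_count")
        is_oa = work.get("open_access", {}).get("is_oa")
        print(f"{title} ({year}) | {'Open' if is_oa else 'Closed'} Access | Times Cited: {cited}")

Changing the page parameter (or switching to cursor pagination) walks through the remaining citing articles in the same way as the pagination links at the bottom of this page.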

Requested Article:

Text-Only Training for Image Captioning using Noise-Injected CLIP
David Nukrai, Ron Mokady, Amir Globerson
(2022)
Open Access | Times Cited: 42

Showing 1-25 of 42 citing articles:

Transferable Decoding with Visual Entities for Zero-Shot Image Captioning
Junjie Fei, Teng Wang, Jinrui Zhang, et al.
IEEE/CVF International Conference on Computer Vision (ICCV) (2023), pp. 3113-3123
Open Access | Times Cited: 19

AutoAD: Movie Description in Context
Tengda Han, Max Bain, Arsha Nagrani, et al.
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023), pp. 18930-18940
Open Access | Times Cited: 16

MeaCap: Memory-Augmented Zero-shot Image Captioning
Zequn Zeng, Yan Xie, Hao Zhang, et al.
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024), pp. 14100-14110
Closed Access | Times Cited: 7

Learned Representation-Guided Diffusion Models for Large-Image Generation
Alexandros Graikos, Srikar Yellapragada, Minh-Quan Le, et al.
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024), pp. 8532-8542
Closed Access | Times Cited: 7

VicTR: Video-conditioned Text Representations for Activity Recognition
Kumara Kahatapitiya, Anurag Arnab, Arsha Nagrani, et al.
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024), pp. 18547-18558
Closed Access | Times Cited: 6

Image Captioning with Multi-Context Synthetic Data
Feipeng Ma, Yizhou Zhou, Fengyun Rao, et al.
Proceedings of the AAAI Conference on Artificial Intelligence (2024) Vol. 38, Iss. 5, pp. 4089-4097
Open Access | Times Cited: 5

MM-Narrator: Narrating Long-form Videos with Multimodal In-Context Learning
Chaoyi Zhang, Kevin Lin, Zhengyuan Yang, et al.
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024), pp. 13647-13657
Closed Access | Times Cited: 4

Text-Only Synthesis for Zero-Shot Visual Captioning
Junyu Gao, Junlin Huang, Junyu Gao, et al.
(2025)
Closed Access

CgT-GAN: CLIP-guided Text GAN for Image Captioning
Jiarui Yu, Haoran Li, Yanbin Hao, et al.
(2023), pp. 2252-2263
Open Access | Times Cited: 10

ZeroNLG: Aligning and Autoencoding Domains for Zero-Shot Multimodal and Multilingual Natural Language Generation
Bang Yang, Fenglin Liu, Yuexian Zou, et al.
IEEE Transactions on Pattern Analysis and Machine Intelligence (2024) Vol. 46, Iss. 8, pp. 5712-5724
Open Access | Times Cited: 3

Training Audio Captioning Models without Audio
Soham Deshmukh, Benjamin Elizalde, Dimitra Emmanouilidou, et al.
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2024), pp. 371-375
Open Access | Times Cited: 3

Improving Cross-Modal Alignment with Synthetic Pairs for Text-Only Image Captioning
Zhiyue Liu, Jinyuan Liu, Fanrong Ma
Proceedings of the AAAI Conference on Artificial Intelligence (2024) Vol. 38, Iss. 4, pp. 3864-3872
Open Access | Times Cited: 3

Military Image Captioning for Low-Altitude UAV or UGV Perspectives
Lizhi Pan, Chengtian Song, Xiaozheng Gan, et al.
Drones (2024) Vol. 8, Iss. 9, pp. 421-421
Open Access | Times Cited: 3

Language-only Efficient Training of Zero-shot Composed Image Retrieval
Geonmo Gu, Sanghyuk Chun, Wonjae Kim, et al.
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024), pp. 13225-13234
Closed Access | Times Cited: 3

Learning Text-to-Video Retrieval from Image Captioning
Lucas Ventura, Cordelia Schmid, Gül Varol
International Journal of Computer Vision (2024)
Closed Access | Times Cited: 3

Guiding image captioning models toward more specific captions
Simon Kornblith, Lala Li, Zi-Rui Wang, et al.
IEEE/CVF International Conference on Computer Vision (ICCV) (2023), pp. 15213-15223
Open Access | Times Cited: 8

I Can’t Believe There’s No Images! Learning Visual Tasks Using Only Language Supervision
Sophia Gu, Christopher M. Clark, Aniruddha Kembhavi
IEEE/CVF International Conference on Computer Vision (ICCV) (2023), pp. 2672-2683
Open Access | Times Cited: 6

TSIC-CLIP: Traffic Scene Image Captioning Model Based on CLIP
Hao Zhang, Cheng Xu, Bingxin Xu, et al.
Information Technology And Control (2024) Vol. 53, Iss. 1, pp. 98-114
Open Access | Times Cited: 1

NLP-Based Fusion Approach to Robust Image Captioning
Riccardo Ricci, Farid Melgani, José Marcato, et al.
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing (2024) Vol. 17, pp. 11809-11822
Open Access | Times Cited: 1

Mining core information by evaluating semantic importance for unpaired image captioning
Jiahui Wei, Zhixin Li, Canlong Zhang, et al.
Neural Networks (2024) Vol. 179, pp. 106519-106519
Closed Access | Times Cited: 1

Improving Medical Multi-modal Contrastive Learning with Expert Annotations
Yogesh Kumar, Pekka Marttinen
Lecture notes in computer science (2024), pp. 468-486
Closed Access | Times Cited: 1

MultiCapCLIP: Auto-Encoding Prompts for Zero-Shot Multilingual Visual Captioning
Bang Yang, Fenglin Liu, Xian Wu, et al.
(2023), pp. 11908-11922
Open Access | Times Cited: 3

Multimodal Prompt Learning for Product Title Generation with Extremely Limited Labels
Bang Yang, Fenglin Liu, Zheng Li, et al.
Findings of the Association for Computational Linguistics: ACL 2023 (2023)
Open Access | Times Cited: 3

ZeroGen: Zero-Shot Multimodal Controllable Text Generation with Multiple Oracles
Haoqin Tu, Bowen Yang, Xianfeng Zhao
Lecture notes in computer science (2023), pp. 494-506
Closed Access | Times Cited: 3

Semantic-Enhanced Cross-Modal Fusion for Improved Unsupervised Image Captioning
Nan Xiang, Ling Chen, Leiyan Liang, et al.
Electronics (2023) Vol. 12, Iss. 17, pp. 3549-3549
Open Access | Times Cited: 2

Page 1 - Next Page
