
OpenAlex is an openly accessible bibliographic catalogue of scientific papers, authors and institutions, named after the Library of Alexandria. Its citation coverage is excellent and I hope you will find this listing of citing articles useful!
If you click the article title, you'll navigate to the article, as listed in CrossRef. If you click the Open Access links, you'll navigate to the "best Open Access location". Clicking the citation count will open this listing for that article. Lastly, at the bottom of the page, you'll find basic pagination options.
Requested Article:
GRIT: Faster and Better Image Captioning Transformer Using Dual Visual Features
Van-Quang Nguyen, Masanori Suganuma, Takayuki Okatani
Lecture notes in computer science (2022), pp. 167-184
Open Access | Times Cited: 76
Showing 1-25 of 76 citing articles:
Deep Learning Approaches on Image Captioning: A Review
Taraneh Ghandi, Hamid Reza Pourreza, Hamidreza Mahyar
ACM Computing Surveys (2023) Vol. 56, Iss. 3, pp. 1-39
Open Access | Times Cited: 68
KiUT: Knowledge-injected U-Transformer for Radiology Report Generation
Zhongzhen Huang, Xiaofan Zhang, Shaoting Zhang
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023), pp. 19809-19818
Open Access | Times Cited: 42
ConZIC: Controllable Zero-shot Image Captioning by Sampling-Based Polishing
Zequn Zeng, Hao Zhang, Ruiying Lu, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023), pp. 23465-23476
Open Access | Times Cited: 27
Memory-Based Augmentation Network for Video Captioning
Shuaiqi Jing, Haonan Zhang, Pengpeng Zeng, et al.
IEEE Transactions on Multimedia (2023) Vol. 26, pp. 2367-2379
Closed Access | Times Cited: 23
LGR-NET: Language Guided Reasoning Network for Referring Expression Comprehension
Mingcong Lu, Ruifan Li, Fangxiang Feng, et al.
IEEE Transactions on Circuits and Systems for Video Technology (2024) Vol. 34, Iss. 8, pp. 7771-7784
Closed Access | Times Cited: 11
End-to-End 3D Dense Captioning with Vote2Cap-DETR
Sijin Chen, Hongyuan Zhu, Xin Chen, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023) Vol. 41, pp. 11124-11133
Open Access | Times Cited: 21
HierVL: Learning Hierarchical Video-Language Embeddings
Kumar Ashutosh, Rohit Girdhar, Lorenzo Torresani, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023) Vol. 114, pp. 23066-23078
Open Access | Times Cited: 20
Improving visual question answering for bridge inspection by pre‐training with external data of image–text pairs
Thannarot Kunlamai, Tatsuro Yamane, Masanori Suganuma, et al.
Computer-Aided Civil and Infrastructure Engineering (2023) Vol. 39, Iss. 3, pp. 345-361
Open Access | Times Cited: 19
MeaCap: Memory-Augmented Zero-shot Image Captioning
Zequn Zeng, Yan Xie, Hao Zhang, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024) Vol. 35, pp. 14100-14110
Closed Access | Times Cited: 7
Polos: Multimodal Metric Learning from Human Feedback for Image Captioning
Yuiga Wada, Kanta Kaneda, Daichi Saito, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024) Vol. 37, pp. 13559-13568
Closed Access | Times Cited: 7
Vote2Cap-DETR++: Decoupling Localization and Describing for End-to-End 3D Dense Captioning
Sijin Chen, Hongyuan Zhu, Mingsheng Li, et al.
IEEE Transactions on Pattern Analysis and Machine Intelligence (2024) Vol. 46, Iss. 11, pp. 7331-7347
Open Access | Times Cited: 6
ExpansionNet v2: Block Static Expansion in fast end to end training for Image Captioning
Jia Cheng Hu, Roberto Cavicchioli, Alessandro Capotondi
arXiv (Cornell University) (2022)
Open Access | Times Cited: 19
Fashion-Oriented Image Captioning with External Knowledge Retrieval and Fully Attentive Gates
Nicholas Moratelli, Manuele Barraco, Davide Morelli, et al.
Sensors (2023) Vol. 23, Iss. 3, pp. 1286-1286
Open Access | Times Cited: 12
Embedded Heterogeneous Attention Transformer for Cross-Lingual Image Captioning
Zijie Song, Zhenzhen Hu, Yuanen Zhou, et al.
IEEE Transactions on Multimedia (2024) Vol. 26, pp. 9008-9020
Closed Access | Times Cited: 4
Leveraging ensemble deep models and llm for visual polysemy and word sense disambiguation
Insaf Setitra, Praboda Rajapaksha, Aung Kaung Myat, et al.
Multimedia Tools and Applications (2025)
Closed Access
SCAP: enhancing image captioning through lightweight feature sifting and hierarchical decoding
Yuhao Zhang, Jiaqi Tong, Honglin Liu
The Visual Computer (2025)
Closed Access
Scene graph sorting and shuffle polishing based controllable image captioning
Guichang Wu, Qian Zhao, Xiushu Liu
Signal Image and Video Processing (2025) Vol. 19, Iss. 4
Closed Access
CDZL: a controllable diversity zero-shot image caption model using large language models
Xin Zhao, Weiwei Kong, Zongyao Liu, et al.
Signal Image and Video Processing (2025) Vol. 19, Iss. 4
Closed Access
Dual-visual collaborative enhanced transformer for image captioning
Zhenping Mou, Tianqi Song, Luo Hong
Multimedia Systems (2025) Vol. 31, Iss. 2
Closed Access
Multimodal artificial intelligence approaches using large language models for expert‐level landslide image analysis
Kittitouch Areerob, Van‐Quang Nguyen, Xianfeng Li, et al.
Computer-Aided Civil and Infrastructure Engineering (2025)
Open Access
See, Perceive, and Answer: A Unified Benchmark for High-Resolution Postdisaster Evaluation in Remote Sensing Images
Danpei Zhao, Jiankai Lu, Bo Yuan
IEEE Transactions on Geoscience and Remote Sensing (2024) Vol. 62, pp. 1-14
Closed Access | Times Cited: 3
Detours for Navigating Instructional Videos
Kumar Ashutosh, Zihui Xue, Tushar Nagarajan, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024) Vol. 33, pp. 18804-18815
Closed Access | Times Cited: 3
Image Captioning With Controllable and Adaptive Length Levels
Ning Ding, Chaorui Deng, Mingkui Tan, et al.
IEEE Transactions on Pattern Analysis and Machine Intelligence (2023) Vol. 46, Iss. 2, pp. 764-779
Closed Access | Times Cited: 9
SPT: Spatial Pyramid Transformer for Image Captioning
Haonan Zhang, Pengpeng Zeng, Lianli Gao, et al.
IEEE Transactions on Circuits and Systems for Video Technology (2023) Vol. 34, Iss. 6, pp. 4829-4842
Closed Access | Times Cited: 8
Affective Image Captioning for Visual Artworks Using Emotion-Based Cross-Attention Mechanisms
Shintaro Ishikawa, Komei Sugiura
IEEE Access (2023) Vol. 11, pp. 24527-24534
Open Access | Times Cited: 7