
OpenAlex is an open-access bibliographic catalogue of scientific papers, authors and institutions, named after the Library of Alexandria. Its citation coverage is excellent, and I hope you will find this listing of citing articles useful!
Clicking an article title takes you to the article as listed in CrossRef. Clicking an Open Access link takes you to the "best Open Access location". Clicking the citation count opens this listing for that article. Lastly, at the bottom of the page, you'll find basic pagination options.
Requested Article:
Vision-Language Pre-Training with Triple Contrastive Learning
Jinyu Yang, Jiali Duan, Son N. Tran, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2022), pp. 15650-15659
Open Access | Times Cited: 183
Showing 26-50 of 183 citing articles:
Cross-Modal and Uni-Modal Soft-Label Alignment for Image-Text Retrieval
Hailang Huang, Zhijie Nie, Ziqiao Wang, et al.
Proceedings of the AAAI Conference on Artificial Intelligence (2024) Vol. 38, Iss. 16, pp. 18298-18306
Open Access | Times Cited: 6
Transferable Multimodal Attack on Vision-Language Pre-training Models
Haodi Wang, Kai Dong, Zhilei Zhu, et al.
2022 IEEE Symposium on Security and Privacy (SP) (2024) Vol. 34, pp. 1722-1740
Closed Access | Times Cited: 6
MAP: Multimodal Uncertainty-Aware Vision-Language Pre-training Model
Yatai Ji, Junjie Wang, Yuan Gong, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023), pp. 23262-23271
Open Access | Times Cited: 15
A Sentence Speaks a Thousand Images: Domain Generalization through Distilling CLIP with Language Guidance
Zeyi Huang, Andy Zhou, Zijian Lin, et al.
2021 IEEE/CVF International Conference on Computer Vision (ICCV) (2023), pp. 11651-11661
Open Access | Times Cited: 14
Set-level Guidance Attack: Boosting Adversarial Transferability of Vision-Language Pre-training Models
Lu Dong, Zhiqiang Wang, Teng Wang, et al.
2021 IEEE/CVF International Conference on Computer Vision (ICCV) (2023)
Open Access | Times Cited: 14
Efficient Token-Guided Image-Text Retrieval With Consistent Multimodal Contrastive Training
Chong Liu, Yuqi Zhang, Hongsong Wang, et al.
IEEE Transactions on Image Processing (2023) Vol. 32, pp. 3622-3633
Open Access | Times Cited: 13
TinyCLIP: CLIP Distillation via Affinity Mimicking and Weight Inheritance
Kan Wu, Houwen Peng, Zhenghong Zhou, et al.
2021 IEEE/CVF International Conference on Computer Vision (ICCV) (2023), pp. 21913-21923
Open Access | Times Cited: 13
Domain Aligned CLIP for Few-shot Classification
Muhammad Waleed Gondal, Jochen Gast, Inigo Alonso Ruiz, et al.
2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (2024)
Open Access | Times Cited: 5
Multi-domain encoder–decoder neural networks for latent data assimilation in dynamical systems
Sibo Cheng, Yilin Zhuang, Lyes Kahouadji, et al.
Computer Methods in Applied Mechanics and Engineering (2024) Vol. 430, pp. 117201-117201
Open Access | Times Cited: 5
M-GENE: Multiview genes expression network ensemble for bone metabolism-related gene classification
Keyi Yu, Weilong Tan, Jirong Ge, et al.
Neurocomputing (2025) Vol. 622, pp. 129318-129318
Closed Access
Large Vision-Language Model Alignment and Misalignment: A Survey Through the Lens of Explainability
Shuwen Dong, Haiyan Zhao, Jingyu Hu, et al.
(2025)
Open Access
Semantic relation-aware graph attention network with noise augmented layer-wise contrastive learning for recommendation
Jianfang Liu, Wei Wang, Baolin Yi, et al.
Knowledge-Based Systems (2025), pp. 113217-113217
Closed Access
Intramodal consistency in triplet-based cross-modal learning for image retrieval
Mario Mallea, Ricardo Ñanculef, Mauricio Araya
Machine Learning (2025) Vol. 114, Iss. 4
Open Access
Boosting adversarial transferability in vision-language models via multimodal feature heterogeneity
Long Chen, Yuling Chen, Zhi Ouyang, et al.
Scientific Reports (2025) Vol. 15, Iss. 1
Open Access
Multimodal alignment augmentation transferable attack on vision-language pre-training models
Tingchao Fu, Jinhong Zhang, Fanxiao Li, et al.
Pattern Recognition Letters (2025)
Closed Access
Multi-level Matching Network for Multimodal Entity Linking
Zhiwei Hu, Víctor Gutiérrez-Basulto, Ru Li, et al.
(2025), pp. 508-519
Closed Access
Language-Image Consistency Augmentation and Distillation Network for visual grounding
Ke Xiao, Peirong Xu, Wenzhong Guo
Pattern Recognition (2025), pp. 111663-111663
Closed Access
Learning Customized Visual Models with Retrieval-Augmented Knowledge
Haotian Liu, Kilho Son, Jianwei Yang, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023), pp. 15148-15158
Open Access | Times Cited: 12
KD-DLGAN: Data Limited Image Generation via Knowledge Distillation
Kaiwen Cui, Yingchen Yu, Fangneng Zhan, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023)
Open Access | Times Cited: 12
Improving Augmentation Consistency for Graph Contrastive Learning
Weixin Bu, Xiaofeng Cao, Yizhen Zheng, et al.
Pattern Recognition (2023) Vol. 148, pp. 110182-110182
Closed Access | Times Cited: 12
Cross-Modal Adaptive Dual Association for Text-to-Image Person Retrieval
D. M. Lin, Yi-Xing Peng, Jingke Meng, et al.
IEEE Transactions on Multimedia (2024) Vol. 26, pp. 6609-6620
Open Access | Times Cited: 4
3VL: Using Trees to Improve Vision-Language Models’ Interpretability
Nir Yellinek, Leonid Karlinsky, Raja Giryes
IEEE Transactions on Image Processing (2025) Vol. 34, pp. 495-509
Open Access
View-Based Knowledge-Augmented Multimodal Semantic Understanding for Optical Remote Sensing Images
Lilu Zhu, Xiaolu Su, Jianan Tang, et al.
IEEE Transactions on Geoscience and Remote Sensing (2025) Vol. 63, pp. 1-33
Closed Access
Image-Text Retrieval With Cross-Modal Semantic Importance Consistency
Zejun Liu, Fanglin Chen, Jun Xu, et al.
IEEE Transactions on Circuits and Systems for Video Technology (2022) Vol. 33, Iss. 5, pp. 2465-2476
Closed Access | Times Cited: 17
Prefix Conditioning Unifies Language and Label Supervision
Kuniaki Saito, Kihyuk Sohn, Xiang Zhang, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023), pp. 2861-2870
Open Access | Times Cited: 9