OpenAlex Citation Counts

OpenAlex is an open-access bibliographic catalogue of scientific papers, authors, and institutions, named after the Library of Alexandria. Its citation coverage is excellent, and I hope you find this listing of citing articles useful!

Clicking an article title takes you to the article as listed in Crossref. Clicking an Open Access link takes you to the "best Open Access location". Clicking a citation count opens this same listing for that article. Lastly, at the bottom of the page, you'll find basic pagination options.

Requested Article:

X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval
Yiwei Ma, Guohai Xu, Xiaoshuai Sun, et al.
Proceedings of the 30th ACM International Conference on Multimedia (2022)
Open Access | Times Cited: 138

Showing 1-25 of 138 citing articles:

CLIP-Driven Fine-Grained Text-Image Person Re-Identification
Shuanglin Yan, Neng Dong, Liyan Zhang, et al.
IEEE Transactions on Image Processing (2023) Vol. 32, pp. 6032-6046
Open Access | Times Cited: 98

Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?
Wenhao Wu, Haipeng Luo, Bo Fang, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023), pp. 10704-10713
Open Access | Times Cited: 46

Clover: Towards A Unified Video-Language Alignment and Fusion Model
Jingjia Huang, Yinan Li, Jiashi Feng, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023)
Open Access | Times Cited: 29

HiTeA: Hierarchical Temporal-Aware Video-Language Pre-training
Qinghao Ye, Guohai Xu, Ming Yan, et al.
2021 IEEE/CVF International Conference on Computer Vision (ICCV) (2023), pp. 15359-15370
Open Access | Times Cited: 26

SuS-X: Training-Free Name-Only Transfer of Vision-Language Models
Vishaal Udandarao, Ankush Gupta, Samuel Albanie
2021 IEEE/CVF International Conference on Computer Vision (ICCV) (2023), pp. 2725-2736
Open Access | Times Cited: 26

Toward Video Anomaly Retrieval From Video Anomaly Detection: New Benchmarks and Model
Peng Wu, Jing Liu, Xiangteng He, et al.
IEEE Transactions on Image Processing (2024) Vol. 33, pp. 2213-2225
Closed Access | Times Cited: 11

CoVR: Learning Composed Video Retrieval from Web Video Captions
Lucas Ventura, Antoine Yang, Cordelia Schmid, et al.
Proceedings of the AAAI Conference on Artificial Intelligence (2024) Vol. 38, Iss. 6, pp. 5270-5279
Open Access | Times Cited: 8

Text-Based Occluded Person Re-identification via Multi-Granularity Contrastive Consistency Learning
Xinyi Wu, Wentao Ma, Dan Guo, et al.
Proceedings of the AAAI Conference on Artificial Intelligence (2024) Vol. 38, Iss. 6, pp. 6162-6170
Open Access | Times Cited: 8

Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers
Tsai-Shien Chen, Aliaksandr Siarohin, Willi Menapace, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024), pp. 13320-13331
Closed Access | Times Cited: 8

Unified Coarse-to-Fine Alignment for Video-Text Retrieval
Ziyang Wang, Yi-Lin Sung, Feng Cheng, et al.
2021 IEEE/CVF International Conference on Computer Vision (ICCV) (2023), pp. 2804-2815
Open Access | Times Cited: 17

X-Mesh: Towards Fast and Accurate Text-driven 3D Stylization via Dynamic Textual Guidance
Yiwei Ma, Haowei Wang, Xiaoqing Zhang, et al.
2021 IEEE/CVF International Conference on Computer Vision (ICCV) (2023), pp. 2737-2748
Open Access | Times Cited: 17

Dual Alignment Unsupervised Domain Adaptation for Video-Text Retrieval
Xiaoshuai Hao, Wanqian Zhang, Dayan Wu, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023), pp. 18962-18972
Closed Access | Times Cited: 16

DGL: Dynamic Global-Local Prompt Tuning for Text-Video Retrieval
Xiangpeng Yang, Linchao Zhu, Xiaohan Wang, et al.
Proceedings of the AAAI Conference on Artificial Intelligence (2024) Vol. 38, Iss. 7, pp. 6540-6548
Open Access | Times Cited: 7

DePT: Decoupled Prompt Tuning
Ji Zhang, Shihan Wu, Lianli Gao, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024) Vol. 34, pp. 12924-12933
Closed Access | Times Cited: 7

Rotated Multi-Scale Interaction Network for Referring Remote Sensing Image Segmentation
Sihan Liu, Yiwei Ma, Xiaoqing Zhang, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024), pp. 26648-26658
Closed Access | Times Cited: 7

3D-STMN: Dependency-Driven Superpoint-Text Matching Network for End-to-End 3D Referring Expression Segmentation
Changli Wu, Yiwei Ma, Qi Chen, et al.
Proceedings of the AAAI Conference on Artificial Intelligence (2024) Vol. 38, Iss. 6, pp. 5940-5948
Open Access | Times Cited: 6

LidarCLIP or: How I Learned to Talk to Point Clouds
Georg Hess, Adam Tonderski, Christoffer Petersson, et al.
2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (2024), pp. 7423-7432
Open Access | Times Cited: 6

Prompt Switch: Efficient CLIP Adaptation for Text-Video Retrieval
Chaorui Deng, Qi Chen, Pengda Qin, et al.
2021 IEEE/CVF International Conference on Computer Vision (ICCV) (2023), pp. 15602-15612
Open Access | Times Cited: 13

Text-Video Retrieval via Multi-Modal Hypergraph Networks
Qian Li, Lixin Su, Jiashu Zhao, et al.
(2024), pp. 369-377
Closed Access | Times Cited: 5

X-RefSeg3D: Enhancing Referring 3D Instance Segmentation via Structured Cross-Modal Graph Neural Networks
Zhipeng Qian, Yiwei Ma, Jiayi Ji, et al.
Proceedings of the AAAI Conference on Artificial Intelligence (2024) Vol. 38, Iss. 5, pp. 4551-4559
Open Access | Times Cited: 5

MQRLD: A multimodal data retrieval platform with query-aware feature representation and learned index based on data lake
Ming Sheng, Shuliang Wang, Yong Zhang, et al.
Information Processing & Management (2025) Vol. 62, Iss. 4, pp. 104101-104101
Closed Access

Efficient text-to-video retrieval via multi-modal multi-tagger derived pre-screening
Yingjia Xu, Mengxia Wu, Zixin Guo, et al.
Visual Intelligence (2025) Vol. 3, Iss. 1
Open Access

Sequential Consistency Matters: Boosting Video Sequence Verification with Teacher Multimodal Transformer
Yaning Zhao, Xun Jiang, Jingran Zhang, et al.
Communications in computer and information science (2025), pp. 207-219
Closed Access

Dynamic semantic prototype perception for text-video retrieval
Henghao Zhao, Rui Yan, Zechao Li
Image and Vision Computing (2025), pp. 105515-105515
Closed Access

TC-MGC: Text-conditioned multi-grained contrastive learning for text-video retrieval
Xiaolun Jing, Genke Yang, Jian Chu
Information Fusion (2025), pp. 103151-103151
Closed Access

Page 1 - Next Page