OpenAlex Citation Counts

OpenAlex is an open-access bibliographic catalogue of scientific papers, authors and institutions, named after the Library of Alexandria. Its citation coverage is excellent, and I hope you will find this listing of citing articles useful!

Clicking an article title takes you to the article as listed in CrossRef. Clicking an Open Access link takes you to the "best Open Access location". Clicking a citation count opens this same listing for that article. Lastly, at the bottom of the page, you'll find basic pagination options.

Requested Article:

LAVENDER: Unifying Video-Language Understanding as Masked Language Modeling
Linjie Li, Zhe Gan, Kevin Lin, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023), pp. 23119-23129
Open Access | Times Cited: 44

Showing 1-25 of 44 citing articles:

Unmasked Teacher: Towards Training-Efficient Video Foundation Models
Kunchang Li, Yali Wang, Yizhuo Li, et al.
2021 IEEE/CVF International Conference on Computer Vision (ICCV) (2023), pp. 19891-19903
Open Access | Times Cited: 48

Multimodal Large Language Models in Healthcare: Applications, Challenges, and Future Outlook (Preprint)
Rawan AlSaad, Alaa Abd‐Alrazaq, Sabri Boughorbel, et al.
Journal of Medical Internet Research (2024) Vol. 26, pp. e59505-e59505
Open Access | Times Cited: 20

VindLU: A Recipe for Effective Video-and-Language Pretraining
Feng Cheng, Xizi Wang, Jie Lei, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023)
Open Access | Times Cited: 38

Text with Knowledge Graph Augmented Transformer for Video Captioning
Xin Gu, Guang Chen, Yufei Wang, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023), pp. 18941-18951
Open Access | Times Cited: 33

HiTeA: Hierarchical Temporal-Aware Video-Language Pre-training
Qinghao Ye, Guohai Xu, Ming Yan, et al.
2021 IEEE/CVF International Conference on Computer Vision (ICCV) (2023), pp. 15359-15370
Open Access | Times Cited: 26

EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone
Shraman Pramanick, Yale Song, Sayan Nag, et al.
2021 IEEE/CVF International Conference on Computer Vision (ICCV) (2023), pp. 5262-5274
Open Access | Times Cited: 23

Text Is MASS: Modeling as Stochastic Embedding for Text-Video Retrieval
Jiamian Wang, Pichao Wang, Guohao Sun, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024), pp. 16551-16560
Closed Access | Times Cited: 13

Unified Coarse-to-Fine Alignment for Video-Text Retrieval
Ziyang Wang, Yi-Lin Sung, Feng Cheng, et al.
2021 IEEE/CVF International Conference on Computer Vision (ICCV) (2023), pp. 2804-2815
Open Access | Times Cited: 17

AutoAD III: The Prequel - Back to the Pixels
Tengda Han, Max Bain, Arsha Nagrani, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024), pp. 18164-18174
Closed Access | Times Cited: 5

Stitching Segments and Sentences towards Generalization in Video-Text Pre-training
Fan Ma, Xiaojie Jin, Heng Wang, et al.
Proceedings of the AAAI Conference on Artificial Intelligence (2024) Vol. 38, Iss. 5, pp. 4080-4088
Open Access | Times Cited: 4

Zero-Shot Video Moment Retrieval from Frozen Vision-Language Models
Dezhao Luo, Jiabo Huang, Shaogang Gong, et al.
2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (2024), pp. 5452-5461
Open Access | Times Cited: 4

A Descriptive Basketball Highlight Dataset for Automatic Commentary Generation
Benhui Zhang, Junyu Gao, Yuan Yuan
(2024), pp. 10316-10325
Closed Access | Times Cited: 4

vid-TLDR: Training Free Token merging for Light-Weight Video Transformer
Joonmyung Choi, Sanghyeok Lee, Jaewon Chu, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024) Vol. 35, pp. 18771-18781
Closed Access | Times Cited: 3

A Simple Recipe for Contrastively Pre-Training Video-First Encoders Beyond 16 Frames
Pinelopi Papalampidi, Skanda Koppula, Shreya Pathak, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024) Vol. 35, pp. 14386-14397
Closed Access | Times Cited: 3

Learning Text-to-Video Retrieval from Image Captioning
Lucas Ventura, Cordelia Schmid, Gül Varol
International Journal of Computer Vision (2024)
Closed Access | Times Cited: 3

Cross-Modal Multiscale Difference-Aware Network for Joint Moment Retrieval and Highlight Detection
Mingyao Zhou, Wenjing Chen, Hao Sun, et al.
ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2024), pp. 8416-8420
Closed Access | Times Cited: 2

Foundation Models for Video Understanding: A Survey
Neelu Madan, Andreas Møgelmose, Rajat Modi, et al.
(2024)
Open Access | Times Cited: 2

Learning Hierarchical Modular Networks for Video Captioning
Guorong Li, Hanhua Ye, Yuankai Qi, et al.
IEEE Transactions on Pattern Analysis and Machine Intelligence (2023) Vol. 46, Iss. 2, pp. 1049-1064
Closed Access | Times Cited: 6

eP-ALM: Efficient Perceptual Augmentation of Language Models
Mustafa Shukor, Corentin Dancette, Matthieu Cord
2021 IEEE/CVF International Conference on Computer Vision (ICCV) (2023), pp. 21999-22012
Open Access | Times Cited: 6

Alignment and Generation Adapter for Efficient Video-text Understanding
Fang Han, Zhifei Yang, Yuhan Wei, et al.
(2023), pp. 2783-2789
Closed Access | Times Cited: 5

Scalable and Accurate Self-supervised Multimodal Representation Learning without Aligned Video and Text Data
Vladislav Lialin, Stephen Rawls, David W. Chan, et al.
(2023), pp. 390-400
Open Access | Times Cited: 4

Can't make an Omelette without Breaking some Eggs: Plausible Action Anticipation using Large Video-Language Models
Himangi Mittal, Nakul Agarwal, Shao-Yuan Lo, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024) Vol. 35, pp. 18580-18590
Closed Access | Times Cited: 1

EA-VTR: Event-Aware Video-Text Retrieval
Zongyang Ma, Ziqi Zhang, Yuxin Chen, et al.
Lecture notes in computer science (2024), pp. 76-94
Closed Access | Times Cited: 1

Page 1 - Next Page
