OpenAlex Citation Counts

OpenAlex is an open-access bibliographic catalogue of scientific papers, authors and institutions, named after the Library of Alexandria. Its citation coverage is excellent, and I hope you will find this listing of citing articles useful!

If you click the article title, you'll navigate to the article, as listed in CrossRef. If you click the Open Access links, you'll navigate to the "best Open Access location". Clicking the citation count will open this listing for that article. Lastly, at the bottom of the page, you'll find basic pagination options.
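For readers who would rather fetch such a listing programmatically, here is a minimal sketch of how a paginated "citing articles" request might be built against the public OpenAlex API. It assumes the API's `cites` filter and its `per-page` and `page` parameters; the work ID `W0000000000` is a placeholder, and the helper names are hypothetical rather than anything this page is known to use.

```python
import math
from urllib.parse import urlencode

API_ROOT = "https://api.openalex.org/works"
PER_PAGE = 25  # assumption: mirrors the 25 entries shown per page in this listing


def citing_works_url(work_id: str, page: int = 1, per_page: int = PER_PAGE) -> str:
    """Build the URL for one page of works that cite `work_id`."""
    query = {"filter": f"cites:{work_id}", "per-page": per_page, "page": page}
    # Keep ':' unescaped so the filter stays readable in the URL.
    return f"{API_ROOT}?{urlencode(query, safe=':')}"


def page_count(total: int, per_page: int = PER_PAGE) -> int:
    """Number of pages needed to show `total` citing articles."""
    return math.ceil(total / per_page)


def page_range(page: int, total: int, per_page: int = PER_PAGE) -> tuple[int, int]:
    """First and last item numbers shown on `page` (1-indexed)."""
    first = (page - 1) * per_page + 1
    last = min(page * per_page, total)
    return first, last


if __name__ == "__main__":
    # Placeholder ID; substitute the OpenAlex ID of the requested article.
    print(citing_works_url("W0000000000", page=1))
    print(page_count(51), page_range(1, 51))
```

With 51 citing articles and 25 per page, this arithmetic gives three pages, the first covering items 1-25.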

Requested Article:

Revisiting Classifier: Transferring Vision-Language Models for Video Recognition
Wenhao Wu, Zhun Sun, Wanli Ouyang
Proceedings of the AAAI Conference on Artificial Intelligence (2023) Vol. 37, Iss. 3, pp. 2847-2855
Open Access | Times Cited: 51

Showing 1-25 of 51 citing articles:

Bidirectional Cross-Modal Knowledge Exploration for Video Recognition with Pre-trained Vision-Language Models
Wenhao Wu, Xiaohan Wang, Haipeng Luo, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023), pp. 6620-6630
Open Access | Times Cited: 47

Cap4Video: What Can Auxiliary Captions Do for Text-Video Retrieval?
Wenhao Wu, Haipeng Luo, Bo Fang, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023), pp. 10704-10713
Open Access | Times Cited: 46

CLIP-guided Prototype Modulating for Few-shot Action Recognition
Xiang Wang, Shiwei Zhang, Jun Cen, et al.
International Journal of Computer Vision (2023) Vol. 132, Iss. 6, pp. 1899-1912
Closed Access | Times Cited: 27

MAtch, eXpand and Improve: Unsupervised Finetuning for Zero-Shot Action Recognition with Language Knowledge
Wei Lin, Leonid Karlinsky, Nina Shvetsova, et al.
2021 IEEE/CVF International Conference on Computer Vision (ICCV) (2023), pp. 2839-2850
Open Access | Times Cited: 16

Building an Open-Vocabulary Video CLIP Model With Better Architectures, Optimization and Data
Zuxuan Wu, Zejia Weng, Wujian Peng, et al.
IEEE Transactions on Pattern Analysis and Machine Intelligence (2024) Vol. 46, Iss. 7, pp. 4747-4762
Open Access | Times Cited: 6

VicTR: Video-conditioned Text Representations for Activity Recognition
Kumara Kahatapitiya, Anurag Arnab, Arsha Nagrani, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024) Vol. 33, pp. 18547-18558
Closed Access | Times Cited: 6

Description Attribute-Enhanced Spatio-Temporal Zero-Shot Action Recognition
Yehna Kim, Ho-Joong Kim, Seong‐Whan Lee
Lecture Notes in Computer Science (2025), pp. 296-309
Closed Access

Multi-TuneV: Fine-tuning the fusion of multiple modules for video action recognition
Xinyuan Liu, Junyong Ye, Jingjing Wang, et al.
Journal of Visual Communication and Image Representation (2025), pp. 104441-104441
Closed Access

Human activity recognition: A review of deep learning‐based methods
Sanjay Jyoti Dutta, Tossapon Boongoen, Reyer Zwiggelaar
IET Computer Vision (2025) Vol. 19, Iss. 1
Open Access

Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning
Zhiwu Qing, Shiwei Zhang, Ziyuan Huang, et al.
2021 IEEE/CVF International Conference on Computer Vision (ICCV) (2023), pp. 13888-13898
Open Access | Times Cited: 10

Deep Learning Innovations in Video Classification: A Survey on Techniques and Dataset Evaluations
Makara Mao, Ahyoung Lee, Min Hong
Electronics (2024) Vol. 13, Iss. 14, pp. 2732-2732
Open Access | Times Cited: 3

TF-FAS: Twofold-Element Fine-Grained Semantic Guidance for Generalizable Face Anti-spoofing
Xudong Wang, Ke-Yue Zhang, Taiping Yao, et al.
Lecture Notes in Computer Science (2024), pp. 148-168
Closed Access | Times Cited: 3

A Highly Compressed Accelerator With Temporal Optical Flow Feature Fusion and Tensorized LSTM for Video Action Recognition on Terminal Device
Peining Zhen, Xiaotao Yan, Wei Wang, et al.
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (2023) Vol. 42, Iss. 10, pp. 3129-3142
Closed Access | Times Cited: 7

OphNet: A Large-Scale Video Benchmark for Ophthalmic Surgical Workflow Understanding
Ming Hu, Peng Xia, Lin Wang, et al.
Lecture Notes in Computer Science (2024), pp. 481-500
Closed Access | Times Cited: 2

Visual-guided hierarchical iterative fusion for multi-modal video action recognition
Bingbing Zhang, Ying Zhang, Jianxin Zhang, et al.
Pattern Recognition Letters (2024)
Closed Access | Times Cited: 2

A Large-scale Study of Spatiotemporal Representation Learning with a New Benchmark on Action Recognition
Andong Deng, Taojiannan Yang, Chen Chen
2021 IEEE/CVF International Conference on Computer Vision (ICCV) (2023), pp. 20462-20474
Open Access | Times Cited: 4

Comprehensive Visual Grounding for Video Description
Wenhui Jiang, Yibo Cheng, Linxin Liu, et al.
Proceedings of the AAAI Conference on Artificial Intelligence (2024) Vol. 38, Iss. 3, pp. 2552-2560
Open Access | Times Cited: 1

GBC: Guided Alignment and Adaptive Boosting CLIP Bridging Vision and Language for Robust Action Recognition
Zhaoqilin Yang, Gaoyun An, Zhenxing Zheng, et al.
IEEE Transactions on Circuits and Systems for Video Technology (2024) Vol. 34, Iss. 9, pp. 8172-8187
Closed Access | Times Cited: 1

OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition
Tongjia Chen, Hongshan Yu, Zhengeng Yang, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024), pp. 18888-18898
Closed Access | Times Cited: 1

ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video
Xinhao Li, Yuhan Zhu, Limin Wang
Lecture Notes in Computer Science (2024), pp. 425-443
Closed Access | Times Cited: 1

What Can Simple Arithmetic Operations Do for Temporal Modeling?
Wenhao Wu, Yuxin Song, Zhun Sun, et al.
2021 IEEE/CVF International Conference on Computer Vision (ICCV) (2023), pp. 13666-13676
Open Access | Times Cited: 3

Behavior Recognition of Squid Jigger Based on Deep Learning
Yifan Song, Shengmao Zhang, Fenghua Tang, et al.
Fishes (2023) Vol. 8, Iss. 10, pp. 502-502
Open Access | Times Cited: 3

SDA-CLIP: surgical visual domain adaptation using video and text labels
Yuchong Li, Shuangfu Jia, Guangbi Song, et al.
Quantitative Imaging in Medicine and Surgery (2023) Vol. 13, Iss. 10, pp. 6989-7001
Open Access | Times Cited: 2

Motion Vector-Based Self-Attention for Real-Time Human Activity Recognition in Compressed Videos: The MVViT Approach
S. M. Praveenkumar, Prakashgoud Patil, P. S. Hiremath
International Journal of Pattern Recognition and Artificial Intelligence (2024) Vol. 38, Iss. 04
Closed Access

Page 1 - Next Page
