OpenAlex Citation Counts

OpenAlex is an open-access bibliographic catalogue of scientific papers, authors, and institutions, named after the Library of Alexandria. Its citation coverage is excellent, and I hope you find this listing of citing articles useful!

If you click the article title, you'll navigate to the article as listed in CrossRef. If you click the Open Access links, you'll navigate to the "best Open Access location". Clicking the citation count will open this listing for that article. Lastly, at the bottom of the page, you'll find basic pagination options.

Requested Article:

Knowing What it is: Semantic-Enhanced Dual Attention Transformer
Yiwei Ma, Jiayi Ji, Xiaoshuai Sun, et al.
IEEE Transactions on Multimedia (2022) Vol. 25, pp. 3723-3736
Closed Access | Times Cited: 22

Showing 22 citing articles:

X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval
Yiwei Ma, Guohai Xu, Xiaoshuai Sun, et al.
Proceedings of the 30th ACM International Conference on Multimedia (2022)
Open Access | Times Cited: 138

Towards local visual modeling for image captioning
Yiwei Ma, Jiayi Ji, Xiaoshuai Sun, et al.
Pattern Recognition (2023) Vol. 138, pp. 109420-109420
Open Access | Times Cited: 59

From multi-scale grids to dynamic regions: Dual-relation enhanced transformer for image captioning
Wei Zhou, Chuanle Song, Dihu Chen, et al.
Knowledge-Based Systems (2025), pp. 113127-113127
Closed Access | Times Cited: 1

X-Mesh: Towards Fast and Accurate Text-driven 3D Stylization via Dynamic Textual Guidance
Yiwei Ma, Haowei Wang, Xiaoqing Zhang, et al.
2021 IEEE/CVF International Conference on Computer Vision (ICCV) (2023), pp. 2737-2748
Open Access | Times Cited: 17

Point Patches Contrastive Learning for Enhanced Point Cloud Completion
Ben Fei, Liwen Liu, T. Luo, et al.
IEEE Transactions on Multimedia (2025) Vol. 27, pp. 581-596
Closed Access

From grids to pseudo-regions: Dynamic memory augmented image captioning with dual relation transformer
Wei Zhou, Weitao Jiang, Zhijie Zheng, et al.
Expert Systems with Applications (2025), pp. 126850-126850
Closed Access

MDFNet: Multi-dimensional Fusion Attention for Enhanced Image Captioning
Dengdi Sun, Xuetao Li, Chengli Mu
Lecture Notes in Computer Science (2025), pp. 52-61
Closed Access

Multi-level semantic-aware transformer for image captioning
Qin Xu, Sanghyeob Song, Qihang Wu, et al.
Neural Networks (2025), pp. 107390-107390
Closed Access

Key Role Guided Transformer for Group Activity Recognition
Duoxuan Pei, Di Huang, Longteng Kong, et al.
IEEE Transactions on Circuits and Systems for Video Technology (2023) Vol. 33, Iss. 12, pp. 7803-7818
Closed Access | Times Cited: 9

Beat: Bi-directional One-to-Many Embedding Alignment for Text-based Person Retrieval
Yiwei Ma, Xiaoshuai Sun, Jiayi Ji, et al.
(2023), pp. 4157-4168
Open Access | Times Cited: 9

Dual-Spatial Normalized Transformer for image captioning
Juntao Hu, Yang You, Yongzhi An, et al.
Engineering Applications of Artificial Intelligence (2023) Vol. 123, pp. 106384-106384
Closed Access | Times Cited: 7

Towards Real-Time Panoptic Narrative Grounding by an End-to-End Grounding Network
Haowei Wang, Jiayi Ji, Yiyi Zhou, et al.
Proceedings of the AAAI Conference on Artificial Intelligence (2023) Vol. 37, Iss. 2, pp. 2528-2536
Open Access | Times Cited: 7

MHRN: A Multimodal Hierarchical Reasoning Network for Topic Detection
Jiankai Li, Yunhong Wang, Weixin Li
IEEE Transactions on Multimedia (2024) Vol. 26, pp. 6968-6980
Closed Access | Times Cited: 1

Spatial–Channel Attention Transformer With Pseudo Regions for Remote Sensing Image-Text Retrieval
Dongqing Wu, Huihui Li, Yinxuan Hou, et al.
IEEE Transactions on Geoscience and Remote Sensing (2024) Vol. 62, pp. 1-15
Closed Access | Times Cited: 1

Dual visual align-cross attention-based image captioning transformer
Yonggong Ren, Jinghan Zhang, Wenqiang Xu, et al.
Multimedia Tools and Applications (2024)
Closed Access

Regular Constrained Multimodal Fusion for Image Captioning
Liya Wang, Haipeng Chen, Yu Liu, et al.
IEEE Transactions on Circuits and Systems for Video Technology (2024) Vol. 34, Iss. 11, pp. 11900-11913
Closed Access

Triple-stream commonsense circulation transformer network for image captioning
Jianchao Li, Wei Zhou, Kai Wang, et al.
Computer Vision and Image Understanding (2024), pp. 104165-104165
Closed Access

Dual-stream Self-attention Network for Image Captioning
Boyang Wan, Wenhui Jiang, Yuming Fang, et al.
2022 IEEE International Conference on Visual Communications and Image Processing (VCIP) (2022), pp. 1-5
Closed Access | Times Cited: 2

PixelFolder: An Efficient Progressive Pixel Synthesis Network for Image Generation
Jing He, Yiyi Zhou, Qi Zhang, et al.
Lecture Notes in Computer Science (2022), pp. 643-660
Closed Access | Times Cited: 1

Research on Image Semantic Description Method Based on RVC Network
Huan Zhou
2022 IEEE 5th International Conference on Automation, Electronics and Electrical Engineering (AUTEEE) (2023), pp. 1180-1188
Closed Access

Complementary Shifted Transformer for Image Captioning
Yanbo Liu, You Yang, Ruoyu Xiang, et al.
Neural Processing Letters (2023) Vol. 55, Iss. 6, pp. 8339-8363
Closed Access

DSCJA-Captioner: Dual-Branch Spatial and Channel Joint Attention for Image Captioning
Xi Tian, Xiaobao Yang, Sugang Ma, et al.
2021 16th International Conference on Intelligent Systems and Knowledge Engineering (ISKE) (2023), pp. 458-464
Closed Access
