OpenAlex Citation Counts

OpenAlex is an open-access bibliographic catalogue of scientific papers, authors, and institutions, named after the ancient Library of Alexandria. Its citation coverage is excellent, and I hope you will find this listing of citing articles useful!

If you click an article title, you'll navigate to the article as listed in CrossRef. If you click an Open Access link, you'll navigate to the "best Open Access location" for that article. Clicking a citation count will open this same listing for that article. Lastly, at the bottom of the page, you'll find basic pagination options.
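If you'd rather pull these results programmatically, the same data is available from the public OpenAlex API. Below is a minimal Python sketch (using the requests library) that fetches one page of works citing a given article; the work ID here is a placeholder, so substitute the OpenAlex ID of the article you're interested in.

import requests

# Placeholder OpenAlex work ID; replace with the ID of the requested article.
WORK_ID = "W0000000000"

resp = requests.get(
    "https://api.openalex.org/works",
    params={
        "filter": f"cites:{WORK_ID}",  # works whose references include WORK_ID
        "per-page": 25,                # 25 results per page, matching this listing
        "page": 1,                     # basic pagination, as at the bottom of the page
    },
    timeout=30,
)
resp.raise_for_status()

for work in resp.json()["results"]:
    oa = work.get("best_oa_location")  # the "best Open Access location"
    print(
        work.get("display_name"),
        "| Times Cited:", work.get("cited_by_count"),
        "| OA:", oa["landing_page_url"] if oa else "Closed Access",
    )

Each result also carries a cited_by_api_url field, which is the API-form equivalent of clicking the citation count here.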

Requested Article:

Vision-Language Models for Vision Tasks: A Survey
Jingyi Zhang, Jiaxing Huang, Sheng Jin, et al.
IEEE Transactions on Pattern Analysis and Machine Intelligence (2024) Vol. 46, Iss. 8, pp. 5625-5644
Open Access | Times Cited: 104

Showing 1-25 of 104 citing articles:

The Segment Anything Model (SAM) for remote sensing applications: From zero to one shot
Lucas Prado Osco, Qiusheng Wu, Eduardo Lopes de Lemos, et al.
International Journal of Applied Earth Observation and Geoinformation (2023) Vol. 124, pp. 103540-103540
Open Access | Times Cited: 112

RemoteCLIP: A Vision Language Foundation Model for Remote Sensing
Fan Liu, Delong Chen, Zhangqingyun Guan, et al.
IEEE Transactions on Geoscience and Remote Sensing (2024) Vol. 62, pp. 1-16
Open Access | Times Cited: 63

Capability of GPT-4V(ision) in the Japanese National Medical Licensing Examination: Evaluation Study
Takahiro Nakao, Soichiro Miki, Yuta Nakamura, et al.
JMIR Medical Education (2024) Vol. 10, pp. e54393-e54393
Open Access | Times Cited: 26

Vision-language model-based human-robot collaboration for smart manufacturing: A state-of-the-art survey
Junming Fan, Yue Yin, Tian Wang, et al.
Frontiers of Engineering Management (2025)
Open Access | Times Cited: 1

A survey of efficient fine-tuning methods for Vision-Language Models — Prompt and Adapter
Jialu Xing, Jianping Liu, Jian Wang, et al.
Computers & Graphics (2024) Vol. 119, pp. 103885-103885
Closed Access | Times Cited: 8

Ethological computational psychiatry: Challenges and opportunities
Ilya E. Monosov, Jan Zimmermann, Michael J. Frank, et al.
Current Opinion in Neurobiology (2024) Vol. 86, pp. 102881-102881
Open Access | Times Cited: 5

ChatFFA: An ophthalmic chat system for unified vision-language understanding and question answering for fundus fluorescein angiography
Xiaolan Chen, Pusheng Xu, Y P Li, et al.
iScience (2024) Vol. 27, Iss. 7, pp. 110021-110021
Open Access | Times Cited: 5

DRC-Coder: Automated DRC Checker Code Generation Using LLM Autonomous Agent
Chen-Chia Chang, Chia-Tung Ho, Yaguang Li, et al.
(2025), pp. 143-151
Open Access

Consistent prompt learning for vision-language models
Yonggang Zhang, Xinmei Tian
Knowledge-Based Systems (2025) Vol. 310, pp. 112974-112974
Closed Access

Enabling Harmonious Human-Machine Interaction with Visual-Context Augmented Dialogue System: A Review
Hao Wang, Bin Guo, Yating Zeng, et al.
ACM Transactions on Office Information Systems (2025)
Open Access

Prompting robotic modalities (PRM): A structured architecture for centralizing language models in complex systems
Bilel Benjdira, Anis Koubâa, Anas M. Ali
Future Generation Computer Systems (2025), pp. 107723-107723
Closed Access

Unified Text-Image Space Alignment with Cross-Modal Prompting in CLIP for UDA
Yifan Jiao, Chenglong Cai, Bing-Kun Bao
ACM Transactions on Multimedia Computing Communications and Applications (2025)
Open Access

Perceptual visual security index: Analyzing image content leakage for vision language models
Lishuang Hu, Tao Xiang, Shangwei Guo, et al.
Journal of Information Security and Applications (2025) Vol. 89, pp. 103988-103988
Closed Access

MammoVLM: A generative large vision-language model for mammography-related diagnostic assistance
Zhenjie Cao, Zhuo Deng, Jie Ma, et al.
Information Fusion (2025), pp. 102998-102998
Closed Access

Enhancing cross-domain generalization by fusing language-guided feature remapping
Ziteng Qiao, Dianxi Shi, Songchang Jin, et al.
Information Fusion (2025), pp. 103029-103029
Closed Access

The future of action recognition: are multi-modal visual language models the key?
Enes Gümüşkaynak, Süleyman Eken
Signal, Image and Video Processing (2025) Vol. 19, Iss. 4
Closed Access

Video Fire Recognition Using Zero-shot Vision-language Models Guided by a Task-aware Object Detector
Diego Gragnaniello, Antonio Greco, Carlo Sansone, et al.
ACM Transactions on Multimedia Computing Communications and Applications (2025)
Open Access

Assessing the spatial accuracy of geocoding flood-related imagery using Vision Language Models
Sebastian Schmidt, Eleonor Díaz Fragachan, Dorian Arifi, et al.
Spatial Information Research (2025) Vol. 33, Iss. 2
Open Access

Self-supervised visual–textual prompt learning for few-shot grading of gastric intestinal metaplasia
Xuanchi Chen, Xiangwei Zheng, Zhen Li, et al.
Knowledge-Based Systems (2024) Vol. 301, pp. 112303-112303
Closed Access | Times Cited: 4

Would Deep Generative Models Amplify Bias in Future Models?
Tianwei Chen, Yusuke Hirota, Mayu Otani, et al.
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024), pp. 10833-10843
Closed Access | Times Cited: 4

Pic2Plate: A Vision-Language and Retrieval-Augmented Framework for Personalized Recipe Recommendations
Yosua Setyawan Soekamto, Andreas Pangestu Lim, Leonard Christopher Limanjaya, et al.
Sensors (2025) Vol. 25, Iss. 2, pp. 449-449
Open Access

GI-Grasp: Target-Oriented 6DoF Grasping Strategy with Grasp Intuition Based on Vision-Language Models
Tong Jia, Haiyu Zhang, Guowei Yang, et al.
Lecture Notes in Computer Science (2025), pp. 89-100
Closed Access

Large language model-augmented learning for auto-delineation of treatment targets in head-and-neck cancer radiotherapy
Praveenbalaji Rajendran, Yong Yang, Thomas Niedermayr, et al.
Radiotherapy and Oncology (2025), pp. 110740-110740
Closed Access

Page 1 - Next Page
