
OpenAlex is an open-access bibliographic catalogue of scientific papers, authors and institutions, named after the Library of Alexandria. Its citation coverage is excellent, and I hope you will find this listing of citing articles useful!
If you click the article title, you'll navigate to the article, as listed in CrossRef. If you click the Open Access links, you'll navigate to the "best Open Access location". Clicking the citation count will open this listing for that article. Lastly, at the bottom of the page, you'll find basic pagination options.
Requested Article:
UniTAB: Unifying Text and Box Outputs for Grounded Vision-Language Modeling
Zhengyuan Yang, Zhe Gan, Jianfeng Wang, et al.
Lecture notes in computer science (2022), pp. 521-539
Closed Access | Times Cited: 53
Showing 1-25 of 53 citing articles:
Generalized Decoding for Pixel, Image, and Language
Xueyan Zou, Zi-Yi Dou, Jianwei Yang, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023), pp. 15116-15127
Open Access | Times Cited: 95
PolyFormer: Referring Image Segmentation as Sequential Polygon Generation
Jiang Liu, Hui Ding, Zhaowei Cai, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023), pp. 18653-18663
Open Access | Times Cited: 54
ReCo: Region-Controlled Text-to-Image Generation
Zhengyuan Yang, Jianfeng Wang, Zhe Gan, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023)
Open Access | Times Cited: 50
LAVENDER: Unifying Video-Language Understanding as Masked Language Modeling
Linjie Li, Zhe Gan, Kevin Lin, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023), pp. 23119-23129
Open Access | Times Cited: 44
CPT: Color-based Prompt Tuning for pre-trained vision-language models
Yuan Yao, Ao Zhang, Zhengyan Zhang, et al.
AI Open (2024)
Open Access | Times Cited: 28
Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks
Bin Xiao, Haiping Wu, Weijian Xu, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024) Vol. 35, pp. 4818-4829
Closed Access | Times Cited: 15
PromptCap: Prompt-Guided Image Captioning for VQA with GPT-3
Yushi Hu, Hang Hua, Zhengyuan Yang, et al.
2021 IEEE/CVF International Conference on Computer Vision (ICCV) (2023), pp. 2951-2963
Closed Access | Times Cited: 23
EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone
Shraman Pramanick, Yale Song, Sayan Nag, et al.
2021 IEEE/CVF International Conference on Computer Vision (ICCV) (2023), pp. 5262-5274
Open Access | Times Cited: 23
Uni-Perceiver v2: A Generalist Model for Large-Scale Vision and Vision-Language Tasks
Hao Li, Jinguo Zhu, Xiaohu Jiang, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023)
Open Access | Times Cited: 20
Towards Unified Scene Text Spotting Based on Sequence Generation
Taeho Kil, Seonghyun Kim, Sukmin Seo, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023)
Open Access | Times Cited: 16
Context Disentangling and Prototype Inheriting for Robust Visual Grounding
Wei Tang, Liang Li, Xuejing Liu, et al.
IEEE Transactions on Pattern Analysis and Machine Intelligence (2023) Vol. 46, Iss. 5, pp. 3213-3229
Open Access | Times Cited: 14
I-Tuning: Tuning Frozen Language Models with Image for Lightweight Image Captioning
Ziyang Luo, Zhipeng Hu, Yadong Xi, et al.
ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (2023) Vol. 123, pp. 1-5
Open Access | Times Cited: 13
LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models
Hao Zhang, Hongyang Li, Feng Li, et al.
Lecture notes in computer science (2024), pp. 19-35
Closed Access | Times Cited: 5
PEVL: Position-enhanced Pre-training and Prompt Tuning for Vision-language Models
Yuan Yao, Qianyu Chen, Ao Zhang, et al.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (2022), pp. 11104-11117
Open Access | Times Cited: 21
Dynamic MDETR: A Dynamic Multimodal Transformer Decoder for Visual Grounding
Fengyuan Shi, Ruopeng Gao, Weilin Huang, et al.
IEEE Transactions on Pattern Analysis and Machine Intelligence (2023) Vol. 46, Iss. 2, pp. 1181-1198
Open Access | Times Cited: 12
DRESS: Instructing Large Vision-Language Models to Align and Interact with Humans via Natural Language Feedback
Yangyi Chen, Karan Sikka, Michael Cogswell, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024) Vol. 35, pp. 14239-14250
Closed Access | Times Cited: 4
Open-Category Human-Object Interaction Pre-training via Language Modeling Framework
Sipeng Zheng, Boshen Xu, Qin Jin
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023) Vol. 30, pp. 19392-19402
Closed Access | Times Cited: 10
Jack of All Tasks, Master of Many: Designing General-purpose Coarse-to-Fine Vision-Language Model
Shraman Pramanick, Guangxing Han, Rui Hou, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024) Vol. 33, pp. 14076-14088
Closed Access | Times Cited: 3
Groundhog: Grounding Large Language Models to Holistic Segmentation
Yichi Zhang, Ziqiao Ma, Xiaofeng Gao, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024) Vol. 35, pp. 14227-14238
Closed Access | Times Cited: 3
Deeply Coupled Cross-Modal Prompt Learning
Xuejing Liu, Wei Tang, Jinghui Lu, et al.
Findings of the Association for Computational Linguistics: ACL 2022 (2023), pp. 7957-7970
Open Access | Times Cited: 8
Toward Unified Token Learning for Vision-Language Tracking
Yaozong Zheng, Bineng Zhong, Qihua Liang, et al.
IEEE Transactions on Circuits and Systems for Video Technology (2023) Vol. 34, Iss. 4, pp. 2125-2135
Closed Access | Times Cited: 7
APoLLo: Unified Adapter and Prompt Learning for Vision Language Models
Sanjoy Chowdhury, Sayan Nag, Dinesh Manocha
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (2023), pp. 10173-10187
Open Access | Times Cited: 7
From Pixels to Graphs: Open-Vocabulary Scene Graph Generation with Vision-Language Models
Rongjie Li, Songyang Zhang, Dahua Lin, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024), pp. 28076-28086
Closed Access | Times Cited: 2
Lane2Seq: Towards Unified Lane Detection via Sequence Generation
Kunyang Zhou
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2024) Vol. 33, pp. 16944-16953
Closed Access | Times Cited: 2
HiVG: Hierarchical Multimodal Fine-grained Modulation for Visual Grounding
Linhui Xiao, Xiaoshan Yang, Fang Peng, et al.
(2024), pp. 5460-5469
Closed Access | Times Cited: 2