OpenAlex Citation Counts

OpenAlex is an open-access bibliographic catalogue of scientific papers, authors and institutions, named after the Library of Alexandria. Its citation coverage is excellent, and I hope you will find this listing of citing articles useful!

If you click the article title, you'll navigate to the article, as listed in CrossRef. If you click the Open Access links, you'll navigate to the "best Open Access location". Clicking the citation count will open this listing for that article. Lastly, at the bottom of the page, you'll find basic pagination options.

Requested Article:

LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking
Yupan Huang, Tengchao Lv, Lei Cui, et al.
Proceedings of the 30th ACM International Conference on Multimedia (2022)
Open Access | Times Cited: 256

Showing 1-25 of 256 citing articles:

DiT: Self-supervised Pre-training for Document Image Transformer
Junlong Li, Yiheng Xu, Tengchao Lv, et al.
Proceedings of the 30th ACM International Conference on Multimedia (2022)
Open Access | Times Cited: 104

Unifying Vision, Text, and Layout for Universal Document Processing
Zineng Tang, Ziyi Yang, Guoxin Wang, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023), pp. 19254-19264
Open Access | Times Cited: 50

DocFormerv2: Local Features for Document Understanding
Srikar Appalaraju, Peng Tang, Qi Dong, et al.
Proceedings of the AAAI Conference on Artificial Intelligence (2024) Vol. 38, Iss. 2, pp. 709-718
Open Access | Times Cited: 17

GeoLayoutLM: Geometric Pre-training for Visual Information Extraction
Chuwei Luo, Changxu Cheng, Zheng Qi, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023)
Open Access | Times Cited: 33

VLCDoC: Vision-Language contrastive pre-training model for cross-Modal document classification
Souhail Bakkali, Zuheng Ming, Mickaël Coustaty, et al.
Pattern Recognition (2023) Vol. 139, pp. 109419-109419
Open Access | Times Cited: 24

CLIPPO: Image-and-Language Understanding from Pixels Only
Michael Tschannen, Basil Mustafa, Neil Houlsby
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023)
Open Access | Times Cited: 21

Hierarchical multimodal transformers for Multipage DocVQA
Rubèn Tito, Dìmosthenis Karatzas, Ernest Valveny
Pattern Recognition (2023) Vol. 144, pp. 109834-109834
Open Access | Times Cited: 20

UReader: Universal OCR-free Visually-situated Language Understanding with Multimodal Large Language Model
Jiabo Ye, Anwen Hu, Haiyang Xu, et al.
(2023)
Open Access | Times Cited: 19

MatCha: Enhancing Visual Language Pretraining with Math Reasoning and Chart Derendering
Fangyu Liu, Francesco Piccinno, Syrine Krichene, et al.
(2023), pp. 12756-12770
Open Access | Times Cited: 18

DocILE Benchmark for Document Information Localization and Extraction
Štěpán Šimsa, Milan Šulc, Michal Uřičář, et al.
Lecture notes in computer science (2023), pp. 147-166
Closed Access | Times Cited: 16

Harnessing the Power of Multi-Lingual Datasets for Pre-training: Towards Enhancing Text Spotting Performance
Alloy Das, Sanket Biswas, Ayan Banerjee, et al.
2022 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (2024)
Open Access | Times Cited: 7

Vision Grid Transformer for Document Layout Analysis
Da Cheng, Chuwei Luo, Qi Zheng, et al.
2021 IEEE/CVF International Conference on Computer Vision (ICCV) (2023), pp. 19405-19415
Open Access | Times Cited: 15

SwinDocSegmenter: An End-to-End Unified Domain Adaptive Transformer for Document Instance Segmentation
Ayan Banerjee, Sanket Biswas, Josep Lladós, et al.
Lecture notes in computer science (2023), pp. 307-325
Closed Access | Times Cited: 14

PatCID: an open-access dataset of chemical structures in patent documents
Lucas Morin, Valéry Weber, Gerhard Ingmar Meijer, et al.
Nature Communications (2024) Vol. 15, Iss. 1
Open Access | Times Cited: 5

Evaluating Technological and Instructional Factors Influencing the Acceptance of AIGC-Assisted Design Courses
Qianling Jiang, Yuzhuo Zhang, Wei Wei, et al.
Computers and Education Artificial Intelligence (2024) Vol. 7, pp. 100287-100287
Open Access | Times Cited: 5

DANIEL: a fast document attention network for information extraction and labelling of handwritten documents
Thomas Constum, Pierrick Tranouez, Thierry Paquet
International Journal on Document Analysis and Recognition (IJDAR) (2025)
Open Access

Multimodal Document Analytics for Banking Process Automation
Christopher Gerling, Stefan Lessmann
Information Fusion (2025), pp. 102973-102973
Closed Access

Redacted text detection using neural image segmentation methods
Ruben van Heusden, Kenneth Meijer, M. Marx
International Journal on Document Analysis and Recognition (IJDAR) (2025)
Open Access

Information extraction from multi-layout invoice images using FATURA dataset
M. O. A. Limam, Marwa Dhiaf, Yousri Kessentini
Engineering Applications of Artificial Intelligence (2025) Vol. 149, pp. 110478-110478
Closed Access

UniHDSA: A unified relation prediction approach for hierarchical document structure analysis
Jiawei Wang, Kai Hu, Qiang Huo
Pattern Recognition (2025), pp. 111617-111617
Closed Access

Bi-VLDoc: bidirectional vision-language modeling for visually-rich document understanding
Chuwei Luo, Guozhi Tang, Qi Zheng, et al.
International Journal on Document Analysis and Recognition (IJDAR) (2025)
Closed Access

Improving Document-Based Question Answering with Multi-modal Learning: Batch Size and Token Modality Experiments
K. Raagulbharatwaj, J. Senthil Murugan, Jeganathan Lakshmanan
Lecture notes in electrical engineering (2025), pp. 513-526
Closed Access

LiGT: layout-infused generative transformer for visual question answering on Vietnamese receipts
Phong T. Le, Trung Phan, Nghia Hieu Nguyen, et al.
International Journal on Document Analysis and Recognition (IJDAR) (2025)
Closed Access

LayoutMask: Enhance Text-Layout Interaction in Multi-modal Pre-training for Document Understanding
Yi Tu, Ya Guo, Huan Chen, et al.
(2023), pp. 15200-15212
Open Access | Times Cited: 12

Modeling Entities as Semantic Points for Visual Information Extraction in the Wild
Zhibo Yang, Rujiao Long, Pengfei Wang, et al.
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023)
Open Access | Times Cited: 12

Page 1 - Next Page
