
OpenAlex is an open-access bibliographic catalogue of scientific papers, authors and institutions, named after the Library of Alexandria. Its citation coverage is excellent, and I hope you will find this listing of citing articles useful!
Clicking an article title takes you to the article as listed in CrossRef. Clicking an Open Access link takes you to the "best Open Access location". Clicking the citation count opens this listing for that article. Lastly, at the bottom of the page, you'll find basic pagination options.
Requested Article:
LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking
Yupan Huang, Tengchao Lv, Lei Cui, et al.
Proceedings of the 30th ACM International Conference on Multimedia (2022)
Open Access | Times Cited: 256
Showing 1-25 of 256 citing articles:
DiT: Self-supervised Pre-training for Document Image Transformer
Junlong Li, Yiheng Xu, Tengchao Lv, et al.
Proceedings of the 30th ACM International Conference on Multimedia (2022)
Open Access | Times Cited: 104
Unifying Vision, Text, and Layout for Universal Document Processing
Zineng Tang, Ziyi Yang, Guoxin Wang, et al.
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023), pp. 19254-19264
Open Access | Times Cited: 50
DocFormerv2: Local Features for Document Understanding
Srikar Appalaraju, Peng Tang, Qi Dong, et al.
Proceedings of the AAAI Conference on Artificial Intelligence (2024) Vol. 38, Iss. 2, pp. 709-718
Open Access | Times Cited: 17
GeoLayoutLM: Geometric Pre-training for Visual Information Extraction
Chuwei Luo, Changxu Cheng, Zheng Qi, et al.
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023)
Open Access | Times Cited: 33
VLCDoC: Vision-Language contrastive pre-training model for cross-Modal document classification
Souhail Bakkali, Zuheng Ming, Mickaël Coustaty, et al.
Pattern Recognition (2023) Vol. 139, pp. 109419-109419
Open Access | Times Cited: 24
CLIPPO: Image-and-Language Understanding from Pixels Only
Michael Tschannen, Basil Mustafa, Neil Houlsby
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023)
Open Access | Times Cited: 21
Hierarchical multimodal transformers for Multipage DocVQA
Rubèn Tito, Dìmosthenis Karatzas, Ernest Valveny
Pattern Recognition (2023) Vol. 144, pp. 109834-109834
Open Access | Times Cited: 20
UReader: Universal OCR-free Visually-situated Language Understanding with Multimodal Large Language Model
Jiabo Ye, Anwen Hu, Haiyang Xu, et al.
(2023)
Open Access | Times Cited: 19
MatCha: Enhancing Visual Language Pretraining with Math Reasoning and Chart Derendering
Fangyu Liu, Francesco Piccinno, Syrine Krichene, et al.
(2023), pp. 12756-12770
Open Access | Times Cited: 18
DocILE Benchmark for Document Information Localization and Extraction
Štěpán Šimsa, Milan Šulc, Michal Uřičář, et al.
Lecture Notes in Computer Science (2023), pp. 147-166
Closed Access | Times Cited: 16
Harnessing the Power of Multi-Lingual Datasets for Pre-training: Towards Enhancing Text Spotting Performance
Alloy Das, Sanket Biswas, Ayan Banerjee, et al.
IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (2024)
Open Access | Times Cited: 7
Vision Grid Transformer for Document Layout Analysis
Da Cheng, Chuwei Luo, Qi Zheng, et al.
IEEE/CVF International Conference on Computer Vision (ICCV) (2023), pp. 19405-19415
Open Access | Times Cited: 15
SwinDocSegmenter: An End-to-End Unified Domain Adaptive Transformer for Document Instance Segmentation
Ayan Banerjee, Sanket Biswas, Josep Lladós, et al.
Lecture Notes in Computer Science (2023), pp. 307-325
Closed Access | Times Cited: 14
PatCID: an open-access dataset of chemical structures in patent documents
Lucas Morin, Valéry Weber, Gerhard Ingmar Meijer, et al.
Nature Communications (2024) Vol. 15, Iss. 1
Open Access | Times Cited: 5
Evaluating Technological and Instructional Factors Influencing the Acceptance of AIGC-Assisted Design Courses
Qianling Jiang, Yuzhuo Zhang, Wei Wei, et al.
Computers and Education Artificial Intelligence (2024) Vol. 7, pp. 100287-100287
Open Access | Times Cited: 5
DANIEL: a fast document attention network for information extraction and labelling of handwritten documents
Thomas Constum, Pierrick Tranouez, Thierry Paquet
International Journal on Document Analysis and Recognition (IJDAR) (2025)
Open Access
Multimodal Document Analytics for Banking Process Automation
Christopher Gerling, Stefan Lessmann
Information Fusion (2025), pp. 102973-102973
Closed Access
Redacted text detection using neural image segmentation methods
Ruben van Heusden, Kenneth Meijer, M. Marx
International Journal on Document Analysis and Recognition (IJDAR) (2025)
Open Access
Information extraction from multi-layout invoice images using FATURA dataset
M. O. A. Limam, Marwa Dhiaf, Yousri Kessentini
Engineering Applications of Artificial Intelligence (2025) Vol. 149, pp. 110478-110478
Closed Access
UniHDSA: A unified relation prediction approach for hierarchical document structure analysis
Jiawei Wang, Kai Hu, Qiang Huo
Pattern Recognition (2025), pp. 111617-111617
Closed Access
Bi-VLDoc: bidirectional vision-language modeling for visually-rich document understanding
Chuwei Luo, Guozhi Tang, Qi Zheng, et al.
International Journal on Document Analysis and Recognition (IJDAR) (2025)
Closed Access
Improving Document-Based Question Answering with Multi-modal Learning: Batch Size and Token Modality Experiments
K. Raagulbharatwaj, J. Senthil Murugan, Jeganathan Lakshmanan
Lecture Notes in Electrical Engineering (2025), pp. 513-526
Closed Access
LiGT: layout-infused generative transformer for visual question answering on Vietnamese receipts
Phong T. Le, Trung Phan, Nghia Hieu Nguyen, et al.
International Journal on Document Analysis and Recognition (IJDAR) (2025)
Closed Access
LayoutMask: Enhance Text-Layout Interaction in Multi-modal Pre-training for Document Understanding
Yi Tu, Ya Guo, Huan Chen, et al.
(2023), pp. 15200-15212
Open Access | Times Cited: 12
Modeling Entities as Semantic Points for Visual Information Extraction in the Wild
Zhibo Yang, Rujiao Long, Pengfei Wang, et al.
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2023)
Open Access | Times Cited: 12