
OpenAlex, named after the Library of Alexandria, is an open-access bibliographic catalogue of scientific papers, authors, and institutions. Its citation coverage is excellent, and I hope you will find this listing of citing articles useful!
Clicking an article title navigates to the article as listed in CrossRef. Clicking an Open Access link navigates to the "best Open Access location". Clicking a citation count opens this listing for that article. Basic pagination options appear at the bottom of the page.
Requested Article:
Repairing the Cracked Foundation: A Survey of Obstacles in Evaluation Practices for Generated Text
Sebastian Gehrmann, Elizabeth A. Clark, Thibault Sellam
Journal of Artificial Intelligence Research (2023) Vol. 77, pp. 103-166
Open Access | Times Cited: 69
Showing 1-25 of 69 citing articles:
Towards a Unified Multi-Dimensional Evaluator for Text Generation
Ming Zhong, Yang Liu, Da Yin, et al.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (2022)
Open Access | Times Cited: 57
Revisiting the Gold Standard: Grounding Summarization Evaluation with Robust Human Evaluation
Yixin Liu, Alex Fabbri, Pengfei Liu, et al.
(2023)
Open Access | Times Cited: 37
Large Language Models Effectively Leverage Document-level Context for Literary Translation, but Critical Errors Persist
Marzena Karpinska, Mohit Iyyer
(2023)
Open Access | Times Cited: 32
Prompted Opinion Summarization with GPT-3.5
Adithya Bhaskar, Alex Fabbri, Greg Durrett
Findings of the Association for Computational Linguistics: ACL 2022 (2023)
Open Access | Times Cited: 26
Bridging the Gap: A Survey on Integrating (Human) Feedback for Natural Language Generation
Patrick Fernandes, Aman Madaan, Emmy Liu, et al.
Transactions of the Association for Computational Linguistics (2023) Vol. 11, pp. 1643-1668
Open Access | Times Cited: 24
Benchmarking the Hallucination Tendency of Google Gemini and Moonshot Kimi
Ruoxi Shan, Qiang Ming, Guang Hong, et al.
(2024)
Open Access | Times Cited: 9
EvalLM: Interactive Evaluation of Large Language Model Prompts on User-Defined Criteria
Tae Soo Kim, Yoonjoo Lee, Jamin Shin, et al.
(2024), pp. 1-21
Open Access | Times Cited: 7
Evaluating Attribution in Dialogue Systems: The BEGIN Benchmark
Nouha Dziri, Hannah Rashkin, Tal Linzen, et al.
Transactions of the Association for Computational Linguistics (2022) Vol. 10, pp. 1066-1083
Open Access | Times Cited: 30
LongEval: Guidelines for Human Evaluation of Faithfulness in Long-form Summarization
Kalpesh Krishna, Erin Bransom, Bailey Kuehl, et al.
(2023)
Open Access | Times Cited: 18
A Scoping Study of Evaluation Practices for Responsible AI Tools: Steps Towards Effectiveness Evaluations
Glen Berman, Nitesh Goyal, Michael Madaio
(2024), pp. 1-24
Open Access | Times Cited: 5
Intelligence as Agency: Evaluating the Capacity of Generative AI to Empower or Constrain Human Action
Arvind Satyanarayan, Graham M. Jones
(2024)
Open Access | Times Cited: 5
From text to treatment: the crucial role of validation for generative large language models in health care
Anne de Hond, Tuur Leeuwenberg, Richard Bartels, et al.
The Lancet Digital Health (2024) Vol. 6, Iss. 7, pp. e441-e443
Open Access | Times Cited: 5
Toward cultural interpretability: A linguistic anthropological framework for describing and evaluating large language models
Gregory R. Jones, Shai Satran, Arvind Satyanarayan
Big Data & Society (2025) Vol. 12, Iss. 1
Open Access
Evaluation Workflows for Large Language Models (LLMs) that Integrate Domain Expertise for Complex Knowledge Tasks
Annalisa Szymanski
(2025), pp. 215-217
Closed Access
A Critical Evaluation of Evaluations for Long-form Question Answering
Fangyuan Xu, Yixiao Song, Mohit Iyyer, et al.
(2023), pp. 3225-3245
Open Access | Times Cited: 12
MCRanker: Generating Diverse Criteria On-the-Fly to Improve Pointwise LLM Rankers
Fang Guo, Wenyu Li, Honglei Zhuang, et al.
(2025), pp. 944-953
Closed Access
Human-Centered Evaluation and Auditing of Language Models
Ziang Xiao, Wesley Hanwen Deng, Michelle S. Lam, et al.
(2024), pp. 1-6
Open Access | Times Cited: 3
SQuALITY: Building a Long-Document Summarization Dataset the Hard Way
Alex Wang, Richard Yuanzhe Pang, Angelica Chen, et al.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (2022), pp. 1139-1156
Open Access | Times Cited: 17
Evaluating factual accuracy in complex data-to-text
Craig Thomson, Ehud Reiter, Barkavi Sundararajan
Computer Speech & Language (2023) Vol. 80, pp. 101482-101482
Open Access | Times Cited: 9
Grounding and Evaluation for Large Language Models: Practical Challenges and Lessons Learned (Survey)
Krishnaram Kenthapadi, Mehrnoosh Sameki, Ankur Taly
Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (2024), pp. 6523-6533
Open Access | Times Cited: 3
SafeText: A Benchmark for Exploring Physical Safety in Language Models
Sharon Levy, Emily Allaway, Melanie Subbiah, et al.
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (2022), pp. 2407-2421
Open Access | Times Cited: 14
GEMv2: Multilingual NLG Benchmarking in a Single Line of Code
Sebastian Gehrmann, Abhik Bhattacharjee, Abinaya Mahendiran, et al.
(2022), pp. 266-281
Open Access | Times Cited: 14
Dialect-robust Evaluation of Generated Text
Jiao Sun, Thibault Sellam, Elizabeth A. Clark, et al.
(2023), pp. 6010-6028
Open Access | Times Cited: 7
Common Flaws in Running Human Evaluation Experiments in NLP
Craig Thomson, Ehud Reiter, Anja Belz
Computational Linguistics (2024) Vol. 50, Iss. 2, pp. 795-805
Open Access | Times Cited: 2
Automatic Histograms: Leveraging Language Models for Text Dataset Exploration
Emily Reif, Crystal Qian, James Wexler, et al.
(2024), pp. 1-9
Open Access | Times Cited: 2