OpenAlex Citation Counts

OpenAlex is a bibliographic catalogue of scientific papers, authors and institutions accessible in open access mode, named after the Library of Alexandria. Its citation coverage is excellent, and I hope you will find this listing of citing articles useful!

If you click the article title, you'll navigate to the article, as listed in CrossRef. If you click the Open Access links, you'll navigate to the "best Open Access location". Clicking the citation count will open this listing for that article. Lastly, at the bottom of the page, you'll find basic pagination options.

Requested Article:

PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU
Yixin Song, Zeyu Mi, Haotong Xie, et al.
(2024), pp. 590-606
Open Access | Times Cited: 8

Showing 8 citing articles:

MoE-Lightning: High-Throughput MoE Inference on Memory-constrained GPUs
Shiyi Cao, Shu Liu, Tyler Griggs, et al.
(2025), pp. 715-730
Closed Access

A Survey: Collaborative Hardware and Software Design in the Era of Large Language Models
Cong Guo, Feng Cheng, Zhixu Du, et al.
IEEE Circuits and Systems Magazine (2025) Vol. 25, Iss. 1, pp. 35-57
Open Access

Frugal: Efficient and Economic Embedding Model Training with Commodity GPUs
Minhui Xie, Shaoxun Zeng, Hao Guo, et al.
(2025), pp. 509-523
Closed Access

Accelerating Mixture-of-Experts language model inference via plug-and-play lookahead gate on a single GPU
Jie Ou, Yueming Chen, Buyao Xiong, et al.
Computer Standards & Interfaces (2025), pp. 103996-103996
Closed Access

Achieving Peak Performance for Large Language Models: A Systematic Review
Zhyar Rzgar K Rostam, Sándor Szénási, Gábor Kertész
IEEE Access (2024) Vol. 12, pp. 96017-96050
Open Access | Times Cited: 2

A review of AI edge devices and lightweight CNN deployment
Kailai Sun, Xinwei Wang, Xi Miao, et al.
Neurocomputing (2024) Vol. 614, pp. 128791-128791
Closed Access | Times Cited: 2

Governing Open Vocabulary Data Leaks Using an Edge LLM through Programming by Example
Qiyu Li, J. Wen, Haojian Jin
Proceedings of the ACM on Interactive Mobile Wearable and Ubiquitous Technologies (2024) Vol. 8, Iss. 4, pp. 1-31
Open Access

Cambricon-LLM: A Chiplet-Based Hybrid Architecture for On-Device Inference of 70B LLM
Zhongkai Yu, Shengwen Liang, Tianyun Ma, et al.
(2024), pp. 1474-1488
Closed Access

Page 1
