
OpenAlex is an open-access bibliographic catalogue of scientific papers, authors and institutions, named after the Library of Alexandria. Its citation coverage is excellent, and I hope you will find this listing of citing articles useful!
Clicking an article title takes you to the article as listed in CrossRef. Clicking an Open Access link takes you to the "best Open Access location". Clicking the citation count opens this same listing for that article. Lastly, at the bottom of the page, you'll find basic pagination options.
Requested Article:
Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity
Haojun Xia, Zheng Zhen, Yuchao Li, et al.
Proceedings of the VLDB Endowment (2023) Vol. 17, Iss. 2, pp. 211-224
Open Access | Times Cited: 7
Showing 7 citing articles:
A Survey on Model Compression for Large Language Models
Xunyu Zhu, Jian Li, Yong Liu, et al.
Transactions of the Association for Computational Linguistics (2024) Vol. 12, pp. 1556-1577
Open Access | Times Cited: 20
PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU
Yixin Song, Zeyu Mi, Haotong Xie, et al.
(2024), pp. 590-606
Open Access | Times Cited: 8
Navigating the Web of Disinformation and Misinformation: Large Language Models as Double-Edged Swords
Siddhant Bikram Shah, Surendrabikram Thapa, Ashish Acharya, et al.
IEEE Access (2024), pp. 1-1
Open Access | Times Cited: 6
Optimizing Dynamic-Shape Neural Networks on Accelerators via On-the-Fly Micro-Kernel Polymerization
Feng Yu, Guangli Li, Jiacheng Zhao, et al.
(2024), pp. 797-812
Closed Access | Times Cited: 2
A review of AI edge devices and lightweight CNN deployment
Kailai Sun, Xinwei Wang, Xi Miao, et al.
Neurocomputing (2024) Vol. 614, pp. 128791-128791
Closed Access | Times Cited: 2
AutoMC: Automated Model Compression Based on Domain Knowledge and Progressive Search
Chunnan Wang, Hongzhi Wang, Xiangyu Shi
2022 IEEE 38th International Conference on Data Engineering (ICDE) (2024), pp. 1819-1832
Open Access
Accelerating and Compressing Transformer-Based PLMs for Enhanced Comprehension of Computer Terminology
Jian Peng, Kai Zhong
Future Internet (2024) Vol. 16, Iss. 11, pp. 385-385
Open Access