A Survey on Evaluation of Large Language Models
Yupeng Chang, Xu Wang, Jindong Wang, Yuan Wu, Linyi Yang, Kaijie Zhu, Hao Chen, Xiaoyuan Yi, Cunxiang Wang, Yidong Wang, Wei Ye, Yue Zhang, Yi Chang, Philip S. Yu, Qiang Yang, Xing Xie
DOI: 10.1145/3641289
Journal: ACM Transactions on Intelligent Systems and Technology
This paper presents a comprehensive review of evaluation methods for LLMs, organized around three key dimensions: what to evaluate, where to evaluate, and how to evaluate. It aims to offer insights to researchers working on LLM evaluation.
Journal Info
ISSN 2157-6904
Quartile
Category | Quartile |
COMPUTER SCIENCE, INFORMATION SYSTEMS | 1 |
Quartile (CN)
Category | Quartile |
Computer Science | 4 |
Computer Science, Artificial Intelligence | 4 |
Computer Science, Information Systems | 4 |