Cognitive Episodes in LLM Reasoning Traces Enable Interpretable Human Item Difficulty Prediction

Predicting how hard an exam question will be for human test-takers, without running expensive human trials, would transform educational assessment. This paper proposes using the reasoning traces of large language models as a proxy for human cognitive effort. Rather than treating these traces as raw text, the proposed method, Epi2Diff, structures them into meaningful "cognitive episodes" (functional states such as planning, implementing, and verifying) and uses the dynamics between these states to predict difficulty. Tested on four real-world human difficulty datasets, including SAT-derived benchmarks, Epi2Diff consistently outperforms strong baselines. Applications include automated test construction, adaptive learning platforms, and AI-assisted item difficulty calibration for standardized assessments.

Authors: Chenguang Wang, Ming Li, Xinyue Zeng, Zhuochun Li, Hong Jiao, Tianyi Zhou, Dawei Zhou

Paper: https://arxiv.org/abs/2606.28186v1
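
To make the idea of "dynamics between states" concrete, here is a minimal sketch of one way a labeled reasoning trace could be turned into transition features. It is an illustration only, not the Epi2Diff method: the label names, the `transition_features` helper, and the two example traces are all hypothetical, and the description above does not say how the paper segments traces or which features it uses.

```python
from collections import Counter
from itertools import product

# Hypothetical label set: the description names planning, implementing,
# and verifying as examples of episodes; the paper's full set may differ.
EPISODES = ("plan", "implement", "verify")


def transition_features(trace):
    """Return the share of each episode-to-episode transition in a trace.

    `trace` is a list of episode labels in the order they occur.
    This is an assumed feature design, not the one used in the paper.
    """
    pairs = list(zip(trace, trace[1:]))  # consecutive (from, to) pairs
    counts = Counter(pairs)
    total = len(pairs)
    return {
        f"{a}->{b}": (counts[(a, b)] / total if total else 0.0)
        for a, b in product(EPISODES, repeat=2)
    }


# Made-up traces: a direct solution path and one that loops back to planning.
easy_trace = ["plan", "implement", "verify"]
hard_trace = ["plan", "implement", "verify", "plan",
              "implement", "verify", "implement", "verify"]

easy_features = transition_features(easy_trace)
hard_features = transition_features(hard_trace)
```

Under these assumptions, a trace that returns from verifying to planning gets a nonzero `verify->plan` share, which is the kind of signal a downstream difficulty predictor could pick up.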