Distill Not Only Data but Also Rewards: Can Smaller Language Models Surpass Larger Ones?
Yudi Zhang, Lu Wang, Meng Fang, Yali Du, Chenghua Huang, Jun Wang, Qingwei Lin, Mykola Pechenizkiy, Dongmei Zhang, Saravan Rajmohan, Qi Zhang
CoRR (2025)