WanJuan-CC: A Safe and High-Quality Open-sourced English Webtext Dataset
Jiantao Qiu, Haijun Lv, Zhenjiang Jin, Rui Wang, Wenchang Ning, Jia Yu, ChaoBin Zhang, Zhenxiang Li, Pei Chu, Yuan Qu, Jin Shi, Lindong Lu, Runyu Peng, Zhiyuan Zeng, Huanze Tang, Zhikai Lei, Jiawei Hong, Keyu Chen, Zhaoye Fei, Ruiliang Xu, Wei Li, Zhongying Tu, Lin Dahua, Yu Qiao, Hang Yan, Conghui He. CoRR (2024)
