Binyuan Hui

DevBench: A Comprehensive Benchmark for Software Development

Mar 15, 2024
Bowen Li, Wenhan Wu, Ziwei Tang, Lin Shi, John Yang, Jinyang Li, Shunyu Yao, Chen Qian, Binyuan Hui, Qicheng Zhang, Zhiyin Yu, He Du, Ping Yang, Dahua Lin, Chao Peng, Kai Chen

Via arXiv

StarCoder 2 and The Stack v2: The Next Generation

Feb 29, 2024
Anton Lozhkov, Raymond Li, Loubna Ben Allal, Federico Cassano, Joel Lamy-Poirier, Nouamane Tazi, Ao Tang, Dmytro Pykhtar, Jiawei Liu, Yuxiang Wei, Tianyang Liu, Max Tian, Denis Kocetkov, Arthur Zucker, Younes Belkada, Zijian Wang, Qian Liu, Dmitry Abulkhanov, Indraneil Paul, Zhuang Li, Wen-Ding Li, Megan Risdal, Jia Li, Jian Zhu, Terry Yue Zhuo, Evgenii Zheltonozhskii, Nii Osae Osae Dade, Wenhao Yu, Lucas Krauß, Naman Jain, Yixuan Su, Xuanli He, Manan Dey, Edoardo Abati, Yekun Chai, Niklas Muennighoff, Xiangru Tang, Muhtasham Oblokulov, Christopher Akiki, Marc Marone, Chenghao Mou, Mayank Mishra, Alex Gu, Binyuan Hui, Tri Dao, Armel Zebaze, Olivier Dehaene, Nicolas Patry, Canwen Xu, Julian McAuley, Han Hu, Torsten Scholak, Sebastien Paquet, Jennifer Robinson, Carolyn Jane Anderson, Nicolas Chapados, Mostofa Patwary, Nima Tajbakhsh, Yacine Jernite, Carlos Muñoz Ferrandis, Lingming Zhang, Sean Hughes, Thomas Wolf, Arjun Guha, Leandro von Werra, Harm de Vries

Via arXiv

One Shot Learning as Instruction Data Prospector for Large Language Models

Jan 04, 2024
Yunshui Li, Binyuan Hui, Xiaobo Xia, Jiaxi Yang, Min Yang, Lei Zhang, Shuzheng Si, Junhao Liu, Tongliang Liu, Fei Huang, Yongbin Li

Via arXiv

DialCLIP: Empowering CLIP as Multi-Modal Dialog Retriever

Jan 03, 2024
Zhichao Yin, Binyuan Hui, Min Yang, Fei Huang, Yongbin Li

Via arXiv

An Investigation of LLMs' Inefficacy in Understanding Converse Relations

Oct 25, 2023
Chengwen Qi, Bowen Li, Binyuan Hui, Bailin Wang, Jinyang Li, Jinwang Wu, Yuanjun Laili

Via arXiv

Lemur: Harmonizing Natural Language and Code for Language Agents

Oct 10, 2023
Yiheng Xu, Hongjin Su, Chen Xing, Boyu Mi, Qian Liu, Weijia Shi, Binyuan Hui, Fan Zhou, Yitao Liu, Tianbao Xie, Zhoujun Cheng, Siheng Zhao, Lingpeng Kong, Bailin Wang, Caiming Xiong, Tao Yu

Via arXiv

Qwen Technical Report

Sep 28, 2023
Jinze Bai, Shuai Bai, Yunfei Chu, Zeyu Cui, Kai Dang, Xiaodong Deng, Yang Fan, Wenbin Ge, Yu Han, Fei Huang, Binyuan Hui, Luo Ji, Mei Li, Junyang Lin, Runji Lin, Dayiheng Liu, Gao Liu, Chengqiang Lu, Keming Lu, Jianxin Ma, Rui Men, Xingzhang Ren, Xuancheng Ren, Chuanqi Tan, Sinan Tan, Jianhong Tu, Peng Wang, Shijie Wang, Wei Wang, Shengguang Wu, Benfeng Xu, Jin Xu, An Yang, Hao Yang, Jian Yang, Shusheng Yang, Yang Yao, Bowen Yu, Hongyi Yuan, Zheng Yuan, Jianwei Zhang, Xingxuan Zhang, Yichang Zhang, Zhenru Zhang, Chang Zhou, Jingren Zhou, Xiaohuan Zhou, Tianhang Zhu

Via arXiv

VDialogUE: A Unified Evaluation Benchmark for Visually-grounded Dialogue

Sep 14, 2023
Yunshui Li, Binyuan Hui, Zhaochao Yin, Wanwei He, Run Luo, Yuxing Long, Min Yang, Fei Huang, Yongbin Li

Via arXiv