Jindong Wang

FreeEval: A Modular Framework for Trustworthy and Efficient Evaluation of Large Language Models

Apr 09, 2024
Zhuohao Yu, Chang Gao, Wenjin Yao, Yidong Wang, Zhengran Zeng, Wei Ye, Jindong Wang, Yue Zhang, Shikun Zhang

Via arXiv

Detoxifying Large Language Models via Knowledge Editing

Mar 28, 2024
Mengru Wang, Ningyu Zhang, Ziwen Xu, Zekun Xi, Shumin Deng, Yunzhi Yao, Qishen Zhang, Linyi Yang, Jindong Wang, Huajun Chen

Via arXiv

Learning with Noisy Foundation Models

Mar 11, 2024
Hao Chen, Jindong Wang, Zihan Wang, Ran Tao, Hongxin Wei, Xing Xie, Masashi Sugiyama, Bhiksha Raj

Via arXiv

ERBench: An Entity-Relationship based Automatically Verifiable Hallucination Benchmark for Large Language Models

Mar 08, 2024
Jio Oh, Soyeon Kim, Junseok Seo, Jindong Wang, Ruochen Xu, Xing Xie, Steven Euijong Whang

Via arXiv

NPHardEval4V: A Dynamic Reasoning Benchmark of Multimodal Large Language Models

Mar 05, 2024
Lizhou Fan, Wenyue Hua, Xiang Li, Kaijie Zhu, Mingyu Jin, Lingyao Li, Haoyang Ling, Jinkui Chi, Jindong Wang, Xin Ma, Yongfeng Zhang

Via arXiv

Adversarial example soups: averaging multiple adversarial examples improves transferability without increasing additional generation time

Feb 27, 2024
Bo Yang, Hengwei Zhang, Chenwei Li, Jindong Wang

Via arXiv

LSTPrompt: Large Language Models as Zero-Shot Time Series Forecasters by Long-Short-Term Prompting

Feb 25, 2024
Haoxin Liu, Zhiyuan Zhao, Jindong Wang, Harshavardhan Kamarthi, B. Aditya Prakash

Via arXiv

KIEval: A Knowledge-grounded Interactive Evaluation Framework for Large Language Models

Feb 23, 2024
Zhuohao Yu, Chang Gao, Wenjin Yao, Yidong Wang, Wei Ye, Jindong Wang, Xing Xie, Yue Zhang, Shikun Zhang

Via arXiv

MM-Soc: Benchmarking Multimodal Large Language Models in Social Media Platforms

Feb 21, 2024
Yiqiao Jin, Minje Choi, Gaurav Verma, Jindong Wang, Srijan Kumar

Via arXiv