Chengye Wang

AbGen: Evaluating Large Language Models in Ablation Study Design and Evaluation for Scientific Research

Jul 17, 2025

SciVer: Evaluating Foundation Models for Multimodal Scientific Claim Verification

Jun 18, 2025

A Neuro-inspired Interpretation of Unlearning in Large Language Models through Sample-level Unlearning Difficulty

Apr 09, 2025

MMVU: Measuring Expert-Level Multi-Discipline Video Understanding

Jan 21, 2025

FinDVer: Explainable Claim Verification over Long and Hybrid-Content Financial Documents

Nov 08, 2024