Jingwei Ni

When AI Benchmarks Plateau: A Systematic Study of Benchmark Saturation

Feb 18, 2026

pdfQA: Diverse, Challenging, and Realistic Question Answering over PDFs

Jan 06, 2026

Reasoning with Confidence: Efficient Verification of LLM Reasoning Steps via Uncertainty Heads

Nov 11, 2025

Apertus: Democratizing Open and Compliant LLMs for Global Language Environments

Sep 17, 2025

Can Large Language Models Capture Human Annotator Disagreements?

Jun 24, 2025

LEXam: Benchmarking Legal Reasoning on 340 Law Exams

May 19, 2025

Navigating the Helpfulness-Truthfulness Trade-Off with Uncertainty-Aware Instruction Fine-Tuning

Feb 17, 2025

DIRAS: Efficient LLM-Assisted Annotation of Document Relevance in Retrieval Augmented Generation

Jun 20, 2024

ClimRetrieve: A Benchmarking Dataset for Information Retrieval from Corporate Climate Disclosures

Jun 14, 2024

Towards Faithful and Robust LLM Specialists for Evidence-Based Question-Answering

Feb 26, 2024