Tianle Gu

A Rigorous Benchmark with Multidimensional Evaluation for Deep Research Agents: From Answers to Reports

Oct 02, 2025

S2J: Bridging the Gap Between Solving and Judging Ability in Generative Reward Models

Sep 26, 2025

SafeWork-R1: Coevolving Safety and Intelligence under the AI-45$^{\circ}$ Law

Jul 24, 2025

Invisible Entropy: Towards Safe and Efficient Low-Entropy LLM Watermarking

May 20, 2025

From Rankings to Insights: Evaluation Should Shift Focus from Leaderboard to Feedback

May 10, 2025

Word Form Matters: LLMs' Semantic Reconstruction under Typoglycemia

Mar 03, 2025

Recent Advances in Large Language Model Benchmarks against Data Contamination: From Static to Dynamic Evaluation

Feb 23, 2025

A Cognitive Writing Perspective for Constrained Long-Form Text Generation

Feb 19, 2025

HoneypotNet: Backdoor Attacks Against Model Extraction

Jan 02, 2025

MEOW: MEMOry Supervised LLM Unlearning Via Inverted Facts

Sep 18, 2024