
Da Zheng

TabDLM: Free-Form Tabular Data Generation via Joint Numerical-Language Diffusion

Feb 26, 2026
Via arXiv

GREPO: A Benchmark for Graph Neural Networks on Repository-Level Bug Localization

Feb 14, 2026
Via arXiv

LLaDA2.1: Speeding Up Text Diffusion via Token Editing

Feb 09, 2026
Via arXiv

Full-Graph vs. Mini-Batch Training: Comprehensive Analysis from a Batch Size and Fan-Out Size Perspective

Jan 30, 2026
Via arXiv

LLaDA2.0: Scaling Up Diffusion Language Models to 100B

Dec 24, 2025
Via arXiv

EvolProver: Advancing Automated Theorem Proving by Evolving Formalized Problems via Symmetry and Difficulty

Oct 01, 2025
Via arXiv

Why Do Open-Source LLMs Struggle with Data Analysis? A Systematic Empirical Study

Jun 24, 2025
Via arXiv

AutoMind: Adaptive Knowledgeable Agent for Automated Data Science

Jun 12, 2025
Via arXiv

Right Is Not Enough: The Pitfalls of Outcome Supervision in Training LLMs for Math Reasoning

Jun 07, 2025
Via arXiv

dots.llm1 Technical Report

Jun 06, 2025
Via arXiv