Wenhao Huang

Encyclo-K: Evaluating LLMs with Dynamically Composed Knowledge Statements

Dec 31, 2025

Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space

Dec 31, 2025

LLM Swiss Round: Aggregating Multi-Benchmark Performance via Competitive Swiss-System Dynamics

Dec 24, 2025

Mitigating LLM Hallucination via Behaviorally Calibrated Reinforcement Learning

Dec 22, 2025

NL2Repo-Bench: Towards Long-Horizon Repository Generation Evaluation of Coding Agents

Dec 14, 2025

AutoMV: An Automatic Multi-Agent System for Music Video Generation

Dec 13, 2025

DiscoX: Benchmarking Discourse-Level Translation task in Expert Domains

Nov 14, 2025

LPFQA: A Long-Tail Professional Forum-based Benchmark for LLM Evaluation

Nov 09, 2025

RLoop: An Self-Improving Framework for Reinforcement Learning with Iterative Policy Initialization

Nov 06, 2025

Scaling Latent Reasoning via Looped Language Models

Oct 29, 2025