Yang Yu

Tsinghua University

Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

Apr 14, 2026
Via arXiv

Adapting 2D Multi-Modal Large Language Model for 3D CT Image Analysis

Apr 11, 2026
Via arXiv

ReinVBC: A Model-based Reinforcement Learning Approach to Vehicle Braking Controller

Apr 06, 2026
Via arXiv

Off-Policy Value-Based Reinforcement Learning for Large Language Models

Mar 24, 2026
Via arXiv

VLGOR: Visual-Language Knowledge Guided Offline Reinforcement Learning for Generalizable Agents

Mar 24, 2026
Via arXiv

Non-Adversarial Imitation Learning Provably Free of Compounding Errors: The Role of Bellman Constraints

Mar 24, 2026
Via arXiv

RLVR Training of LLMs Does Not Improve Thinking Ability for General QA: Evaluation Method and a Simple Solution

Mar 21, 2026
Via arXiv

Towards Practical World Model-based Reinforcement Learning for Vision-Language-Action Models

Mar 21, 2026
Via arXiv

Speedup Patch: Learning a Plug-and-Play Policy to Accelerate Embodied Manipulation

Mar 21, 2026
Via arXiv

SIGMA: A Semantic-Grounded Instruction-Driven Generative Multi-Task Recommender at AliExpress

Feb 26, 2026
Via arXiv