Kam-Fai Wong

Robust Tool Use via Fission-GRPO: Learning to Recover from Execution Errors

Jan 22, 2026
Via arXiv

Memory-T1: Reinforcement Learning for Temporal Reasoning in Multi-session Agents

Dec 23, 2025
Via arXiv

Dual-Density Inference for Efficient Language Model Reasoning

Dec 17, 2025
Via arXiv

Hybrid Attribution Priors for Explainable and Robust Model Training

Dec 09, 2025
Via arXiv

Explore to Evolve: Scaling Evolved Aggregation Logic via Proactive Online Exploration for Deep Research Agents

Oct 16, 2025
Via arXiv

Beyond Two-Stage Training: Cooperative SFT and RL for LLM Reasoning

Sep 08, 2025
Via arXiv

ReSURE: Regularizing Supervision Unreliability for Multi-turn Dialogue Fine-tuning

Aug 27, 2025
Via arXiv

Vulnerability-Aware Alignment: Mitigating Uneven Forgetting in Harmful Fine-Tuning

Jun 04, 2025
Via arXiv

MiniMax-Remover: Taming Bad Noise Helps Video Object Removal

May 30, 2025
Via arXiv

Bridging the Long-Term Gap: A Memory-Active Policy for Multi-Session Task-Oriented Dialogue

May 26, 2025
Via arXiv