Kam-Fai Wong

Memory-T1: Reinforcement Learning for Temporal Reasoning in Multi-session Agents

Dec 23, 2025
Via arXiv

Dual-Density Inference for Efficient Language Model Reasoning

Dec 17, 2025
Via arXiv

Hybrid Attribution Priors for Explainable and Robust Model Training

Dec 09, 2025
Via arXiv

Explore to Evolve: Scaling Evolved Aggregation Logic via Proactive Online Exploration for Deep Research Agents

Oct 16, 2025
Via arXiv

Beyond Two-Stage Training: Cooperative SFT and RL for LLM Reasoning

Sep 08, 2025
Via arXiv

ReSURE: Regularizing Supervision Unreliability for Multi-turn Dialogue Fine-tuning

Aug 27, 2025
Via arXiv

Vulnerability-Aware Alignment: Mitigating Uneven Forgetting in Harmful Fine-Tuning

Jun 04, 2025
Via arXiv

MiniMax-Remover: Taming Bad Noise Helps Video Object Removal

May 30, 2025
Via arXiv

Bridging the Long-Term Gap: A Memory-Active Policy for Multi-Session Task-Oriented Dialogue

May 26, 2025
Via arXiv

T$^2$: An Adaptive Test-Time Scaling Strategy for Contextual Question Answering

May 23, 2025
Via arXiv