Hung Le

Federated Domain Generalization with Latent Space Inversion

Dec 11, 2025
Via arXiv

Empowering Multi-Turn Tool-Integrated Reasoning with Group Turn Policy Optimization

Nov 18, 2025
Via arXiv

Uncertainty-Guided Checkpoint Selection for Reinforcement Finetuning of Large Language Models

Nov 13, 2025
Via arXiv

Probabilities Are All You Need: A Probability-Only Approach to Uncertainty Estimation in Large Language Models

Nov 10, 2025
Via arXiv

GRAD: Graph-Retrieved Adaptive Decoding for Hallucination Mitigation

Nov 05, 2025
Via arXiv

SPaRFT: Self-Paced Reinforcement Fine-Tuning for Large Language Models

Aug 07, 2025
Via arXiv

DmC: Nearest Neighbor Guidance Diffusion Model for Offline Cross-domain Reinforcement Learning

Jul 28, 2025
Via arXiv

Hybrid Cross-domain Robust Reinforcement Learning

May 29, 2025
Via arXiv

Beyond the Known: Decision Making with Counterfactual Reasoning Decision Transformer

May 14, 2025
Via arXiv

Reasoning Under 1 Billion: Memory-Augmented Reinforcement Learning for Large Language Models

Apr 03, 2025
Via arXiv