Dai Do

Uncertainty-Guided Checkpoint Selection for Reinforcement Finetuning of Large Language Models

Nov 13, 2025
Via arXiv

GRAD: Graph-Retrieved Adaptive Decoding for Hallucination Mitigation

Nov 05, 2025
Via arXiv

SPaRFT: Self-Paced Reinforcement Fine-Tuning for Large Language Models

Aug 07, 2025
Via arXiv

Reasoning Under 1 Billion: Memory-Augmented Reinforcement Learning for Large Language Models

Apr 03, 2025
Via arXiv

Large Language Models Prompting With Episodic Memory

Aug 14, 2024
Via arXiv