Xuan-Phi Nguyen

Least-Loaded Expert Parallelism: Load Balancing An Imbalanced Mixture-of-Experts

Jan 23, 2026

MAS-Orchestra: Understanding and Improving Multi-Agent Reasoning Through Holistic Orchestration and Controlled Benchmarks

Jan 21, 2026

MAS-ZERO: Designing Multi-Agent Systems with Zero Supervision

May 26, 2025

Meta-Design Matters: A Self-Design Multi-Agent System

May 21, 2025

J4R: Learning to Judge with Equivalent Initial State Group Relative Preference Optimization

May 19, 2025

A Survey of Frontiers in LLM Reasoning: Inference Scaling, Learning to Reason, and Agentic Systems

Apr 12, 2025

Demystifying Domain-adaptive Post-training for Financial LLMs

Jan 09, 2025

Discovering the Gems in Early Layers: Accelerating Long-Context LLMs with 1000x Input Token Reduction

Sep 25, 2024

SFR-RAG: Towards Contextually Faithful LLMs

Sep 16, 2024

ParaICL: Towards Robust Parallel In-Context Learning

Mar 31, 2024