Xiaolu Zhang

Short-Path Prompting in LLMs: Analyzing Reasoning Instability and Solutions for Robust Performance

Apr 13, 2025
Via arXiv

Effective and Efficient Masked Image Generation Models

Mar 10, 2025
Via arXiv

Every FLOP Counts: Scaling a 300B Mixture-of-Experts LING LLM without Premium GPUs

Mar 07, 2025
Via arXiv

Keypoint-based Progressive Chain-of-Thought Distillation for LLMs

May 25, 2024
Via arXiv

Unifying Bayesian Flow Networks and Diffusion Models through Stochastic Differential Equations

Apr 24, 2024
Via arXiv

AntDT: A Self-Adaptive Distributed Training Framework for Leader and Straggler Nodes

Apr 15, 2024
Via arXiv

Breaking the Length Barrier: LLM-Enhanced CTR Prediction in Long Textual User Behaviors

Mar 28, 2024
Via arXiv

G-Meta: Distributed Meta Learning in GPU Clusters for Large-Scale Recommender Systems

Jan 09, 2024
Via arXiv

An Adaptive Placement and Parallelism Framework for Accelerating RLHF Training

Dec 19, 2023
Via arXiv

One Model for All: Large Language Models are Domain-Agnostic Recommendation Systems

Oct 22, 2023
Via arXiv