Lu Yu

Diffusion Models with Heavy-Tailed Targets: Score Estimation and Sampling Guarantees

Jan 10, 2026

Rethinking Recurrent Neural Networks for Time Series Forecasting: A Reinforced Recurrent Encoder with Prediction-Oriented Proximal Policy Optimization

Jan 07, 2026

Improving Autoformalization Using Direct Dependency Retrieval

Nov 15, 2025

Efficient Text-Attributed Graph Learning through Selective Annotation and Graph Alignment

Jun 08, 2025

Locality Preserving Markovian Transition for Instance Retrieval

Jun 05, 2025

GIFStream: 4D Gaussian-based Immersive Video with Feature Stream

May 12, 2025

Language Guided Concept Bottleneck Models for Interpretable Continual Learning

Mar 30, 2025

MASS: Mathematical Data Selection via Skill Graphs for Pretraining Large Language Models

Mar 19, 2025

Every FLOP Counts: Scaling a 300B Mixture-of-Experts LING LLM without Premium GPUs

Mar 07, 2025

Advancing Wasserstein Convergence Analysis of Score-Based Models: Insights from Discretization and Second-Order Acceleration

Feb 07, 2025