Fangcheng Fu

SALE: Low-bit Estimation for Efficient Sparse Attention in Long-context LLM Prefilling

May 30, 2025

Thinking Short and Right Over Thinking Long: Serving LLM Reasoning Efficiently and Accurately

May 19, 2025

Galvatron: An Automatic Distributed System for Efficient Foundation Model Training

Apr 30, 2025

Training-free and Adaptive Sparse Attention for Efficient Long Video Generation

Feb 28, 2025

ByteScale: Efficient Scaling of LLM Training with a 2048K Context Length on More Than 12,000 GPUs

Feb 28, 2025

Demystifying Workload Imbalances in Large Transformer Model Training over Variable-length Sequences

Dec 10, 2024

Data-Centric and Heterogeneity-Adaptive Sequence Parallelism for Efficient LLM Training

Dec 02, 2024

Gradual Learning: Optimizing Fine-Tuning with Partially Mastered Knowledge in Large Language Models

Oct 08, 2024

Retrofitting Temporal Graph Neural Networks with Transformer

Sep 10, 2024

Efficient Multi-Task Large Model Training via Data Heterogeneity-aware Model Management

Sep 05, 2024