Shuchen Zhu

Negligible in Size, Significant in Effect: On Scale Vectors in Large Language Models

May 26, 2026
Via arXiv

Subspace Optimization for Efficient Federated Learning under Heterogeneous Data

Apr 28, 2026
Via arXiv

Accelerating LLM Pre-Training through Flat-Direction Dynamics Enhancement

Feb 26, 2026
Via arXiv

SPARKLE: A Unified Single-Loop Primal-Dual Framework for Decentralized Bilevel Optimization

Nov 21, 2024
Via arXiv

Decentralized Bilevel Optimization over Graphs: Loopless Algorithmic Update and Transient Iteration Complexity

Feb 05, 2024
Via arXiv