Zhao Song

LazyDiT: Lazy Learning for the Acceleration of Diffusion Transformers
Dec 17, 2024

Numerical Pruning for Efficient Autoregressive Models
Dec 17, 2024

The Computational Limits of State-Space Models and Mamba via the Lens of Circuit Complexity
Dec 09, 2024

Curse of Attention: A Kernel-Based Perspective for Why Transformers Fail to Generalize on Time Series Forecasting and Beyond
Dec 08, 2024

On Socially Fair Low-Rank Approximation and Column Subset Selection
Dec 08, 2024

On the Expressive Power of Modern Hopfield Networks
Dec 07, 2024

Transformers are Deep Optimizers: Provable In-Context Learning for Deep Model Training
Nov 25, 2024

Fundamental Limits of Prompt Tuning Transformers: Universality, Capacity and Efficiency
Nov 25, 2024

Circuit Complexity Bounds for RoPE-based Transformer Architecture
Nov 12, 2024

On Differentially Private String Distances
Nov 08, 2024