Zhao Song

Multi-Layer Transformers Gradient Can be Approximated in Almost Linear Time

Aug 23, 2024
Via arXiv

A Tighter Complexity Analysis of SparseGPT

Aug 22, 2024
Via arXiv

Inverting the Leverage Score Gradient: An Efficient Approximate Newton Method

Aug 21, 2024
Via arXiv

Fast John Ellipsoid Computation with Differential Privacy Optimization

Aug 12, 2024
Via arXiv

Differential Privacy of Cross-Attention with Provable Guarantee

Jul 20, 2024
Via arXiv

Differential Privacy Mechanisms in Neural Tangent Kernel Regression

Jul 18, 2024
Via arXiv

On Statistical Rates and Provably Efficient Criteria of Latent Diffusion Transformers (DiTs)

Jul 01, 2024
Via arXiv

Toward Infinite-Long Prefix in Transformer

Jun 20, 2024
Via arXiv

Computational Limits of Low-Rank Adaptation (LoRA) for Transformer-Based Models

Jun 05, 2024
Via arXiv

Tensor Attention Training: Provably Efficient Learning of Higher-order Transformers

May 26, 2024
Via arXiv