
Zhao Song

Exploring the Frontiers of Softmax: Provable Optimization, Applications in Diffusion Model, and Beyond

May 06, 2024
Via arXiv

How to Inverting the Leverage Score Distribution?

Apr 21, 2024
Via arXiv

Attention is Naturally Sparse with Gaussian Distributed Input

Apr 03, 2024
Via arXiv

On Computational Limits of Modern Hopfield Models: A Fine-Grained Complexity Analysis

Feb 14, 2024
Via arXiv

Fourier Circuits in Neural Networks: Unlocking the Potential of Large Language Models in Mathematical Reasoning and Modular Arithmetic

Feb 12, 2024
Via arXiv

Quantum Speedup for Spectral Approximation of Kronecker Products

Feb 10, 2024
Via arXiv

The Fine-Grained Complexity of Gradient Computation for Training Large Language Models

Feb 07, 2024
Via arXiv

Enhancing Stochastic Gradient Descent: A Unified Framework and Novel Acceleration Methods for Faster Convergence

Feb 02, 2024
Via arXiv

Local Convergence of Approximate Newton Method for Two Layer Nonlinear Regression

Nov 26, 2023
Via arXiv

One Pass Streaming Algorithm for Super Long Token Attention Approximation in Sublinear Space

Nov 24, 2023
Via arXiv