Zhao Song

Attention is Naturally Sparse with Gaussian Distributed Input

Apr 03, 2024
Via arXiv

On Computational Limits of Modern Hopfield Models: A Fine-Grained Complexity Analysis

Feb 14, 2024
Via arXiv

Fourier Circuits in Neural Networks: Unlocking the Potential of Large Language Models in Mathematical Reasoning and Modular Arithmetic

Feb 12, 2024
Via arXiv

Quantum Speedup for Spectral Approximation of Kronecker Products

Feb 10, 2024
Via arXiv

The Fine-Grained Complexity of Gradient Computation for Training Large Language Models

Feb 07, 2024
Via arXiv

Enhancing Stochastic Gradient Descent: A Unified Framework and Novel Acceleration Methods for Faster Convergence

Feb 02, 2024
Via arXiv

Local Convergence of Approximate Newton Method for Two Layer Nonlinear Regression

Nov 26, 2023
Via arXiv

Revisiting Quantum Algorithms for Linear Regressions: Quadratic Speedups without Data-Dependent Parameters

Nov 24, 2023
Via arXiv

One Pass Streaming Algorithm for Super Long Token Attention Approximation in Sublinear Space

Nov 24, 2023
Via arXiv

A Theoretical Insight into Attack and Defense of Gradient Leakage in Transformer

Nov 22, 2023
Via arXiv