
Zhao Song

Enhancing Stochastic Gradient Descent: A Unified Framework and Novel Acceleration Methods for Faster Convergence

Feb 02, 2024

Local Convergence of Approximate Newton Method for Two Layer Nonlinear Regression

Nov 26, 2023

Revisiting Quantum Algorithms for Linear Regressions: Quadratic Speedups without Data-Dependent Parameters

Nov 24, 2023

One Pass Streaming Algorithm for Super Long Token Attention Approximation in Sublinear Space

Nov 24, 2023

A Theoretical Insight into Attack and Defense of Gradient Leakage in Transformer

Nov 22, 2023

Fast Heavy Inner Product Identification Between Weights and Inputs in Neural Network Training

Nov 19, 2023

The Expressibility of Polynomial based Attention Scheme

Oct 30, 2023

Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time

Oct 26, 2023

Unmasking Transformers: A Theoretical Approach to Data Recovery via Attention Weights

Oct 19, 2023

Superiority of Softmax: Unveiling the Performance Edge Over Linear Attention

Oct 18, 2023