Zhao Song

A Theoretical Insight into Attack and Defense of Gradient Leakage in Transformer

Nov 22, 2023
Chenyang Li, Zhao Song, Weixin Wang, Chiwun Yang

Fast Heavy Inner Product Identification Between Weights and Inputs in Neural Network Training

Nov 19, 2023
Lianke Qin, Saayan Mitra, Zhao Song, Yuanyuan Yang, Tianyi Zhou

The Expressibility of Polynomial based Attention Scheme

Oct 30, 2023
Zhao Song, Guangyi Xu, Junze Yin

Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time

Oct 26, 2023
Zichang Liu, Jue Wang, Tri Dao, Tianyi Zhou, Binhang Yuan, Zhao Song, Anshumali Shrivastava, Ce Zhang, Yuandong Tian, Christopher Re, Beidi Chen

Unmasking Transformers: A Theoretical Approach to Data Recovery via Attention Weights

Oct 19, 2023
Yichuan Deng, Zhao Song, Shenghao Xie, Chiwun Yang

Superiority of Softmax: Unveiling the Performance Edge Over Linear Attention

Oct 18, 2023
Yichuan Deng, Zhao Song, Tianyi Zhou

An Automatic Learning Rate Schedule Algorithm for Achieving Faster Convergence and Steeper Descent

Oct 17, 2023
Zhao Song, Chiwun Yang

How to Capture Higher-order Correlations? Generalizing Matrix Softmax Attention to Kronecker Computation

Oct 06, 2023
Josh Alman, Zhao Song

Fine-tune Language Models to Approximate Unbiased In-context Learning

Oct 05, 2023
Timothy Chu, Zhao Song, Chiwun Yang

A Unified Scheme of ResNet and Softmax

Sep 23, 2023
Zhao Song, Weixin Wang, Junze Yin
