Ruichong Zhang

Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence

Apr 10, 2024

Cramer Type Distances for Learning Gaussian Mixture Models by Gradient Descent

Jul 13, 2023

RWKV: Reinventing RNNs for the Transformer Era

May 22, 2023