
Ted Zadouri

FlashAttention-4: Algorithm and Kernel Pipelining Co-Design for Asymmetric Hardware Scaling

Mar 05, 2026

Hardware-Efficient Attention for Fast Decoding

May 27, 2025

Pushing Mixture of Experts to the Limit: Extremely Parameter Efficient MoE for Instruction Tuning

Sep 11, 2023

High Probability Bounds for Stochastic Continuous Submodular Maximization

Mar 20, 2023