Raffy Fahim

Mixture of Quantized Experts (MoQE): Complementary Effect of Low-bit Quantization and Robustness

Oct 03, 2023
Young Jin Kim, Raffy Fahim, Hany Hassan Awadalla

Via arXiv

FineQuant: Unlocking Efficiency with Fine-Grained Weight-Only Quantization for LLMs

Aug 16, 2023
Young Jin Kim, Rawn Henry, Raffy Fahim, Hany Hassan Awadalla

Via arXiv

Who Says Elephants Can't Run: Bringing Large Scale MoE Models into Cloud Scale Production

Nov 18, 2022
Young Jin Kim, Rawn Henry, Raffy Fahim, Hany Hassan Awadalla

Via arXiv