Sadjad Fouladi

Glinthawk: A Two-Tiered Architecture for High-Throughput LLM Inference

Jan 20, 2025

Gemino: Practical and Robust Neural Compression for Video Conferencing

Sep 22, 2022

Parallelization Techniques for Verifying Neural Networks

Apr 26, 2020