Amir Gholami

UC Berkeley/LBNL/ICSI

LoSA: Locality Aware Sparse Attention for Block-Wise Diffusion Language Models

Apr 13, 2026
Via arXiv

On Neural Scaling Laws for Weather Emulation through Continual Training

Mar 26, 2026
Via arXiv

Agentic Test-Time Scaling for WebAgents

Feb 12, 2026
Via arXiv

Residual Context Diffusion Language Models

Jan 30, 2026
Via arXiv

XQuant: Breaking the Memory Wall for LLM Inference with KV Cache Rematerialization

Aug 14, 2025
Via arXiv

Multipole Attention for Efficient Long Context Reasoning

Jun 16, 2025
Via arXiv

Plan-and-Act: Improving Planning of Agents for Long-Horizon Tasks

Mar 12, 2025
Via arXiv

ETS: Efficient Tree Search for Inference-Time Scaling

Feb 19, 2025
Via arXiv

Squeezed Attention: Accelerating Long Context Length LLM Inference

Nov 14, 2024
Via arXiv

Efficient and Scalable Estimation of Tool Representations in Vector Space

Sep 02, 2024
Via arXiv