HiRE: High Recall Approximate Top-$k$ Estimation for Efficient LLM Inference

Feb 14, 2024
Yashas Samaga B L, Varun Yerram, Chong You, Srinadh Bhojanapalli, Sanjiv Kumar, Prateek Jain, Praneeth Netrapalli

View paper on arXiv