Sanjiv Kumar

Does label smoothing mitigate label noise?

Mar 05, 2020
Michal Lukasik, Srinadh Bhojanapalli, Aditya Krishna Menon, Sanjiv Kumar

Adaptive Federated Optimization

Feb 29, 2020
Sashank Reddi, Zachary Charles, Manzil Zaheer, Zachary Garrett, Keith Rush, Jakub Konečný, Sanjiv Kumar, H. Brendan McMahan

Low-Rank Bottleneck in Multi-head Attention Models

Feb 17, 2020
Srinadh Bhojanapalli, Chulhee Yun, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar

Pre-training Tasks for Embedding-based Large-scale Retrieval

Feb 10, 2020
Wei-Cheng Chang, Felix X. Yu, Yin-Wen Chang, Yiming Yang, Sanjiv Kumar

Are Transformers universal approximators of sequence-to-sequence functions?

Dec 20, 2019
Chulhee Yun, Srinadh Bhojanapalli, Ankit Singh Rawat, Sashank J. Reddi, Sanjiv Kumar

Why ADAM Beats SGD for Attention Models

Dec 06, 2019
Jingzhao Zhang, Sai Praneeth Karimireddy, Andreas Veit, Seungyeon Kim, Sashank J Reddi, Sanjiv Kumar, Suvrit Sra

Learning to Learn by Zeroth-Order Oracle

Oct 21, 2019
Yangjun Ruan, Yuanhao Xiong, Sashank Reddi, Sanjiv Kumar, Cho-Jui Hsieh

Online Hierarchical Clustering Approximations

Sep 20, 2019
Aditya Krishna Menon, Anand Rajagopalan, Baris Sumengen, Gui Citovsky, Qin Cao, Sanjiv Kumar

New Loss Functions for Fast Maximum Inner Product Search

Sep 11, 2019
Ruiqi Guo, Quan Geng, David Simcha, Felix Chern, Sanjiv Kumar, Xiang Wu
