Sanjiv Kumar

Google Research

Supervision Complexity and its Role in Knowledge Distillation

Jan 28, 2023

EmbedDistill: A Geometric Knowledge Distillation for Information Retrieval

Jan 27, 2023

Automating Nearest Neighbor Search Configuration with Constrained Optimization

Jan 04, 2023

Large Language Models with Controllable Working Memory

Nov 09, 2022

Preserving In-Context Learning ability in Large Language Model Fine-tuning

Nov 01, 2022

When does mixup promote local linearity in learned representations?

Oct 28, 2022

Large Models are Parsimonious Learners: Activation Sparsity in Trained Transformers

Oct 12, 2022

Decoupled Context Processing for Context Augmented Language Modeling

Oct 11, 2022

Teacher Guided Training: An Efficient Framework for Knowledge Transfer

Aug 14, 2022

TPU-KNN: K Nearest Neighbor Search at Peak FLOP/s

Jun 30, 2022