Sewoong Oh

Data Mixture Inference: What do BPE Tokenizers Reveal about their Training Data?

Jul 24, 2024

Understanding the Gains from Repeated Self-Distillation

Jul 05, 2024

PLeaS -- Merging Models with Permutations and Least Squares

Jul 02, 2024

DataComp-LM: In search of the next generation of training sets for language models

Jun 18, 2024

Multilingual Diversity Improves Vision-Language Representations

May 27, 2024

Air Gap: Protecting Privacy-Conscious Conversational Agents

May 08, 2024

Improved Communication-Privacy Trade-offs in $L_2$ Mean Estimation under Streaming Differential Privacy

May 02, 2024

Insufficient Statistics Perturbation: Stable Estimators for Private Least Squares

Apr 23, 2024

On the Convergence of Differentially-Private Fine-tuning: To Linearly Probe or to Fully Fine-tune?

Feb 29, 2024

Privacy-Preserving Instructions for Aligning Large Language Models

Feb 21, 2024