Afshin Rostamizadeh

This Time is Different: An Observability Perspective on Time Series Foundation Models

May 20, 2025

Analyzing Similarity Metrics for Data Selection for Language Model Pretraining

Feb 04, 2025

A Little Help Goes a Long Way: Efficient LLM Training by Leveraging Small LMs

Oct 24, 2024

No more hard prompts: SoftSRV prompting for synthetic data generation

Oct 23, 2024

SpacTor-T5: Pre-training T5 Models with Span Corruption and Replaced Token Detection

Jan 24, 2024

DistillSpec: Improving Speculative Decoding via Knowledge Distillation

Oct 12, 2023

Leveraging Importance Weights in Subset Selection

Jan 28, 2023

Is margin all you need? An extensive empirical study of active learning on tabular data

Oct 07, 2022

Batch Active Learning at Scale

Jul 29, 2021

Churn Reduction via Distillation

Jun 04, 2021