Yinglun Zhu

Online Finetuning Decision Transformers with Pure RL Gradients

Jan 01, 2026
Via arXiv

Interactive Machine Learning: From Theory to Scale

Dec 30, 2025
Via arXiv

Strategic Scaling of Test-Time Compute: A Bandit Learning Approach

Jun 15, 2025
Via arXiv

Mixtraining: A Better Trade-Off Between Compute and Performance

Feb 26, 2025
Via arXiv

Efficient Sparse PCA via Block-Diagonalization

Oct 18, 2024
Via arXiv

Efficient Sequential Decision Making with Large Language Models

Jun 17, 2024
Via arXiv

An Experimental Design Framework for Label-Efficient Supervised Finetuning of Large Language Models

Jan 12, 2024
Via arXiv

LabelBench: A Comprehensive Framework for Benchmarking Label-Efficient Learning

Jun 16, 2023
Via arXiv

Infinite Action Contextual Bandits with Reusable Data Exhaust

Feb 16, 2023
Via arXiv

Active Learning with Neural Networks: Insights from Nonparametric Statistics

Oct 15, 2022
Via arXiv