Percy Liang

Shammie

Estimating near-verbatim extraction risk in language models with decoding-constrained beam search

Mar 26, 2026
Via arXiv

Data-efficient pre-training by scaling synthetic megadocs

Mar 19, 2026
Via arXiv

Replaying pre-training data improves fine-tuning

Mar 05, 2026
Via arXiv

Beyond a Single Extractor: Re-thinking HTML-to-Text Extraction for LLM Pretraining

Feb 23, 2026
Via arXiv

VLAW: Iterative Co-Improvement of Vision-Language-Action Policy and World Model

Feb 15, 2026
Via arXiv

When RL Meets Adaptive Speculative Training: A Unified Training-Serving System

Feb 06, 2026
Via arXiv

Cosmos Policy: Fine-Tuning Video Models for Visuomotor Control and Planning

Jan 22, 2026
Via arXiv

RoboReward: General-Purpose Vision-Language Reward Models for Robotics

Jan 08, 2026
Via arXiv

Extracting books from production language models

Jan 06, 2026
Via arXiv

The 2025 Foundation Model Transparency Index

Dec 11, 2025
Via arXiv