Sewoong Oh

OpenThoughts: Data Recipes for Reasoning Models

Jun 05, 2025
Via arXiv

Zeroth-Order Optimization Finds Flat Minima

Jun 05, 2025
Via arXiv

Recycling the Web: A Method to Enhance Pre-training Data Quality and Quantity for Language Models

Jun 05, 2025
Via arXiv

Foundation model for mass spectrometry proteomics

May 19, 2025
Via arXiv

A False Sense of Privacy: Evaluating Textual Data Sanitization Beyond Surface-level Privacy Leakage

Apr 28, 2025
Via arXiv

Open Deep Search: Democratizing Search with Open-source Reasoning Agents

Mar 26, 2025
Via arXiv

SuperBPE: Space Travel for Language Models

Mar 17, 2025
Via arXiv

S4S: Solving for a Diffusion Model Solver

Feb 24, 2025
Via arXiv

Scalable Fingerprinting of Large Language Models

Feb 11, 2025
Via arXiv

Economics of Sourcing Human Data

Feb 11, 2025
Via arXiv