Pratyush Maini

Natively Unlearnable Large Language Models

Jun 11, 2026

The Finetuner's Fallacy: When to Pretrain with Your Finetuning Data

Mar 17, 2026

ÜberWeb: Insights from Multilingual Curation for a 20-Trillion-Token Dataset

Feb 16, 2026

When Should We Introduce Safety Interventions During Pretraining?

Jan 11, 2026

DatBench: Discriminative, Faithful, and Efficient VLM Evaluations

Jan 05, 2026

Luxical: High-Speed Lexical-Dense Text Embeddings

Dec 11, 2025

Unlocking Post-hoc Dataset Inference with Synthetic Data

Jun 18, 2025

OpenUnlearning: Accelerating LLM Unlearning via Unified Benchmarking of Methods and Metrics

Jun 14, 2025

Safety Pretraining: Toward the Next Generation of Safe AI

Apr 23, 2025

STAMP Your Content: Proving Dataset Membership via Watermarked Rephrasings

Apr 18, 2025