Anastasiia Kucherenko

Getting Your Indices in a Row: Full-Text Search for LLM Training Data for Real World

Oct 10, 2025
Via arXiv

Apertus: Democratizing Open and Compliant LLMs for Global Language Environments

Sep 17, 2025
Via arXiv

Going over Fine Web with a Fine-Tooth Comb: Technical Report of Indexing Fine Web for Problematic Content Search and Retrieval

Aug 29, 2025
Via arXiv

Low-Perplexity LLM-Generated Sequences and Where To Find Them

Jul 02, 2025
Via arXiv