DatologyAI

Brevity is the Soul of Inference Efficiency: Inducing Concision in VLMs via Data Curation

Jun 24, 2026
Via arXiv

ÜberWeb: Insights from Multilingual Curation for a 20-Trillion-Token Dataset

Feb 16, 2026
Via arXiv

Luxical: High-Speed Lexical-Dense Text Embeddings

Dec 11, 2025
Via arXiv