Kimi


When Perplexity Lies: Generation-Focused Distillation of Hybrid Sequence Models

Mar 27, 2026
Via arXiv

When AI Shows Its Work, Is It Actually Working? Step-Level Evaluation Reveals Frontier Language Models Frequently Bypass Their Own Reasoning

Mar 24, 2026
Via arXiv

Multi-Method Validation of Large Language Model Medical Translation Across High- and Low-Resource Languages

Mar 23, 2026
Via arXiv

Evaluating 5W3H Structured Prompting for Intent Alignment in Human-AI Interaction

Mar 19, 2026
Via arXiv

VeriGrey: Greybox Agent Validation

Mar 18, 2026
Via arXiv

Neuron-Level Emotion Control in Speech-Generative Large Audio-Language Models

Mar 18, 2026
Via arXiv

MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild

Mar 17, 2026
Via arXiv

Attention Residuals

Mar 16, 2026
Via arXiv

Beyond Task Completion: Revealing Corrupt Success in LLM Agents through Procedure-Aware Evaluation

Mar 03, 2026
Via arXiv

Think, But Don't Overthink: Reproducing Recursive Language Models

Mar 03, 2026
Via arXiv