Picture for Y. Charles

Y. Charles

Kimi K2.5: Visual Agentic Intelligence

Add code
Feb 02, 2026
Viaarxiv icon

WorldVQA: Measuring Atomic World Knowledge in Multimodal Large Language Models

Add code
Jan 28, 2026
Viaarxiv icon

Towards Pixel-Level VLM Perception via Simple Points Prediction

Add code
Jan 27, 2026
Viaarxiv icon

OpenCUA: Open Foundations for Computer-Use Agents

Add code
Aug 12, 2025
Viaarxiv icon

VideoReasonBench: Can MLLMs Perform Vision-Centric Complex Video Reasoning?

Add code
May 29, 2025
Viaarxiv icon

Kimi-Audio Technical Report

Add code
Apr 25, 2025
Figure 1 for Kimi-Audio Technical Report
Figure 2 for Kimi-Audio Technical Report
Figure 3 for Kimi-Audio Technical Report
Figure 4 for Kimi-Audio Technical Report
Viaarxiv icon

Kimi-VL Technical Report

Add code
Apr 10, 2025
Figure 1 for Kimi-VL Technical Report
Figure 2 for Kimi-VL Technical Report
Figure 3 for Kimi-VL Technical Report
Figure 4 for Kimi-VL Technical Report
Viaarxiv icon