Jianjian Sun

Step-Audio-AQAA: a Fully End-to-End Expressive Large Audio Language Model

Jun 10, 2025

Perception-R1: Pioneering Perception Policy with Reinforcement Learning

Apr 10, 2025

Perception in Reflection

Apr 09, 2025

Step-Audio: Unified Understanding and Generation in Intelligent Speech Interaction

Feb 18, 2025

Unhackable Temporal Rewarding for Scalable Video MLLMs

Feb 17, 2025

PerPO: Perceptual Preference Optimization via Discriminative Rewarding

Feb 05, 2025

Slow Perception: Let's Perceive Geometric Figures Step-by-step

Dec 30, 2024

General OCR Theory: Towards OCR-2.0 via a Unified End-to-end Model

Sep 03, 2024

Focus Anywhere for Fine-grained Multi-page Document Understanding

May 23, 2024

OneChart: Purify the Chart Structural Extraction via One Auxiliary Token

Apr 15, 2024