Song Guo

MLLMs Know When Before Speaking: Revealing and Recovering Temporal Grounding via Attention Cues

May 21, 2026
Via arXiv

Learning Spatiotemporal Sensitivity in Video LLMs via Counterfactual Reinforcement Learning

May 21, 2026
Via arXiv

What You Think is What You See: Driving Exploration in VLM Agents via Visual-Linguistic Curiosity

May 05, 2026
Via arXiv

Generative 3D Gaussian Splatting for Arbitrary-Resolution Atmospheric Downscaling and Forecasting

Apr 09, 2026
Via arXiv

TTVS: Boosting Self-Exploring Reinforcement Learning via Test-time Variational Synthesis

Apr 09, 2026
Via arXiv

VisionCreator-R1: A Reflection-Enhanced Native Visual-Generation Agentic Model

Mar 09, 2026
Via arXiv

VisionCreator: A Native Visual-Generation Agentic Model with Understanding, Thinking, Planning and Creation

Mar 03, 2026
Via arXiv

DualSentinel: A Lightweight Framework for Detecting Targeted Attacks in Black-box LLM via Dual Entropy Lull Pattern

Mar 02, 2026
Via arXiv

UNICBench: UNIfied Counting Benchmark for MLLM

Feb 28, 2026
Via arXiv

Mesh-Pro: Asynchronous Advantage-guided Ranking Preference Optimization for Artist-style Quadrilateral Mesh Generation

Feb 28, 2026
Via arXiv