Jingkuan Song

Learning Generalizable and Efficient Image Watermarking via Hierarchical Two-Stage Optimization

Aug 12, 2025
Via arXiv

Dynamic Pattern Alignment Learning for Pretraining Lightweight Human-Centric Vision Models

Aug 10, 2025
Via arXiv

Shortcut Learning in Generalist Robot Policies: The Role of Dataset Diversity and Fragmentation

Aug 08, 2025
Via arXiv

SafePTR: Token-Level Jailbreak Defense in Multimodal LLMs via Prune-then-Restore Mechanism

Jul 02, 2025
Via arXiv

OmniCharacter: Towards Immersive Role-Playing Agents with Seamless Speech-Language Personality Interaction

May 26, 2025
Via arXiv

Unlocking Smarter Device Control: Foresighted Planning with a World Model-Driven Code Execution Approach

May 22, 2025
Via arXiv

InSpire: Vision-Language-Action Models with Intrinsic Spatial Reasoning

May 20, 2025
Via arXiv

Policy Contrastive Decoding for Robotic Foundation Models

May 19, 2025
Via arXiv

Towards Generalized and Training-Free Text-Guided Semantic Manipulation

Apr 24, 2025
Via arXiv

Scale-Aware Pre-Training for Human-Centric Visual Perception: Enabling Lightweight and Generalizable Models

Mar 11, 2025
Via arXiv