Yu-Gang Jiang

Thinking with Deltas: Incentivizing Reinforcement Learning via Differential Visual Reasoning Policy

Jan 11, 2026

UniRec-0.1B: Unified Text and Formula Recognition with 0.1B Parameters

Dec 24, 2025

Memory in the Age of AI Agents

Dec 15, 2025

MeViS: A Multi-Modal Dataset for Referring Motion Expression Video Segmentation

Dec 11, 2025

AttackVLA: Benchmarking Adversarial and Backdoor Attacks on Vision-Language-Action Models

Nov 15, 2025

LRANet++: Low-Rank Approximation Network for Accurate and Efficient Text Spotting

Nov 08, 2025

WithAnyone: Towards Controllable and ID Consistent Image Generation

Oct 16, 2025

Ask-to-Clarify: Resolving Instruction Ambiguity through Multi-turn Dialogue

Sep 18, 2025

AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning

Sep 10, 2025

Repeating Words for Video-Language Retrieval with Coarse-to-Fine Objectives

Aug 20, 2025