Zhuotao Tian

Can Large Language Models Resolve Semantic Discrepancy in Self-Destructive Subcultures? Evidence from Jirai Kei

Jan 08, 2026
Via arXiv

Plug-and-Play Fidelity Optimization for Diffusion Transformer Acceleration via Cumulative Error Minimization

Dec 29, 2025
Via arXiv

SemanticVLA: Semantic-Aligned Sparsification and Enhancement for Efficient Robotic Manipulation

Nov 13, 2025
Via arXiv

Mitigating Object Hallucinations via Sentence-Level Early Intervention

Jul 16, 2025
Via arXiv

Edit360: 2D Image Edits to 3D Assets from Any Angle

Jun 12, 2025
Via arXiv

Omni-DPO: A Dual-Perspective Paradigm for Dynamic Preference Learning of LLMs

Jun 11, 2025
Via arXiv

SafeMVDrive: Multi-view Safety-Critical Driving Video Synthesis in the Real World Domain

May 23, 2025
Via arXiv

Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models

May 08, 2025
Via arXiv

DeCLIP: Decoupled Learning for Open-Vocabulary Dense Perception

May 07, 2025
Via arXiv

From Mapping to Composing: A Two-Stage Framework for Zero-shot Composed Image Retrieval

Apr 25, 2025
Via arXiv