Tianshuo Yang

AnyRecon: Arbitrary-View 3D Reconstruction with Video Diffusion Model

Apr 21, 2026

HiVLA: A Visual-Grounded-Centric Hierarchical Embodied Manipulation System

Apr 15, 2026

Rein3D: Reinforced 3D Indoor Scene Generation with Panoramic Video Diffusion Models

Apr 14, 2026

Adapting Feature Attenuation to NLP

Jan 02, 2026

Discrete Diffusion VLA: Bringing Discrete Diffusion to Action Decoding in Vision-Language-Action Policies

Aug 27, 2025

AutoBio: A Simulation and Benchmark for Robotic Automation in Digital Biology Laboratory

May 20, 2025

LemmaHead: RAG Assisted Proof Generation Using Large Language Models

Jan 27, 2025

Diffree: Text-Guided Shape Free Object Inpainting with Diffusion Model

Jul 24, 2024

PhyBench: A Physical Commonsense Benchmark for Evaluating Text-to-Image Models

Jun 17, 2024

Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers

May 09, 2024