Ye Wang

Perry

Composable Visual Tokenizers with Generator-Free Diagnostics of Learnability

Feb 03, 2026

Comparative Study of Large Language Models on Chinese Film Script Continuation: An Empirical Analysis Based on GPT-5.2 and Qwen-Max

Jan 21, 2026

Unlocking Large Audio-Language Models for Interactive Language Learning

Jan 21, 2026

Being-H0.5: Scaling Human-Centric Robot Learning for Cross-Embodiment Generalization

Jan 19, 2026

Role-Playing Agents Driven by Large Language Models: Current Status, Challenges, and Future Trends

Jan 15, 2026

Spatial-Aware VLA Pretraining through Visual-Physical Alignment from Human Videos

Dec 15, 2025

FreeControl: Efficient, Training-Free Structural Control via One-Step Attention Extraction

Nov 07, 2025

DemoGrasp: Universal Dexterous Grasping from a Single Demonstration

Sep 26, 2025

Being-M0.5: A Real-Time Controllable Vision-Language-Motion Model

Aug 11, 2025

DeepPHY: Benchmarking Agentic VLMs on Physical Reasoning

Aug 07, 2025