Wei Zhao

Not All Frames Deserve Full Computation: Accelerating Autoregressive Video Generation via Selective Computation and Predictive Extrapolation

Apr 03, 2026
Via arXiv

MMaDA-VLA: Large Diffusion Vision-Language-Action Model with Unified Multi-Modal Instruction and Generation

Mar 27, 2026
Via arXiv

What Really Controls Temporal Reasoning in Large Language Models: Tokenisation or Representation of Time?

Mar 19, 2026
Via arXiv

AdaMuS: Adaptive Multi-view Sparsity Learning for Dimensionally Unbalanced Data

Mar 18, 2026
Via arXiv

Differentiable Geometric Indexing for End-to-End Generative Retrieval

Mar 11, 2026
Via arXiv

Rethinking the Practicality of Vision-language-action Model: A Comprehensive Benchmark and An Improved Baseline

Feb 26, 2026
Via arXiv

How Do Lexical Senses Correspond Between Spoken German and German Sign Language?

Feb 14, 2026
Via arXiv

BSoNet: Deep Learning Solution for Optimizing Image Quality of Portable Backscatter Imaging Systems

Feb 12, 2026
Via arXiv

NECromancer: Breathing Life into Skeletons via BVH Animation

Feb 06, 2026
Via arXiv

DiMo: Discrete Diffusion Modeling for Motion Generation and Understanding

Feb 04, 2026
Via arXiv