Zhao Song

Towards High-Order Mean Flow Generative Models: Feasibility, Expressivity, and Provably Efficient Criteria

Aug 09, 2025, via arXiv

T2VWorldBench: A Benchmark for Evaluating World Knowledge in Text-to-Video Generation

Jul 24, 2025, via arXiv

Thinking Isn't an Illusion: Overcoming the Limitations of Reasoning Models via Tool Augmentations

Jul 23, 2025, via arXiv

Minimalist Softmax Attention Provably Learns Constrained Boolean Functions

May 26, 2025, via arXiv

Only Large Weights (And Not Skip Connections) Can Prevent the Perils of Rank Collapse

May 22, 2025, via arXiv

Fast RoPE Attention: Combining the Polynomial Method and Fast Fourier Transform

May 17, 2025, via arXiv

T2VTextBench: A Human Evaluation Benchmark for Textual Control in Video Generation Models

May 08, 2025, via arXiv

T2VPhysBench: A First-Principles Benchmark for Physical Consistency in Text-to-Video Generation

May 01, 2025, via arXiv

Attention Mechanism, Max-Affine Partition, and Universal Approximation

Apr 28, 2025, via arXiv

Discriminator-Free Direct Preference Optimization for Video Diffusion

Apr 11, 2025, via arXiv