Shaofeng Zhang

VideoREPA: Learning Physics for Video Generation through Relational Alignment with Foundation Models

May 29, 2025
Via arXiv

Hunyuan-TurboS: Advancing Large Language Models through Mamba-Transformer Synergy and Adaptive Chain-of-Thought

May 21, 2025
Via arXiv

Point2RBox-v2: Rethinking Point-supervised Oriented Object Detection with Spatial Layout Among Instances

Feb 07, 2025
Via arXiv

Motion Control for Enhanced Complex Action Video Generation

Nov 13, 2024
Via arXiv

Exploiting Unlabeled Data with Multiple Expert Teachers for Open Vocabulary Aerial Object Detection and Its Orientation Adaptation

Nov 04, 2024
Via arXiv

PCP-MAE: Learning to Predict Centers for Point Masked Autoencoders

Aug 16, 2024
Via arXiv

ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation

Jun 26, 2024
Via arXiv

Continuous-Multiple Image Outpainting in One-Step via Positional Query and A Diffusion-based Approach

Jan 28, 2024
Via arXiv

GMTR: Graph Matching Transformers

Nov 14, 2023
Via arXiv

HAP: Structure-Aware Masked Image Modeling for Human-Centric Perception

Oct 31, 2023
Via arXiv