Xiangyu Zhang

Step-Video-TI2V Technical Report: A State-of-the-Art Text-Driven Image-to-Video Generation Model

Mar 14, 2025
Via arXiv

Why Pre-trained Models Fail: Feature Entanglement in Multi-modal Depression Detection

Mar 09, 2025
Via arXiv

Predictable Scale: Part I -- Optimal Hyperparameter Scaling Law in Large Language Model Pretraining

Mar 06, 2025
Via arXiv

Foot-In-The-Door: A Multi-turn Jailbreak for LLMs

Feb 28, 2025
Via arXiv

Step-Audio: Unified Understanding and Generation in Intelligent Speech Interaction

Feb 18, 2025
Via arXiv

Unhackable Temporal Rewarding for Scalable Video MLLMs

Feb 17, 2025
Via arXiv

PerPO: Perceptual Preference Optimization via Discriminative Rewarding

Feb 05, 2025
Via arXiv

Predicting 3D representations for Dynamic Scenes

Jan 28, 2025
Via arXiv

CENSOR: Defense Against Gradient Inversion via Orthogonal Subspace Bayesian Sampling

Jan 27, 2025
Via arXiv

Taming Teacher Forcing for Masked Autoregressive Video Generation

Jan 21, 2025
Via arXiv