Yuhta Takida

Spatio-Temporal Audio Language Modeling for Dynamic Sound Sources

Jun 12, 2026
Via arXiv

Efficient Reinforcement for Visual-Textual Thinking with Discrete Diffusion Model

Jun 11, 2026
Via arXiv

Understanding and Accelerating the Training of Masked Diffusion Language Models

May 13, 2026
Via arXiv

GUDA: Counterfactual Group-wise Training Data Attribution for Diffusion Models via Unlearning

Jan 30, 2026
Via arXiv

Improved Object-Centric Diffusion Learning with Registers and Contrastive Alignment

Jan 03, 2026
Via arXiv

PAVAS: Physics-Aware Video-to-Audio Synthesis

Dec 09, 2025
Via arXiv

SONA: Learning Conditional, Unconditional, and Mismatching-Aware Discriminator

Oct 06, 2025
Via arXiv

Demystifying MaskGIT Sampler and Beyond: Adaptive Order Selection in Masked Diffusion

Oct 06, 2025
Via arXiv

Concept-TRAK: Understanding how diffusion models learn concepts through concept-level attribution

Jul 09, 2025
Via arXiv

Denoising Multi-Beta VAE: Representation Learning for Disentanglement and Generation

Jul 09, 2025
Via arXiv