Anil Kag

Omni-Attribute: Open-vocabulary Attribute Encoder for Visual Concept Personalization

Dec 11, 2025
Via arXiv

Taming Diffusion Transformer for Real-Time Mobile Video Generation

Jul 17, 2025
Via arXiv

H3AE: High Compression, High Speed, and High Quality AutoEncoder for Video Diffusion Models

Apr 14, 2025
Via arXiv

Towards Physical Understanding in Video Generation: A 3D Point Regularization Approach

Feb 05, 2025
Via arXiv

SnapGen-V: Generating a Five-Second Video within Five Seconds on a Mobile Device

Dec 13, 2024
Via arXiv

SnapGen: Taming High-Resolution Text-to-Image Models for Mobile Devices with Efficient Architectures and Training

Dec 12, 2024
Via arXiv

AsCAN: Asymmetric Convolution-Attention Networks for Efficient Recognition and Generation

Nov 07, 2024
Via arXiv

Scalable Ranked Preference Optimization for Text-to-Image Generation

Oct 23, 2024
Via arXiv

Lightweight Predictive 3D Gaussian Splats

Jun 27, 2024
Via arXiv

SF-V: Single Forward Video Generation Model

Jun 06, 2024
Via arXiv