Wilson Yan

World Model on Million-Length Video And Language With RingAttention

Feb 13, 2024

Motion-Conditioned Image Animation for Video Editing

Nov 30, 2023

ALP: Action-Aware Embodied Learning for Perception

Jun 16, 2023

Video Prediction Models as Rewards for Reinforcement Learning

May 23, 2023

Language Quantized AutoEncoders: Towards Unsupervised Text-Image Alignment

Feb 03, 2023

Temporally Consistent Video Transformer for Long-Term Video Prediction

Oct 05, 2022

Patch-based Object-centric Transformers for Efficient Video Generation

Jun 19, 2022

VideoGPT: Video Generation using VQ-VAE and Transformers

Apr 20, 2021

Learning Predictive Representations for Deformable Objects Using Contrastive Estimation

Mar 11, 2020

Natural Image Manipulation for Autoregressive Models Using Fisher Scores

Nov 25, 2019