Lijun Yu

Towards Multi-Task Multi-Modal Models: A Video Generative Perspective

May 26, 2024

A Versatile Diffusion Transformer with Mixture of Noise Levels for Audiovisual Generation

May 22, 2024

Efficient LLM Jailbreak via Adaptive Dense-to-sparse Constrained Optimization

May 15, 2024

Improving and Unifying Discrete&Continuous-time Discrete Denoising Diffusion

Feb 06, 2024

VideoPoet: A Large Language Model for Zero-Shot Video Generation

Dec 21, 2023

Photorealistic Video Generation with Diffusion Models

Dec 11, 2023

Language Model Beats Diffusion -- Tokenizer is Key to Visual Generation

Oct 09, 2023

SPAE: Semantic Pyramid AutoEncoder for Multimodal Generation with Frozen LLMs

Jul 03, 2023

Document Entity Retrieval with Massive and Noisy Pre-training

Jun 15, 2023

MAGVIT: Masked Generative Video Transformer

Dec 10, 2022