Deli Zhao

VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs

Jun 11, 2024
Via arXiv

Auto Arena of LLMs: Automating LLM Evaluations with Agent Peer-battles and Committee Discussions

May 30, 2024
Via arXiv

Space Group Constrained Crystal Generation

Feb 06, 2024
Via arXiv

Latent Space Editing in Transformer-Based Flow Matching

Dec 17, 2023
Via arXiv

I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models

Nov 07, 2023
Via arXiv

Res-Tuning: A Flexible and Efficient Tuning Paradigm via Unbinding Tuner from Backbone

Oct 30, 2023
Via arXiv

Few-shot Action Recognition with Captioning Foundation Models

Oct 16, 2023
Via arXiv

Towards More Accurate Diffusion Model Acceleration with A Timestep Aligner

Oct 14, 2023
Via arXiv

Efficient-VQGAN: Towards High-Resolution Image Generation with Efficient Vision Transformers

Oct 09, 2023
Via arXiv

In-Domain GAN Inversion for Faithful Reconstruction and Editability

Sep 25, 2023
Via arXiv