Shiwei Zhang

VideoLCM: Video Latent Consistency Model

Dec 14, 2023

Hierarchical Spatio-temporal Decoupling for Text-to-Video Generation

Dec 07, 2023

DreamVideo: Composing Your Dream Videos with Customized Subject and Motion

Dec 07, 2023

Check, Locate, Rectify: A Training-Free Layout Calibration System for Text-to-Image Generation

Nov 30, 2023

I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models

Nov 07, 2023

Utilising a Large Language Model to Annotate Subject Metadata: A Case Study in an Australian National Research Data Catalogue

Oct 17, 2023

Few-shot Action Recognition with Captioning Foundation Models

Oct 16, 2023

Unleashing Potential of Evidence in Knowledge-Intensive Dialogue Generation

Sep 15, 2023

Disentangling Spatial and Temporal Learning for Efficient Image-to-Video Transfer Learning

Sep 14, 2023

Towards Real-World Visual Tracking with Temporal Contexts

Aug 20, 2023