Jiaxi Gu

A third-order finite difference weighted essentially non-oscillatory scheme with shallow neural network

Jul 10, 2024
Via arXiv

MimicMotion: High-Quality Human Motion Video Generation with Confidence-aware Pose Guidance

Jun 28, 2024
Via arXiv

AutoTVG: A New Vision-language Pre-training Paradigm for Temporal Video Grounding

Jun 11, 2024
Via arXiv

BIVDiff: A Training-Free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models

Dec 05, 2023
Via arXiv

DreamVideo: High-Fidelity Image-to-Video Generation with Image Retention and Text Guidance

Dec 05, 2023
Via arXiv

VideoAssembler: Identity-Consistent Video Generation with Reference Entities using Diffusion Model

Dec 01, 2023
Via arXiv

Fuse Your Latents: Video Editing with Multi-source Latent Diffusion Models

Oct 25, 2023
Via arXiv

Reuse and Diffuse: Iterative Denoising for Text-to-Video Generation

Sep 07, 2023
Via arXiv

Towards Universal Vision-language Omni-supervised Segmentation

Mar 12, 2023
Via arXiv

Wukong: 100 Million Large-scale Chinese Cross-modal Pre-training Dataset and A Foundation Framework

Mar 10, 2022
Via arXiv