Jingdong Wang

OpenHumanVid: A Large-Scale High-Quality Dataset for Enhancing Human-Centric Video Generation

Dec 03, 2024

Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Diffusion Transformer Networks

Dec 01, 2024

TopoSD: Topology-Enhanced Lane Segment Perception with SDMap Prior

Nov 22, 2024

Continual SFT Matches Multimodal RLHF with Negative Supervision

Nov 22, 2024

DGTR: Distributed Gaussian Turbo-Reconstruction for Sparse-View Vast Scenes

Nov 20, 2024

MoLE: Enhancing Human-centric Text-to-image Diffusion via Mixture of Low-rank Experts

Oct 30, 2024

Schedule Your Edit: A Simple yet Effective Diffusion Noise Schedule for Image Editing

Oct 24, 2024

Improving Multi-modal Large Language Model through Boosting Vision Capabilities

Oct 17, 2024

TALK-Act: Enhance Textural-Awareness for 2D Speaking Avatar Reenactment with Diffusion Model

Oct 14, 2024

MGMapNet: Multi-Granularity Representation Learning for End-to-End Vectorized HD Map Construction

Oct 10, 2024