
Ming Yang

Animate-X: Universal Character Image Animation with Enhanced Motion Representation

Oct 14, 2024
Via arXiv

End-to-end Driving in High-Interaction Traffic Scenarios with Reinforcement Learning

Oct 03, 2024
Via arXiv

Self-Supervised Graph Embedding Clustering

Sep 24, 2024
Via arXiv

StyleTokenizer: Defining Image Style by a Single Instance for Controlling Diffusion Models

Sep 04, 2024
Via arXiv

Social Debiasing for Fair Multi-modal LLMs

Aug 13, 2024
Via arXiv

Egocentric Vision Language Planning

Aug 11, 2024
Via arXiv

ParkingE2E: Camera-based End-to-end Parking Network, from Images to Planning

Aug 04, 2024
Via arXiv

POA: Pre-training Once for Models of All Sizes

Aug 02, 2024
Via arXiv

Accelerating Pre-training of Multimodal LLMs via Chain-of-Sight

Jul 22, 2024
Via arXiv

MapLocNet: Coarse-to-Fine Feature Registration for Visual Re-Localization in Navigation Maps

Jul 11, 2024
Via arXiv