Wei Zhang

Alibaba Group

The 1st Solution for 4th PVUW MeViS Challenge: Unleashing the Potential of Large Multimodal Models for Referring Video Segmentation

Apr 07, 2025
Via arXiv

Vision-Language Model Predictive Control for Manipulation Planning and Trajectory Generation

Apr 07, 2025
Via arXiv

GROVE: A Generalized Reward for Learning Open-Vocabulary Physical Skill

Apr 05, 2025
Via arXiv

Gaussian Process Tilted Nonparametric Density Estimation using Fisher Divergence Score Matching

Apr 04, 2025
Via arXiv

CSF: Fixed-outline Floorplanning Based on the Conjugate Subgradient Algorithm Assisted by Q-Learning

Apr 04, 2025
Via arXiv

Dexterous Manipulation through Imitation Learning: A Survey

Apr 04, 2025
Via arXiv

ILLUME+: Illuminating Unified MLLM with Dual Visual Tokenization and Diffusion Refinement

Apr 03, 2025
Via arXiv

Reconfigurable Codebook-Based Beamforming for RDARS-Aided mmWave MU-MIMO Systems

Apr 02, 2025
Via arXiv

Hierarchical Attention Networks for Lossless Point Cloud Attribute Compression

Apr 01, 2025
Via arXiv

MAER-Nav: Bidirectional Motion Learning Through Mirror-Augmented Experience Replay for Robot Navigation

Mar 31, 2025
Via arXiv