Heng Wang

VMAS: Video-to-Music Generation via Semantic Alignment in Web Music Videos

Sep 11, 2024
Via arXiv

Future Does Matter: Boosting 3D Object Detection with Temporal Motion Estimation in Point Cloud Sequences

Sep 06, 2024
Via arXiv

Multimodal Causal Reasoning Benchmark: Challenging Vision Large Language Models to Infer Causal Links Between Siamese Images

Aug 15, 2024
Via arXiv

Can LLM Graph Reasoning Generalize beyond Pattern Memorization?

Jun 23, 2024
Via arXiv

Autoregressive Pretraining with Mamba in Vision

Jun 11, 2024
Via arXiv

Dance Any Beat: Blending Beats with Visuals in Dance Video Generation

May 15, 2024
Via arXiv

Boosting 3D Neuron Segmentation with 2D Vision Transformer Pre-trained on Natural Images

May 04, 2024
Via arXiv

HQ-Edit: A High-Quality Dataset for Instruction-based Image Editing

Apr 15, 2024
Via arXiv

Digital Twin Channel for 6G: Concepts, Architectures and Potential Applications

Mar 31, 2024
Via arXiv

MMoE: Robust Spoiler Detection with Multi-modal Information and Domain-aware Mixture-of-Experts

Mar 14, 2024
Via arXiv