Mengmeng Wang

LangSuitE: Planning, Controlling and Interacting with Large Language Models in Embodied Text Environments

Jun 24, 2024

SDSTrack: Self-Distillation Symmetric Adapter Learning for Multi-Modal Visual Object Tracking

Mar 28, 2024

DreamSalon: A Staged Diffusion Framework for Preserving Identity-Context in Editable Face Generation

Mar 28, 2024

M2-CLIP: A Multimodal, Multi-task Adapting Framework for Video Action Recognition

Jan 22, 2024

Camera-based 3D Semantic Scene Completion with Sparse Guidance Network

Dec 10, 2023

Generating Action-conditioned Prompts for Open-vocabulary Video Action Recognition

Dec 04, 2023

LooGLE: Can Long-Context Language Models Understand Long Contexts?

Nov 08, 2023

Synchronize Feature Extracting and Matching: A Single Branch Framework for 3D Object Tracking

Aug 24, 2023

Decentralized Riemannian Conjugate Gradient Method on the Stiefel Manifold

Aug 21, 2023

SSMG: Spatial-Semantic Map Guided Diffusion Model for Free-form Layout-to-Image Generation

Aug 20, 2023