Shijie Ma

Aligned Better, Listen Better for Audio-Visual Large Language Models

Apr 02, 2025
Via arXiv

GenHancer: Imperfect Generative Models are Secretly Strong Vision-Centric Enhancers

Mar 25, 2025
Via arXiv

Happy: A Debiased Learning Framework for Continual Generalized Category Discovery

Oct 10, 2024
Via arXiv

WPS-SAM: Towards Weakly-Supervised Part Segmentation with Foundation Models

Jul 14, 2024
Via arXiv

MSPE: Multi-Scale Patch Embedding Prompts Vision Transformers to Any Resolution

May 28, 2024
Via arXiv

Open-world Machine Learning: A Review and New Outlooks

Mar 15, 2024
Via arXiv

Active Generalized Category Discovery

Mar 07, 2024
Via arXiv

Dual Mean-Teacher: An Unbiased Semi-Supervised Framework for Audio-Visual Source Localization

Mar 05, 2024
Via arXiv

Cross Pseudo-Labeling for Semi-Supervised Audio-Visual Source Localization

Mar 05, 2024
Via arXiv

Optimal Noise pursuit for Augmenting Text-to-Video Generation

Nov 02, 2023
Via arXiv