Ming-Hsuan Yang

Spatial-Temporal Multi-level Association for Video Object Segmentation

Apr 09, 2024
Via arXiv

Mansformer: Efficient Transformer of Mixed Attention for Image Deblurring and Beyond

Apr 09, 2024
Via arXiv

HENet: Hybrid Encoding for End-to-end Multi-task 3D Perception from Multi-view Cameras

Apr 03, 2024
Via arXiv

Dynamic Pre-training: Towards Efficient and Scalable All-in-One Image Restoration

Apr 02, 2024
Via arXiv

RTracker: Recoverable Tracking via PN Tree Structured Memory

Mar 28, 2024
Via arXiv

Efficient Video Object Segmentation via Modulated Cross-Attention Memory

Mar 26, 2024
Via arXiv

Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers

Feb 29, 2024
Via arXiv

Interactive Multi-Head Self-Attention with Linear Complexity

Feb 27, 2024
Via arXiv

Scene Prior Filtering for Depth Map Super-Resolution

Feb 23, 2024
Via arXiv

StyleDubber: Towards Multi-Scale Style Learning for Movie Dubbing

Feb 21, 2024
Via arXiv