Gangshan Wu

NTIRE 2025 Challenge on Image Super-Resolution ($\times$4): Methods and Results

Apr 20, 2025
Via arXiv

AnomalyR1: A GRPO-based End-to-end MLLM for Industrial Anomaly Detection

Apr 16, 2025
Via arXiv

KAN-SAM: Kolmogorov-Arnold Network Guided Segment Anything Model for RGB-T Salient Object Detection

Apr 08, 2025
Via arXiv

RASA: Replace Anyone, Say Anything -- A Training-Free Framework for Audio-Driven and Universal Portrait Video Editing

Mar 14, 2025
Via arXiv

CATANet: Efficient Content-Aware Token Aggregation for Lightweight Image Super-Resolution

Mar 10, 2025
Via arXiv

AutoLUT: LUT-Based Image Super-Resolution with Automatic Sampling and Adaptive Residual Learning

Mar 03, 2025
Via arXiv

Tra-MoE: Learning Trajectory Prediction Model from Multiple Domains for Adaptive Policy Conditioning

Nov 21, 2024
Via arXiv

Efficient Test-Time Prompt Tuning for Vision-Language Models

Aug 11, 2024
Via arXiv

RAVSS: Robust Audio-Visual Speech Separation in Multi-Speaker Scenarios with Missing Visual Cues

Jul 27, 2024
Via arXiv

GTPT: Group-based Token Pruning Transformer for Efficient Human Pose Estimation

Jul 16, 2024
Via arXiv