Jiangning Zhang

College of Control Science and Engineering, Zhejiang University, Hangzhou, China

Align and Surpass Human Camouflaged Perception: Visual Refocus Reinforcement Fine-Tuning

May 26, 2025, via arXiv

So-Fake: Benchmarking and Explaining Social Media Image Forgery Detection

May 24, 2025, via arXiv

Swin DiT: Diffusion Transformer using Pseudo Shifted Windows

May 19, 2025, via arXiv

Real-IAD D3: A Real-World 2D/Pseudo-3D/3D Dataset for Industrial Anomaly Detection

Apr 19, 2025, via arXiv

Decouple and Track: Benchmarking and Improving Video Diffusion Transformers for Motion Transfer

Mar 21, 2025, via arXiv

Image Inversion: A Survey from GANs to Diffusion and Beyond

Feb 17, 2025, via arXiv

RWKV-UNet: Improving UNet with Long-Range Cooperation for Effective Medical Image Segmentation

Jan 14, 2025, via arXiv

Are They the Same? Exploring Visual Correspondence Shortcomings of Multimodal LLMs

Jan 08, 2025, via arXiv

SVFR: A Unified Framework for Generalized Video Face Restoration

Jan 03, 2025, via arXiv

Improving Autoregressive Visual Generation with Cluster-Oriented Token Prediction

Jan 01, 2025, via arXiv