Fan Zhang

University of Bristol

Medical Image Registration Meets Vision Foundation Model: Prototype Learning and Contour Awareness

Feb 17, 2025
Via arXiv

FCVSR: A Frequency-aware Method for Compressed Video Super-Resolution

Feb 10, 2025
Via arXiv

Learning Street View Representations with Spatiotemporal Contrast

Feb 07, 2025
Via arXiv

EndoChat: Grounded Multimodal Large Language Model for Endoscopic Surgery

Jan 20, 2025
Via arXiv

UltraFusion: Ultra High Dynamic Imaging using Exposure Fusion

Jan 20, 2025
Via arXiv

X-LeBench: A Benchmark for Extremely Long Egocentric Video Understanding

Jan 12, 2025
Via arXiv

GeoPix: Multi-Modal Large Language Model for Pixel-level Image Understanding in Remote Sensing

Jan 12, 2025
Via arXiv

FlipedRAG: Black-Box Opinion Manipulation Attacks to Retrieval-Augmented Generation of Large Language Models

Jan 06, 2025
Via arXiv

Artificial Intelligence in Creative Industries: Advances Prior to 2025

Jan 06, 2025
Via arXiv

SGTC: Semantic-Guided Triplet Co-training for Sparsely Annotated Semi-Supervised Medical Image Segmentation

Dec 20, 2024
Via arXiv