Bin Li

Member, IEEE

MSA-UNet3+: Multi-Scale Attention UNet3+ with New Supervised Prototypical Contrastive Loss for Coronary DSA Image Segmentation

Apr 07, 2025

Instruction-Aligned Visual Attention for Mitigating Hallucinations in Large Vision-Language Models

Mar 24, 2025

Multi-modal Multi-platform Person Re-Identification: Benchmark and Method

Mar 21, 2025

UMIT: Unifying Medical Imaging Tasks via Vision-Language Models

Mar 20, 2025

DRoPE: Directional Rotary Position Embedding for Efficient Agent Interaction Modeling

Mar 19, 2025

Robot Skin with Touch and Bend Sensing using Electrical Impedance Tomography

Mar 17, 2025

Proxy-Tuning: Tailoring Multimodal Autoregressive Models for Subject-Driven Image Generation

Mar 13, 2025

VLForgery Face Triad: Detection, Localization and Attribution via Multimodal Large Language Models

Mar 08, 2025

EVE: Towards End-to-End Video Subtitle Extraction with Vision-Language Models

Mar 06, 2025

Small but Mighty: Enhancing Time Series Forecasting with Lightweight LLMs

Mar 05, 2025