Bin Li

Member, IEEE

Overview of the NLPCC 2025 Shared Task 4: Multi-modal, Multilingual, and Multi-hop Medical Instructional Video Question Answering Challenge

May 11, 2025

Emotion-Qwen: Training Hybrid Experts for Unified Emotion and General Vision-Language Understanding

May 10, 2025

PICD: Versatile Perceptual Image Compression with Diffusion Rendering

May 09, 2025

ReGraP-LLaVA: Reasoning enabled Graph-based Personalized Large Language and Vision Assistant

May 06, 2025

XeMap: Contextual Referring in Large-Scale Remote Sensing Environments

Apr 30, 2025

Ask2Loc: Learning to Locate Instructional Visual Answers by Asking Questions

Apr 22, 2025

Hierarchical Modeling for Medical Visual Question Answering with Cross-Attention Fusion

Apr 10, 2025

Event Signal Filtering via Probability Flux Estimation

Apr 10, 2025

EffOWT: Transfer Visual Language Models to Open-World Tracking Efficiently and Effectively

Apr 09, 2025

MSA-UNet3+: Multi-Scale Attention UNet3+ with New Supervised Prototypical Contrastive Loss for Coronary DSA Image Segmentation

Apr 07, 2025