Bin Li

Member, IEEE

MentalMAC: Enhancing Large Language Models for Detecting Mental Manipulation via Multi-Task Anti-Curriculum Distillation

May 21, 2025
Via arXiv

Overview of the NLPCC 2025 Shared Task 4: Multi-modal, Multilingual, and Multi-hop Medical Instructional Video Question Answering Challenge

May 11, 2025
Via arXiv

Emotion-Qwen: Training Hybrid Experts for Unified Emotion and General Vision-Language Understanding

May 10, 2025
Via arXiv

PICD: Versatile Perceptual Image Compression with Diffusion Rendering

May 09, 2025
Via arXiv

ReGraP-LLaVA: Reasoning enabled Graph-based Personalized Large Language and Vision Assistant

May 06, 2025
Via arXiv

XeMap: Contextual Referring in Large-Scale Remote Sensing Environments

Apr 30, 2025
Via arXiv

Ask2Loc: Learning to Locate Instructional Visual Answers by Asking Questions

Apr 22, 2025
Via arXiv

Hierarchical Modeling for Medical Visual Question Answering with Cross-Attention Fusion

Apr 10, 2025
Via arXiv

Event Signal Filtering via Probability Flux Estimation

Apr 10, 2025
Via arXiv

EffOWT: Transfer Visual Language Models to Open-World Tracking Efficiently and Effectively

Apr 09, 2025
Via arXiv