Bo Wang

Tencent, WeChat Pay

Exploring the Design Space of 3D MLLMs for CT Report Generation

Jun 26, 2025

Commander-GPT: Dividing and Routing for Multimodal Sarcasm Detection

Jun 24, 2025

jina-embeddings-v4: Universal Embeddings for Multimodal Multilingual Retrieval

Jun 24, 2025

NTIRE 2025 Image Shadow Removal Challenge Report

Jun 18, 2025

GenBreak: Red Teaming Text-to-Image Generators Using Large Language Models

Jun 11, 2025

GUI-Reflection: Empowering Multimodal GUI Models with Self-Reflection Behavior

Jun 09, 2025

Decoding Knowledge Attribution in Mixture-of-Experts: A Framework of Basic-Refinement Collaboration and Efficiency Analysis

May 30, 2025

BioReason: Incentivizing Multimodal Biological Reasoning within a DNA-LLM Model

May 29, 2025

Frame-Level Captions for Long Video Generation with Complex Multi Scenes

May 27, 2025

REARANK: Reasoning Re-ranking Agent via Reinforcement Learning

May 26, 2025