Peng Zhang

Are MLMs Trapped in the Visual Room?

May 29, 2025
Via arXiv

Emotion-o1: Adaptive Long Reasoning for Emotion Understanding in LLMs

May 28, 2025
Via arXiv

Exploring Timeline Control for Facial Motion Generation

May 27, 2025
Via arXiv

Bidirectional Knowledge Distillation for Enhancing Sequential Recommendation with Large Language Models

May 23, 2025
Via arXiv

Ultrasound Report Generation with Multimodal Large Language Models for Standardized Texts

May 13, 2025
Via arXiv

Advancing Generalizable Tumor Segmentation with Anomaly-Aware Open-Vocabulary Attention Maps and Frozen Foundation Diffusion Models

May 05, 2025
Via arXiv

Fast-Slow Thinking for Large Vision-Language Model Reasoning

Apr 25, 2025
Via arXiv

FedCIA: Federated Collaborative Information Aggregation for Privacy-Preserving Recommendation

Apr 19, 2025
Via arXiv

Self-Supervised Enhancement of Forward-Looking Sonar Images: Bridging Cross-Modal Degradation Gaps through Feature Space Transformation and Multi-Frame Fusion

Apr 16, 2025
Via arXiv

Span-level Emotion-Cause-Category Triplet Extraction with Instruction Tuning LLMs and Data Augmentation

Apr 13, 2025
Via arXiv