Fei Ma

Active Multimodal Distillation for Few-shot Action Recognition

Jun 16, 2025
Via arXiv

VReST: Enhancing Reasoning in Large Vision-Language Models through Tree Search and Self-Reward Mechanism

Jun 10, 2025
Via arXiv

SatelliteFormula: Multi-Modal Symbolic Regression from Remote Sensing Imagery for Physics Discovery

Jun 06, 2025
Via arXiv

Universal Visuo-Tactile Video Understanding for Embodied Interaction

May 28, 2025
Via arXiv

Beyond Empathy: Integrating Diagnostic and Therapeutic Reasoning with Large Language Models for Mental Health Counseling

May 21, 2025
Via arXiv

RD-UIE: Relation-Driven State Space Modeling for Underwater Image Enhancement

May 02, 2025
Via arXiv

Hierarchical Attention Fusion of Visual and Textual Representations for Cross-Domain Sequential Recommendation

Apr 21, 2025
Via arXiv

MuseFace: Text-driven Face Editing via Diffusion-based Mask Generation Approach

Mar 31, 2025
Via arXiv

Object Isolated Attention for Consistent Story Visualization

Mar 30, 2025
Via arXiv

UniSync: A Unified Framework for Audio-Visual Synchronization

Mar 20, 2025
Via arXiv