Fei Ma

Hierarchical Attention Fusion of Visual and Textual Representations for Cross-Domain Sequential Recommendation

Apr 21, 2025
Via arXiv

MuseFace: Text-driven Face Editing via Diffusion-based Mask Generation Approach

Mar 31, 2025
Via arXiv

Object Isolated Attention for Consistent Story Visualization

Mar 30, 2025
Via arXiv

UniSync: A Unified Framework for Audio-Visual Synchronization

Mar 20, 2025
Via arXiv

Observation-Graph Interaction and Key-Detail Guidance for Vision and Language Navigation

Mar 14, 2025
Via arXiv

Exploring Embodied Multimodal Large Models: Development, Datasets, and Future Directions

Feb 21, 2025
Via arXiv

Inter3D: A Benchmark and Strong Baseline for Human-Interactive 3D Object Reconstruction

Feb 19, 2025
Via arXiv

EmoBench-M: Benchmarking Emotional Intelligence for Multimodal Large Language Models

Feb 06, 2025
Via arXiv

Frequency-aware Event Cloud Network

Dec 30, 2024
Via arXiv

Image Augmentation Agent for Weakly Supervised Semantic Segmentation

Dec 29, 2024
Via arXiv