Fei Su

R1-ShareVL: Incentivizing Reasoning Capability of Multimodal Large Language Models via Share-GRPO

May 22, 2025

Multi-Channel Sequence-to-Sequence Neural Diarization: Experimental Results for The MISP 2025 Challenge

May 22, 2025

Beyond Time: Cross-Dimensional Frequency Supervision for Time Series Forecasting

May 16, 2025

CPAny: Couple With Any Encoder to Refer Multi-Object Tracking

Mar 10, 2025

Data-Efficient Generalization for Zero-shot Composed Image Retrieval

Mar 07, 2025

ICFNet: Integrated Cross-modal Fusion Network for Survival Prediction

Jan 06, 2025

Filter or Compensate: Towards Invariant Representation from Distribution Shift for Anomaly Detection

Dec 13, 2024

UFO: Enhancing Diffusion-Based Video Generation with a Uniform Frame Organizer

Dec 12, 2024

MoTaDual: Modality-Task Dual Alignment for Enhanced Zero-shot Composed Image Retrieval

Oct 31, 2024

Contactless Fingerprint Recognition Using 3D Graph Matching

Sep 13, 2024