Taihao Li

Follow the Clues, Frame the Truth: Hybrid-evidential Deductive Reasoning in Open-Vocabulary Multimodal Emotion Recognition

Mar 17, 2026

ProtRLSearch: A Multi-Round Multimodal Protein Search Agent with Large Language Models Trained via Reinforcement Learning

Mar 02, 2026

From Coarse to Nuanced: Cross-Modal Alignment of Fine-Grained Linguistic Cues and Visual Salient Regions for Dynamic Emotion Recognition

Jul 16, 2025

CORECODE: A Common Sense Annotated Dialogue Dataset with Benchmark Tasks for Chinese Large Language Models

Dec 20, 2023

RedCore: Relative Advantage Aware Cross-modal Representation Learning for Missing Modalities with Imbalanced Missing Rates

Dec 16, 2023

ShapeGPT: 3D Shape Generation with A Unified Multi-modal Language Model

Dec 01, 2023

Frame Pairwise Distance Loss for Weakly-supervised Sound Event Detection

Sep 21, 2023

Vote2Cap-DETR++: Decoupling Localization and Describing for End-to-End 3D Dense Captioning

Sep 06, 2023

Disentangling Prosody Representations with Unsupervised Speech Reconstruction

Dec 14, 2022

Parameter-Efficient Tuning on Layer Normalization for Pre-trained Language Models

Dec 09, 2022