Yaoting Wang

School of Computer Science and Technology, Tiangong University; Tianjin Key Laboratory of Autonomous Intelligence Technology and Systems

On Path to Multimodal Generalist: General-Level and General-Bench

May 07, 2025

Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey

Mar 16, 2025

AVTrustBench: Assessing and Enhancing Reliability and Robustness in Audio-Visual LLMs

Jan 03, 2025

Stepping Stones: A Progressive Training Strategy for Audio-Visual Semantic Segmentation

Jul 16, 2024

Ref-AVS: Refer and Segment Objects in Audio-Visual Scenes

Jul 15, 2024

Can Textual Semantics Mitigate Sounding Object Segmentation Preference?

Jul 15, 2024

Prompting Segmentation with Sound is Generalizable Audio-Visual Source Localizer

Sep 18, 2023

Cross-Attention is Not Enough: Incongruity-Aware Multimodal Sentiment Analysis and Emotion Recognition

May 23, 2023

Aesthetic Quality Assessment for Group photograph

Feb 04, 2020