Abstract: Continuous valence-arousal estimation in real-world environments is challenging due to inconsistent modality reliability and interaction-dependent variability in audio-visual signals. Existing approaches primarily focus on modeling temporal dynamics, often overlooking the fact that modality reliability can vary substantially across interaction stages. To address this issue, we propose SAGE, a Stage-Adaptive reliability modeling framework that explicitly estimates and calibrates modality-wise confidence during multimodal integration. SAGE introduces a reliability-aware fusion mechanism that dynamically rebalances audio and visual representations according to their stage-dependent informativeness, preventing unreliable signals from dominating the prediction process. By separating reliability estimation from feature representation, the proposed framework enables more stable emotion estimation under cross-modal noise, occlusion, and varying interaction conditions. Extensive experiments on the Aff-Wild2 benchmark demonstrate that SAGE consistently improves concordance correlation coefficient scores compared with existing multimodal fusion approaches, highlighting the effectiveness of reliability-driven modeling for continuous affect prediction.
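To make the reliability-aware fusion idea concrete, below is a minimal PyTorch sketch of confidence-weighted audio-visual fusion. It is an illustrative assumption, not the SAGE implementation: all names (ReliabilityAwareFusion, audio_conf, visual_conf) are hypothetical, and the sketch shows only the core idea of estimating per-time-step modality confidence with heads kept separate from the feature encoders, then rebalancing the two streams before regression.

```python
import torch
import torch.nn as nn


class ReliabilityAwareFusion(nn.Module):
    """Sketch: confidence-weighted audio-visual fusion for valence-arousal."""

    def __init__(self, dim: int):
        super().__init__()
        # Reliability heads are separate from the feature encoders,
        # mirroring the separation of reliability estimation from
        # feature representation described in the abstract.
        self.audio_conf = nn.Linear(dim, 1)
        self.visual_conf = nn.Linear(dim, 1)
        self.regressor = nn.Linear(dim, 2)  # valence and arousal outputs

    def forward(self, audio: torch.Tensor, visual: torch.Tensor):
        # audio, visual: (batch, time, dim) frame-aligned features.
        logits = torch.cat([self.audio_conf(audio),
                            self.visual_conf(visual)], dim=-1)
        # Softmax keeps the two per-step confidences on a shared scale.
        weights = torch.softmax(logits, dim=-1)  # (batch, time, 2)
        fused = weights[..., 0:1] * audio + weights[..., 1:2] * visual
        return self.regressor(fused), weights


# Usage: preds is (batch, time, 2); weights exposes the modality balance.
model = ReliabilityAwareFusion(dim=256)
preds, weights = model(torch.randn(2, 100, 256), torch.randn(2, 100, 256))
```

Normalizing the two confidence logits with a softmax is one simple way to prevent a degraded modality (e.g., an occluded face) from dominating the fused representation; the actual stage-adaptive calibration in SAGE may differ.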




Abstract: Early detection plays a crucial role in the treatment of depression. Numerous studies have therefore focused on social media platforms, where individuals express their emotions, with the aim of detecting depression early. However, most existing approaches rely on features specific to one data type, which limits their scalability across different kinds of social media data, such as text, images, or videos. To overcome this limitation, we introduce the Multimodal Object-Oriented Graph Attention Model (MOGAM), which can be applied to diverse types of data and thus offers a more scalable and versatile solution. To ensure that our model captures authentic symptoms of depression, we include only vlogs from users with a clinical diagnosis. To leverage the diverse features of vlogs, we adopt a multimodal approach and collect additional metadata such as the title, description, and duration of each vlog, and we aggregate these multimodal features with a cross-attention mechanism. On the collected vlog dataset, MOGAM achieved an accuracy of 0.871 and an F1-score of 0.888. Moreover, to validate its scalability, we evaluated MOGAM on a benchmark dataset, where it achieved results comparable with prior studies (0.61 F1-score). In conclusion, we believe that the proposed model, MOGAM, is an effective solution for detecting depression on social media, with potential benefits for the early detection and treatment of this mental health condition.
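As a rough illustration of the cross-attention aggregation step, here is a minimal PyTorch sketch. It is a sketch under stated assumptions, not the authors' MOGAM implementation: the names (CrossAttentionAggregator, text_feats, video_feats) are hypothetical, and it assumes the textual metadata features act as queries over visual features.

```python
import torch
import torch.nn as nn


class CrossAttentionAggregator(nn.Module):
    """Sketch: cross-attention over per-modality vlog features."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.classifier = nn.Linear(dim, 2)  # depressed vs. control logits

    def forward(self, text_feats: torch.Tensor, video_feats: torch.Tensor):
        # text_feats:  (batch, n_tokens, dim), e.g. title/description embeddings
        # video_feats: (batch, n_nodes, dim), e.g. visual node features
        # Text queries attend over visual keys/values, letting the
        # metadata select the most relevant visual content.
        attended, _ = self.attn(query=text_feats,
                                key=video_feats,
                                value=video_feats)
        pooled = attended.mean(dim=1)   # one vector per vlog
        return self.classifier(pooled)  # class logits


# Usage: logits has shape (batch, 2).
model = CrossAttentionAggregator(dim=128)
logits = model(torch.randn(4, 16, 128), torch.randn(4, 32, 128))
```

Letting one modality query the other is a common way to fuse streams of unequal length and granularity; whether MOGAM attends text-to-video, video-to-text, or symmetrically is not specified in the abstract.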