Social Media Image Understanding


Towards Automated Community Notes Generation with Large Vision Language Models for Combating Contextual Deception

Mar 23, 2026
Via arXiv

Xray-Visual Models: Scaling Vision models on Industry Scale Data

Feb 18, 2026
Via arXiv

Multimodal Climate Disinformation Detection: Integrating Vision-Language Models with External Knowledge Sources

Jan 22, 2026
Via arXiv

Enhancing Meme Emotion Understanding with Multi-Level Modality Enhancement and Dual-Stage Modal Fusion

Nov 14, 2025
Via arXiv

Do Multimodal LLMs See Sentiment?

Aug 23, 2025
Via arXiv

From Pixels to Damage Severity: Estimating Earthquake Impacts Using Semantic Segmentation of Social Media Images

Jul 03, 2025
Via arXiv

KEVER^2: Knowledge-Enhanced Visual Emotion Reasoning and Retrieval

May 30, 2025
Via arXiv

SNS-Bench-VL: Benchmarking Multimodal Large Language Models in Social Networking Services

May 29, 2025
Via arXiv

LENS: Multi-level Evaluation of Multimodal Reasoning with Large Language Models

May 21, 2025
Via arXiv

MT$^{3}$: Scaling MLLM-based Text Image Machine Translation via Multi-Task Reinforcement Learning

May 26, 2025
Via arXiv