
Zhe Chen

Smile upon the Face but Sadness in the Eyes: Emotion Recognition based on Facial Expressions and Eye Behaviors

Nov 19, 2024

Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization

Nov 15, 2024

Mini-InternVL: A Flexible-Transfer Pocket Multimodal Model with 5% Parameters and 90% Performance

Oct 21, 2024

MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding

Oct 15, 2024

Tracing Human Stress from Physiological Signals using UWB Radar

Oct 14, 2024

t-READi: Transformer-Powered Robust and Efficient Multimodal Inference for Autonomous Driving

Oct 13, 2024

Learning to Discover Generalized Facial Expressions

Sep 30, 2024

Towards Modality-agnostic Label-efficient Segmentation with Entropy-Regularized Distribution Alignment

Aug 29, 2024

ChemVLM: Exploring the Power of Multimodal Large Language Models in Chemistry Area

Aug 16, 2024

Seeing and Understanding: Bridging Vision with Chemical Knowledge Via ChemVLM

Aug 14, 2024