Xiaojiang Peng

FlexEdit: Marrying Free-Shape Masks to VLLM for Flexible Image Editing

Aug 22, 2024
Via arXiv

SZTU-CMU at MER2024: Improving Emotion-LLaMA with Conv-Attention for Multimodal Emotion Recognition

Aug 21, 2024
Via arXiv

DSMix: Distortion-Induced Sensitivity Map Based Pre-training for No-Reference Image Quality Assessment

Jul 04, 2024
Via arXiv

Emotion-LLaMA: Multimodal Emotion Recognition and Reasoning with Instruction Tuning

Jun 17, 2024
Via arXiv

Dataset Growth

May 28, 2024
Via arXiv

A Closer Look at Time Steps is Worthy of Triple Speed-Up for Diffusion Model Training

May 27, 2024
Via arXiv

MM-TTS: A Unified Framework for Multimodal, Prompt-Induced Emotional Text-to-Speech Synthesis

Apr 29, 2024
Via arXiv

Two in One Go: Single-stage Emotion Recognition with Decoupled Subject-context Transformer

Apr 29, 2024
Via arXiv

LEAF: Unveiling Two Sides of the Same Coin in Semi-supervised Facial Expression Recognition

Apr 26, 2024
Via arXiv

MIPS at SemEval-2024 Task 3: Multimodal Emotion-Cause Pair Extraction in Conversations with Multimodal Language Models

Apr 11, 2024
Via arXiv