Xiongkuo Min

LMM4LMM: Benchmarking and Evaluating Large-multimodal Image Generation with LMMs

Apr 11, 2025
Via arXiv

Q-Agent: Quality-Driven Chain-of-Thought Image Restoration Agent through Robust Multimodal Large Language Model

Apr 09, 2025
Via arXiv

Mesh Mamba: A Unified State Space Model for Saliency Prediction in Non-Textured and Textured Meshes

Apr 02, 2025
Via arXiv

Mitigating Low-Level Visual Hallucinations Requires Self-Awareness: Database, Model and Training Strategy

Mar 27, 2025
Via arXiv

Information Density Principle for MLLM Benchmarks

Mar 13, 2025
Via arXiv

Image Quality Assessment: From Human to Machine Preference

Mar 13, 2025
Via arXiv

Q-Eval-100K: Evaluating Visual Quality and Alignment Level for Text-to-Vision Content

Mar 05, 2025
Via arXiv

Multi-Dimensional Quality Assessment for Text-to-3D Assets: Dataset and Model

Feb 24, 2025
Via arXiv

AGAV-Rater: Adapting Large Multimodal Model for AI-Generated Audio-Visual Quality Assessment

Jan 30, 2025
Via arXiv

Facial Attractiveness Prediction in Live Streaming: A New Benchmark and Multi-modal Method

Jan 05, 2025
Via arXiv