Bo Zhang

ETSCL: An Evidence Theory-Based Supervised Contrastive Learning Framework for Multi-modal Glaucoma Grading

Jul 19, 2024
Via arXiv

KUNPENG: An Embodied Large Model for Intelligent Maritime

Jul 12, 2024
Via arXiv

Edge-guided and Cross-scale Feature Fusion Network for Efficient Multi-contrast MRI Super-Resolution

Jul 07, 2024
Via arXiv

SUPER: Seated Upper Body Pose Estimation using mmWave Radars

Jul 02, 2024
Via arXiv

A Comprehensive Taxonomy and Analysis of Talking Head Synthesis: Techniques for Portrait Generation, Driving Mechanisms, and Editing

Jun 18, 2024
Via arXiv

DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Language Models

Jun 17, 2024
Via arXiv

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

Jun 13, 2024
Via arXiv

Research on Early Warning Model of Cardiovascular Disease Based on Computer Deep Learning

Jun 13, 2024
Via arXiv

OmniCorpus: An Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

Jun 12, 2024
Via arXiv

MLVU: A Comprehensive Benchmark for Multi-Task Long Video Understanding

Jun 06, 2024
Via arXiv