Multi Modal Document Classification


Towards Scalable and Cross-Lingual Specialist Language Models for Oncology

Mar 11, 2025
Via arXiv

Predicting Cardiopulmonary Exercise Testing Outcomes in Congenital Heart Disease Through Multi-modal Data Integration and Geometric Learning

Mar 18, 2025
Via arXiv

WordVIS: A Color Worth A Thousand Words

Dec 13, 2024
Via arXiv

Deep BI-RADS Network for Improved Cancer Detection from Mammograms

Nov 16, 2024
Via arXiv

Out-of-Distribution Detection with Attention Head Masking for Multimodal Document Classification

Aug 20, 2024
Via arXiv

Hierarchical Multi-modal Transformer for Cross-modal Long Document Classification

Jul 14, 2024
Via arXiv

FinEmbedDiff: A Cost-Effective Approach of Classifying Financial Documents with Vector Sampling using Multi-modal Embedding Models

May 28, 2024
Via arXiv

FungiTastic: A multi-modal dataset and benchmark for image categorization

Aug 24, 2024
Via arXiv

DocGenome: An Open Large-scale Scientific Document Benchmark for Training and Testing Multi-modal Large Language Models

Jun 17, 2024
Via arXiv

BuDDIE: A Business Document Dataset for Multi-task Information Extraction

Apr 05, 2024
Via arXiv