Guoxin Wang

PYRA: Parallel Yielding Re-Activation for Training-Inference Efficient Task Adaptation

Mar 14, 2024

A Bi-Pyramid Multimodal Fusion Method for the Diagnosis of Bipolar Disorders

Jan 15, 2024

Unsupervised Pre-Training Using Masked Autoencoders for ECG Analysis

Oct 17, 2023

Multi-Dimension-Embedding-Aware Modality Fusion Transformer for Psychiatric Disorder Classification

Oct 04, 2023

Kosmos-2.5: A Multimodal Literate Model

Sep 20, 2023

Unifying Vision, Text, and Layout for Universal Document Processing

Dec 20, 2022

Understanding Long Documents with Different Position-Aware Attentions

Aug 17, 2022

BoningKnife: Joint Entity Mention Detection and Typing for Nested NER via prior Boundary Knowledge

Jul 20, 2021

Understanding Chinese Video and Language via Contrastive Multimodal Pre-Training

Apr 19, 2021

LayoutXLM: Multimodal Pre-training for Multilingual Visually-rich Document Understanding

Apr 18, 2021