Mingfei Gao

4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities

Jun 14, 2024
Via arXiv

4M: Massively Multimodal Masked Modeling

Dec 11, 2023
Via arXiv

Mask-free OVIS: Open-Vocabulary Instance Segmentation without Manual Mask Annotations

Mar 29, 2023
Via arXiv

ULIP: Learning Unified Representation of Language, Image and Point Cloud for 3D Understanding

Dec 10, 2022
Via arXiv

TAG: Boosting Text-VQA via Text-aware Visual Question-answer Generation

Aug 14, 2022
Via arXiv

Value Retrieval with Arbitrary Queries for Form-like Documents

Dec 15, 2021
Via arXiv

Burn After Reading: Online Adaptation for Cross-domain Streaming Data

Dec 08, 2021
Via arXiv

Towards Open Vocabulary Object Detection without Human-provided Bounding Boxes

Nov 18, 2021
Via arXiv

Robustness Evaluation of Transformer-based Form Field Extractors via Form Attacks

Oct 08, 2021
Via arXiv

Field Extraction from Forms with Unlabeled Data

Oct 08, 2021
Via arXiv