Houdong Hu

Training Small Multimodal Models to Bridge Biomedical Competency Gap: A Case Study in Radiology Imaging

Mar 20, 2024

Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks

Nov 10, 2023

ELEVATER: A Benchmark and Toolkit for Evaluating Language-Augmented Visual Models

Apr 20, 2022

MMPTRACK: Large-scale Densely Annotated Multi-camera Multiple People Tracking Benchmark

Nov 30, 2021

Florence: A New Foundation Model for Computer Vision

Nov 22, 2021

Image Scene Graph Generation (SGG) Benchmark

Jul 27, 2021

Oscar: Object-Semantics Aligned Pre-training for Vision-Language Tasks

May 18, 2020

Applications of Generative Adversarial Models in Visual Search Reformulation

Oct 28, 2019

Unified Vision-Language Pre-Training for Image Captioning and VQA

Oct 03, 2019

Learning Visual Relation Priors for Image-Text Matching and Image Captioning with Neural Scene Graph Generators

Sep 22, 2019