Alan Yuille

Johns Hopkins University

ExoViP: Step-by-step Verification and Exploration with Exoskeleton Modules for Compositional Visual Reasoning

Aug 05, 2024

AbdomenAtlas: A Large-Scale, Detailed-Annotated, & Multi-Center Dataset for Efficient Transfer Learning and Open Algorithmic Benchmarking

Jul 23, 2024

Rethinking Video-Text Understanding: Retrieval from Counterfactually Augmented Data

Jul 18, 2024

iNeMo: Incremental Neural Mesh Models for Robust Class-Incremental Learning

Jul 12, 2024

CamFreeDiff: Camera-free Image to Panorama Generation with Diffusion Model

Jul 09, 2024

Embracing Massive Medical Data

Jul 05, 2024

LLaVolta: Efficient Multi-modal Models via Stage-wise Visual Context Compression

Jun 28, 2024

ImageNet3D: Towards General-Purpose Object-Level 3D Understanding

Jun 13, 2024

Autoregressive Pretraining with Mamba in Vision

Jun 11, 2024

Medical Vision Generalist: Unifying Medical Imaging Tasks in Context

Jun 08, 2024