Samuel Albanie

Michael Pokorny

A SOUND APPROACH: Using Large Language Models to generate audio descriptions for egocentric text-audio retrieval

Feb 29, 2024
Via arXiv

Lifelong Benchmarks: Efficient Model Evaluation in an Era of Rapid Progress

Feb 29, 2024
Via arXiv

InstructVideo: Instructing Video Diffusion Models with Human Feedback

Dec 19, 2023
Via arXiv

Charting New Territories: Exploring the Geographic and Geospatial Capabilities of Multimodal LLMs

Nov 30, 2023
Via arXiv

Visual Data-Type Understanding does not emerge from Scaling Vision-Language Models

Oct 16, 2023
Via arXiv

Simple Baselines for Interactive Video Retrieval with Questions and Answers

Aug 21, 2023
Via arXiv

RLIPv2: Fast Scaling of Relational Language-Image Pre-training

Aug 18, 2023
Via arXiv

arXiVeri: Automatic table verification with GPT

Jun 13, 2023
Via arXiv

GPT4GEO: How a Language Model Sees the World's Geography

May 30, 2023
Via arXiv

Zero-shot Unsupervised Transfer Instance Segmentation

Apr 27, 2023
Via arXiv