Ali Farhadi

From an Image to a Scene: Learning to Imagine the World from a Million 360 Videos

Dec 10, 2024
Via arXiv

ActionAtlas: A VideoQA Benchmark for Domain-specialized Action Recognition

Oct 08, 2024
Via arXiv

Learning to Build by Building Your Own Instructions

Oct 01, 2024
Via arXiv

FLaRe: Achieving Masterful and Adaptive Robot Policies with Large-Scale Reinforcement Learning Fine-Tuning

Sep 25, 2024
Via arXiv

Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Multimodal Models

Sep 25, 2024
Via arXiv

OLMoE: Open Mixture-of-Experts Language Models

Sep 03, 2024
Via arXiv

Task Me Anything

Jun 17, 2024
Via arXiv

Superposed Decoding: Multiple Generations from a Single Autoregressive Inference Pass

May 29, 2024
Via arXiv

Localized Symbolic Knowledge Distillation for Visual Commonsense Models

Dec 12, 2023
Via arXiv

Are "Hierarchical" Visual Representations Hierarchical?

Nov 23, 2023
Via arXiv