Gavin Costello

Navigates Like Me: Understanding How People Evaluate Human-Like AI in Video Games

Mar 02, 2023
Stephanie Milani, Arthur Juliani, Ida Momennejad, Raluca Georgescu, Jaroslaw Rzepecki, Alison Shaw, Gavin Costello, Fei Fang, Sam Devlin, Katja Hofmann

We aim to understand how people assess human likeness in navigation produced by people and artificially intelligent (AI) agents in a video game. To this end, we propose a novel AI agent with the goal of generating more human-like behavior. We collect hundreds of crowd-sourced assessments comparing the human-likeness of navigation behavior generated by our agent and baseline AI agents with human-generated behavior. Our proposed agent passes a Turing Test, while the baseline agents do not. By passing a Turing Test, we mean that human judges could not quantitatively distinguish between videos of a person and an AI agent navigating. To understand what people believe constitutes human-like navigation, we extensively analyze the justifications of these assessments. This work provides insights into the characteristics that people consider human-like in the context of goal-directed video game navigation, which is a key step for further improving human interactions with AI agents.

* 18 pages; accepted at CHI 2023 

Navigation Turing Test (NTT): Learning to Evaluate Human-Like Navigation

May 20, 2021
Sam Devlin, Raluca Georgescu, Ida Momennejad, Jaroslaw Rzepecki, Evelyn Zuniga, Gavin Costello, Guy Leroy, Ali Shaw, Katja Hofmann

A key challenge on the path to developing agents that learn complex human-like behavior is the need to quickly and accurately quantify human-likeness. While human assessments of such behavior can be highly accurate, speed and scalability are limited. We address these limitations through a novel automated Navigation Turing Test (ANTT) that learns to predict human judgments of human-likeness. We demonstrate the effectiveness of our automated NTT on a navigation task in a complex 3D environment. We investigate six classification models to shed light on the types of architectures best suited to this task, and validate them against data collected through a human NTT. Our best models achieve high accuracy when distinguishing true human and agent behavior. At the same time, we show that predicting finer-grained human assessment of agents' progress towards human-like behavior remains unsolved. Our work takes an important step towards agents that more effectively learn complex human-like behavior.