Joohwan Kim

NitroGen: An Open Foundation Model for Generalist Gaming Agents

Jan 04, 2026

Loïc Magne, Anas Awadalla, Guanzhi Wang, Yinzhen Xu, Joshua Belofsky, Fengyuan Hu, Joohwan Kim, Ludwig Schmidt, Georgia Gkioxari, Jan Kautz (+4 more)

Abstract: We introduce NitroGen, a vision-action foundation model for generalist gaming agents that is trained on 40,000 hours of gameplay videos across more than 1,000 games. We incorporate three key ingredients: 1) an internet-scale video-action dataset constructed by automatically extracting player actions from publicly available gameplay videos, 2) a multi-game benchmark environment that can measure cross-game generalization, and 3) a unified vision-action model trained with large-scale behavior cloning. NitroGen exhibits strong competence across diverse domains, including combat encounters in 3D action games, high-precision control in 2D platformers, and exploration in procedurally generated worlds. It transfers effectively to unseen games, achieving up to 52% relative improvement in task success rates over models trained from scratch. We release the dataset, evaluation suite, and model weights to advance research on generalist embodied agents.

* 16 pages, 7 figures

Learning to Move Like Professional Counter-Strike Players

Aug 25, 2024

David Durst, Feng Xie, Vishnu Sarukkai, Brennan Shacklett, Iuri Frosio, Chen Tessler, Joohwan Kim, Carly Taylor, Gilbert Bernstein, Sanjiban Choudhury (+2 more)

Abstract: In multiplayer, first-person shooter games like Counter-Strike: Global Offensive (CS:GO), coordinated movement is a critical component of high-level strategic play. However, the complexity of team coordination and the variety of conditions present in popular game maps make it impractical to author hand-crafted movement policies for every scenario. We show that it is possible to take a data-driven approach to creating human-like movement controllers for CS:GO. We curate a team movement dataset comprising 123 hours of professional game play traces, and use this dataset to train a transformer-based movement model that generates human-like team movement for all players in a "Retakes" round of the game. Importantly, the movement prediction model is efficient. Performing inference for all players takes less than 0.5 ms per game step (amortized cost) on a single CPU core, making it plausible for use in commercial games today. Human evaluators assess that our model behaves more like humans than both commercially-available bots and procedural movement controllers scripted by experts (16% to 59% higher by TrueSkill rating of "human-like"). Using experiments involving in-game bot vs. bot self-play, we demonstrate that our model performs simple forms of teamwork, makes fewer common movement mistakes, and yields movement distributions, player lifetimes, and kill locations similar to those observed in professional CS:GO match play.

* ACM SIGGRAPH / Eurographics Symposium on Computer Animation (SCA), August 21-23, 2024, Montreal, Canada
* The project website is at https://davidbdurst.com/mlmove/

Noise-Aware Saliency Prediction for Videos with Incomplete Gaze Data

Apr 16, 2021

Ekta Prashnani, Orazio Gallo, Joohwan Kim, Josef Spjut, Pradeep Sen, Iuri Frosio

Abstract: Deep-learning-based algorithms have led to impressive results in visual-saliency prediction, but the impact of noise in training gaze data has been largely overlooked. This issue is especially relevant for videos, where the gaze data tends to be incomplete, and thus noisier, compared to images. Therefore, we propose a noise-aware training (NAT) paradigm for visual-saliency prediction that quantifies the uncertainty arising from gaze data incompleteness and inaccuracy, and accounts for it in training. We demonstrate the advantage of NAT independently of the adopted model architecture, loss function, or training dataset. Given its robustness to the noise in incomplete training datasets, NAT ushers in the possibility of designing gaze datasets with fewer human subjects. We also introduce the first dataset that offers a video-game context for video-saliency research, with rich temporal semantics, and multiple gaze attractors per frame.

Robust Vision-Based Cheat Detection in Competitive Gaming

Mar 27, 2021

Aditya Jonnalagadda, Iuri Frosio, Seth Schneider, Morgan McGuire, Joohwan Kim

Abstract: Game publishers and anti-cheat companies have been unsuccessful in blocking cheating in online gaming. We propose a novel, vision-based approach that captures the final state of the frame buffer and detects illicit overlays. To this aim, we train and evaluate a DNN detector on a new dataset, collected using two first-person shooter games and three cheating software. We study the advantages and disadvantages of different DNN architectures operating on a local or global scale. We use output confidence analysis to avoid unreliable detections and inform when network retraining is required. In an ablation study, we show how to use Interval Bound Propagation to build a detector that is also resistant to potential adversarial attacks and study its interaction with confidence analysis. Our results show that robust and effective anti-cheating through machine learning is practically feasible and can be used to guarantee fair play in online gaming.

* 17 pages, 4 figures