Abstract:Visual glitches in video games degrade player experience and perceived quality, yet manual quality assurance cannot scale to the growing test surface of modern game development. Prior automation efforts, particularly those using vision-language models (VLMs), largely operate on single frames or rely on limited video-level baselines that struggle under realistic scene variation, making robust video-level glitch detection challenging. We present RESP, a practical multi-frame framework for gameplay glitch detection with VLMs. Our key idea is reference-guided prompting: for each test frame, we select a reference frame from earlier in the same video, establishing a visual baseline and reframing detection as within-video comparison rather than isolated classification. RESP sequentially prompts the VLM with reference/test pairs and aggregates noisy frame predictions into a stable video-level decision without fine-tuning the VLM. To enable controlled analysis of reference effects, we introduce RefGlitch, a synthetic dataset of manually labeled reference/test frame pairs with balanced coverage across five glitch types. Experiments across five VLMs and three datasets (one synthetic, two real-world) show that reference guidance consistently strengthens frame-level detection and that the improved frame-level evidence reliably transfers to stronger video-level triage under realistic QA conditions. Code and data are available at: \href{https://github.com/PipiZong/RESP_code.git}{this https URL}.


Abstract:We present World of Bugs (WOB), an open platform that aims to support Automated Bug Detection (ABD) research in video games. We discuss some open problems in ABD and how they relate to the platform's design, arguing that learning-based solutions are required if further progress is to be made. The platform's key feature is a growing collection of common video game bugs that may be used for training and evaluating ABD approaches.




Abstract:Automated Bug Detection (ABD) in video games is composed of two distinct but complementary problems: automated game exploration and bug identification. Automated game exploration has received much recent attention, spurred on by developments in fields such as reinforcement learning. The complementary problem of identifying the bugs present in a player's experience has for the most part relied on the manual specification of rules. Although it is widely recognised that many bugs of interest cannot be identified with such methods, little progress has been made in this direction. In this work we show that it is possible to identify a range of perceptual bugs using learning-based methods by making use of only the rendered game screen as seen by the player. To support our work, we have developed World of Bugs (WOB) an open platform for testing ABD methods in 3D game environments.




Abstract:With the aim of designing automated tools that assist in the video game quality assurance process, we frame the problem of identifying bugs in video games as an anomaly detection (AD) problem. We develop State-State Siamese Networks (S3N) as an efficient deep metric learning approach to AD in this context and explore how it may be used as part of an automated testing tool. Finally, we show by empirical evaluation on a series of Atari games, that S3N is able to learn a meaningful embedding, and consequently is able to identify various common types of video game bugs.