Abstract:Public datasets such as DLC-2021, SynID, and KID34K have significantly contributed to research on presentation attack detection for identity documents, including screen replay attacks. However, evaluation of out-of-domain (OOD) robustness remains insufficiently explored, especially under realistic domain shifts. In this work, we introduce Receipt Replay OOD, a small out-of-domain benchmark for screen replay detection. Receipts share several characteristics with identity documents, including planar geometry, curved corners, wear-and-tear artifacts, and text or logo patterns, while avoiding personally identifiable information constraints commonly associated with identity documents. We evaluate document replay detection models under cross-domain conditions and demonstrate the impact of domain shift on generalization performance. The dataset is publicly available.
Abstract:Generative image models have recently shown significant progress in image realism, leading to public concerns about their potential misuse for document forgery. This paper explores whether contemporary open-source and publicly accessible diffusion-based generative models can produce identity document forgeries that could realistically bypass human or automated verification systems. We evaluate text-to-image and image-to-image generation pipelines using multiple publicly available generative model families, including Stable Diffusion, Qwen, Flux, Nano-Banana, and others. The findings indicate that while current generative models can simulate surface-level document aesthetics, they fail to reproduce structural and forensic authenticity. Consequently, the risk of generative identity document deepfakes achieving forensic-level authenticity may be overestimated, underscoring the value of collaboration between machine learning practitioners and document-forensics experts in realistic risk assessment.
Abstract:In this article, we explore the use of various matrix norms for optimizing functions of weight matrices, a crucial problem in training large language models. Moving beyond the spectral norm underlying the Muon update, we leverage duals of the Ky Fan $k$-norms to introduce a family of Muon-like algorithms we name Fanions, which are closely related to Dion. By working with duals of convex combinations of the Ky Fan $k$-norms with either the Frobenius norm or the $l_\infty$ norm, we construct the families of F-Fanions and S-Fanions, respectively. Their most prominent members are F-Muon and S-Muon. We complement our theoretical analysis with an extensive empirical study of these algorithms across a wide range of tasks and settings, demonstrating that F-Muon and S-Muon consistently match Muon's performance, while outperforming vanilla Muon on a synthetic linear least squares problem.