Bogdan Georgiev

MaD Physics: Evaluating information seeking under constraints in physical environments

May 11, 2026

Moksh Jain, Mehdi Bennani, Johannes Bausch, Yuri Chervonyi, Bogdan Georgiev, Simon Osindero, Nenad Tomašev

Abstract:Scientific discovery is fundamentally a resource-constrained process that requires navigating complex trade-offs between the quality and quantity of measurements due to physical and cost constraints. Measurements drive the scientific process by revealing novel phenomena to improve our understanding. Existing benchmarks for evaluating agents for scientific discovery focus on either static knowledge-based reasoning or unconstrained experimental design tasks, and do not capture the ability to make measurements and plan under constraints. To bridge this gap, we propose Measuring and Discovering Physics (MaD Physics), a benchmark to evaluate the ability of agents to make informative measurements and draw conclusions subject to constraints on the quality and quantity of measurements. The benchmark consists of three environments, each based on a distinct physical law. To mitigate contamination from existing knowledge, MaD Physics includes altered physical laws. In each trial, the agent makes measurements of the system until it exhausts an allotted budget; it then has to infer the underlying physical law to make predictions about the future state of the system. MaD Physics evaluates two fundamental capabilities of scientific agents: inferring models from data and planning under constraints. We also demonstrate how MaD Physics can be used to evaluate other capabilities such as multimodality and in-context learning. We benchmark agents on MaD Physics using four Gemini models (2.5 Flash Lite, 2.5 Flash, 2.5 Pro, and 3 Flash), identifying shortcomings in their structured exploration and data collection capabilities and highlighting directions to improve their scientific reasoning.

* 64 pages, 10 figures. Project page: https://mad-physics.github.io/

AI Co-Mathematician: Accelerating Mathematicians with Agentic AI

May 07, 2026

Daniel Zheng, Ingrid von Glehn, Yori Zwols, Iuliya Beloshapka, Lars Buesing, Daniel M. Roy, Martin Wattenberg, Bogdan Georgiev, Tatiana Schmidt, Andrew Cowie(+8 more)

Abstract:We introduce the AI co-mathematician, a workbench for mathematicians to interactively leverage AI agents to pursue open-ended research. The AI co-mathematician is optimized to provide holistic support for the exploratory and iterative reality of mathematical workflows, including ideation, literature search, computational exploration, theorem proving and theory building. By providing an asynchronous, stateful workspace that manages uncertainty, refines user intent, tracks failed hypotheses, and outputs native mathematical artifacts, the system mirrors human collaborative workflows. In early tests, the AI co-mathematician helped researchers solve open problems, identify new research directions, and uncover overlooked literature references. Besides demonstrating a highly interactive paradigm for AI-assisted mathematical discovery, the AI co-mathematician also achieves state-of-the-art results on hard problem-solving benchmarks, including scoring 48% on FrontierMath Tier 4, a new high score among all AI systems evaluated.

* 22 pages

Combining expert knowledge and neural networks to model environmental stresses in agriculture

Oct 26, 2021

Kostadin Cvejoski, Jannis Schuecker, Anne-Katrin Mahlein, Bogdan Georgiev

Abstract:In this work we combine the representation learning capabilities of neural networks with agricultural knowledge from experts to model environmental heat and drought stresses. We first design deterministic expert models which serve as a benchmark and inform the design of flexible neural-network architectures. Finally, a sensitivity analysis of the latter allows a clustering of hybrids into susceptible and resistant ones.

* 19 pages, Winners of the 2019 Syngenta Crop Challenge

On the Impact of Stable Ranks in Deep Nets

Oct 05, 2021

Bogdan Georgiev, Lukas Franken, Mayukh Mukherjee, Georgios Arvanitidis

Abstract:A recent line of work has established intriguing connections between the generalization/compression properties of a deep neural network (DNN) model and the so-called layer weights' stable ranks. Intuitively, the latter are indicators of the effective number of parameters in the net. In this work, we address some natural questions regarding the space of DNNs conditioned on the layers' stable rank, where we study feed-forward dynamics, initialization, training and expressivity. To this end, we first propose a random DNN model with a new sampling scheme based on stable rank. Then, we show how feed-forward maps are affected by the constraint and how training evolves in the overparametrized regime (via Neural Tangent Kernels). Our results imply that stable ranks appear layerwise essentially as linear factors whose effect accumulates exponentially depthwise. Moreover, we provide empirical analysis suggesting that stable rank initialization alone can lead to convergence speed ups.

* 24 pages, 8 figures, comments welcome!
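
For readers unfamiliar with the quantity this abstract is built around: the stable rank of a weight matrix $W$ is commonly defined as $\|W\|_F^2 / \|W\|_2^2$, the squared Frobenius norm divided by the squared spectral norm. The sketch below computes it in plain Python using power iteration. It is an illustration of the standard definition only, not the sampling scheme or code from the paper, and the function names are hypothetical.

```python
import math
import random


def frobenius_norm_sq(W):
    """Squared Frobenius norm: the sum of squared entries of W."""
    return sum(x * x for row in W for x in row)


def spectral_norm_sq(W, iters=200, seed=0):
    """Largest eigenvalue of W^T W (the squared spectral norm), via power iteration."""
    rng = random.Random(seed)
    rows, cols = len(W), len(W[0])
    v = [rng.gauss(0.0, 1.0) for _ in range(cols)]
    scale = math.sqrt(sum(x * x for x in v))
    v = [x / scale for x in v]  # start from a random unit vector
    lam = 0.0
    for _ in range(iters):
        u = [sum(W[i][j] * v[j] for j in range(cols)) for i in range(rows)]  # u = W v
        z = [sum(W[i][j] * u[i] for i in range(rows)) for j in range(cols)]  # z = W^T u
        lam = math.sqrt(sum(x * x for x in z))  # ||W^T W v|| for unit v
        if lam == 0.0:
            return 0.0
        v = [x / lam for x in z]
    return lam


def stable_rank(W):
    """Stable rank ||W||_F^2 / ||W||_2^2; always between 1 and rank(W)."""
    top = spectral_norm_sq(W)
    if top == 0.0:
        raise ValueError("stable rank is undefined for the zero matrix")
    return frobenius_norm_sq(W) / top
```

An identity matrix has stable rank equal to its dimension, while a rank-one matrix has stable rank 1, which is why the quantity is read as a soft count of effective parameters.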

A prior-based approximate latent Riemannian metric

Mar 09, 2021

Georgios Arvanitidis, Bogdan Georgiev, Bernhard Schölkopf

Abstract:Stochastic generative models enable us to capture the geometric structure of a data manifold lying in a high dimensional space through a Riemannian metric in the latent space. However, its practical use is rather limited mainly due to inevitable complexity. In this work we propose a surrogate conformal Riemannian metric in the latent space of a generative model that is simple, efficient and robust. This metric is based on a learnable prior that we propose to learn using a basic energy-based model. We theoretically analyze the behavior of the proposed metric and show that it is sensible to use in practice. We demonstrate experimentally the efficiency and robustness, as well as the behavior of the new approximate metric. Also, we show the applicability of the proposed methodology for data analysis in the life sciences.

Heating up decision boundaries: isocapacitory saturation, adversarial scenarios and generalization bounds

Jan 15, 2021

Bogdan Georgiev, Lukas Franken, Mayukh Mukherjee

Abstract:In the present work we study classifiers' decision boundaries via Brownian motion processes in ambient data space and associated probabilistic techniques. Intuitively, our ideas correspond to placing a heat source at the decision boundary and observing how effectively the sample points warm up. We are largely motivated by the search for a soft measure that sheds further light on the decision boundary's geometry. En route, we bridge aspects of potential theory and geometric analysis (Mazya, 2011, Grigoryan-Saloff-Coste, 2002) with active fields of ML research such as adversarial examples and generalization bounds. First, we focus on the geometric behavior of decision boundaries in the light of adversarial attack/defense mechanisms. Experimentally, we observe a certain capacitory trend over different adversarial defense strategies: decision boundaries locally become flatter as measured by isoperimetric inequalities (Ford et al, 2019); however, our more sensitive heat-diffusion metrics extend this analysis and further reveal that some non-trivial geometry invisible to plain distance-based methods is still preserved. Intuitively, we provide evidence that the decision boundaries nevertheless retain many persistent "wiggly and fuzzy" regions on a finer scale. Second, we show how Brownian hitting probabilities translate to soft generalization bounds which are in turn connected to compression and noise stability (Arora et al, 2018), and these bounds are significantly stronger if the decision boundary has controlled geometric features.

* Accepted as conference paper at ICLR 2021. 36 pages, 16 figures, comments welcome!
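
The Brownian hitting probabilities mentioned in this abstract can be illustrated with a small Monte Carlo experiment: start a random walk at a sample point and count how often it crosses the decision boundary before time `t`. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: the toy linear classifier, the function names, and the step and path counts are all hypothetical. For a hyperplane boundary at distance $d$, the reflection principle gives the exact value $\operatorname{erfc}\!\left(d / \sqrt{2t}\right)$, which serves as a sanity check.

```python
import math
import random


def hitting_probability(f, x0, t=1.0, steps=200, paths=2000, seed=0):
    """Monte Carlo estimate of P(a Brownian path from x0 crosses {f = 0} before time t)."""
    rng = random.Random(seed)
    sigma = math.sqrt(t / steps)  # standard deviation of each Brownian increment
    start_sign = f(x0) > 0
    hits = 0
    for _ in range(paths):
        x = list(x0)
        for _ in range(steps):
            x = [xi + sigma * rng.gauss(0.0, 1.0) for xi in x]
            if (f(x) > 0) != start_sign:  # predicted label flipped: boundary crossed
                hits += 1
                break
    return hits / paths


def halfspace_hitting_probability(distance, t=1.0):
    """Exact hitting probability for a hyperplane boundary (reflection principle)."""
    return math.erfc(distance / math.sqrt(2.0 * t))


def linear_classifier(x):
    """Hypothetical toy classifier whose decision boundary is the line x[0] = 0."""
    return x[0]
```

Calling `hitting_probability(linear_classifier, [0.5, 0.0])` should land close to `halfspace_hitting_probability(0.5)`, slightly below it because a discretized walk can miss brief crossings between steps. Points farther from the boundary "warm up" less, so their estimated probability is smaller.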

Generative Deep Learning Techniques for Password Generation

Dec 16, 2020

David Biesner, Kostadin Cvejoski, Bogdan Georgiev, Rafet Sifa, Erik Krupicka

Abstract:Password guessing approaches via deep learning have recently been investigated with significant breakthroughs in their ability to generate novel, realistic password candidates. In the present work we study a broad collection of deep learning and probabilistic models in the light of password guessing: attention-based deep neural networks, autoencoding mechanisms and generative adversarial networks. We provide novel generative deep-learning models in terms of variational autoencoders exhibiting state-of-the-art sampling performance, yielding additional latent-space features such as interpolations and targeted sampling. Lastly, we perform a thorough empirical analysis in a unified controlled framework over well-known datasets (RockYou, LinkedIn, Youku, Zomato, Pwnd). Our results not only identify the most promising schemes driven by deep neural networks, but also illustrate the strengths of each approach in terms of generation variability and sample uniqueness.

* 25 pages, 13 figures. Comments welcome!

Recurrent Point Review Models

Dec 10, 2020

Kostadin Cvejoski, Ramses J. Sanchez, Bogdan Georgiev, Christian Bauckhage, Cesar Ojeda

Abstract:Deep neural network models represent the state-of-the-art methodologies for natural language processing. Here we build on top of these methodologies to incorporate temporal information and model how review data changes with time. Specifically, we use the dynamic representations of recurrent point process models, which encode the history of how business or service reviews are received in time, to generate instantaneous language models with improved prediction capabilities. Simultaneously, our methodologies enhance the predictive power of our point process models by incorporating summarized review content representations. We provide recurrent network and temporal convolution solutions for modeling the review content. We deploy our methodologies in the context of recommender systems, effectively characterizing the change in preference and taste of users as time evolves. Source code is available at [1].

* 2020 International Joint Conference on Neural Networks (IJCNN), Glasgow, United Kingdom, 2020, pp. 1-8
* 8 pages, 6 figures

Neural Abstract Reasoner

Nov 12, 2020

Victor Kolev, Bogdan Georgiev, Svetlin Penkov

Abstract:Abstract reasoning and logic inference are difficult problems for neural networks, yet essential to their applicability in highly structured domains. In this work we demonstrate that a well known technique such as spectral regularization can significantly boost the capabilities of a neural learner. We introduce the Neural Abstract Reasoner (NAR), a memory augmented architecture capable of learning and using abstract rules. We show that, when trained with spectral regularization, NAR achieves $78.8\%$ accuracy on the Abstraction and Reasoning Corpus, improving performance 4 times over the best known human hand-crafted symbolic solvers. We provide some intuition for the effects of spectral regularization in the domain of abstract reasoning based on theoretical generalization bounds and Solomonoff's theory of inductive inference.

* 12 pages, 8 figures

Recurrent Point Processes for Dynamic Review Models

Jan 15, 2020

Kostadin Cvejoski, Ramses J. Sanchez, Bogdan Georgiev, Jannis Schuecker, Christian Bauckhage, Cesar Ojeda

Abstract:Recent progress in recommender system research has shown the importance of including temporal representations to improve interpretability and performance. Here, we incorporate temporal representations in continuous time via a recurrent point process for a dynamical model of reviews. Our goal is to characterize how changes in perception, user interest and seasonal effects affect review text.

* Presented at the AAAI 2020 Workshop on Interactive and Conversational Recommendation Systems
