Vasiliy A. Es'kin

Dynamical Systems Theory Behind a Hierarchical Reasoning Model

Mar 24, 2026

Vasiliy A. Es'kin, Mikhail E. Smorkalov

Abstract: Current large language models (LLMs) primarily rely on linear sequence generation and massive parameter counts, yet they severely struggle with complex algorithmic reasoning. While recent reasoning architectures, such as the Hierarchical Reasoning Model (HRM) and Tiny Recursive Model (TRM), demonstrate that compact recursive networks can tackle these tasks, their training dynamics often lack rigorous mathematical guarantees, leading to instability and representational collapse. We propose the Contraction Mapping Model (CMM), a novel architecture that reformulates discrete recursive reasoning into continuous Neural Ordinary and Stochastic Differential Equations (NODEs/NSDEs). By explicitly enforcing the convergence of the latent phase point to a stable equilibrium state and mitigating feature collapse with a hyperspherical repulsion loss, the CMM provides a mathematically grounded and highly stable reasoning engine. On the Sudoku-Extreme benchmark, a 5M-parameter CMM achieves a state-of-the-art accuracy of 93.7 %, outperforming the 27M-parameter HRM (55.0 %) and 5M-parameter TRM (87.4 %). Remarkably, even when aggressively compressed to an ultra-tiny footprint of just 0.26M parameters, the CMM retains robust predictive power, achieving 85.4 % on Sudoku-Extreme and 82.2 % on the Maze benchmark. These results establish a new frontier for extreme parameter efficiency, proving that mathematically rigorous latent dynamics can effectively replace brute-force scaling in artificial reasoning.
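
The idea of a latent state converging to a stable equilibrium can be illustrated with the classical fixed-point iteration for a contraction mapping. The sketch below is a toy scalar example, not the CMM architecture: the map `toy_map`, the helper `fixed_point`, and the tolerance are all assumptions chosen for illustration. Because the map has a Lipschitz constant of at most 0.5, the iteration should reach the same equilibrium from any starting point.

```python
import math


def fixed_point(f, z0, tol=1e-10, max_steps=1000):
    """Iterate z <- f(z) until two successive states differ by less than tol."""
    z = z0
    for step in range(1, max_steps + 1):
        z_next = f(z)
        if abs(z_next - z) < tol:
            return z_next, step
        z = z_next
    raise RuntimeError("fixed-point iteration did not converge")


def toy_map(z):
    """Hypothetical contraction: |d/dz (0.5 * cos z)| <= 0.5 < 1 everywhere."""
    return 0.5 * math.cos(z)


# Two very different initial states end at the same equilibrium.
z_a, steps_a = fixed_point(toy_map, 0.0)
z_b, steps_b = fixed_point(toy_map, 5.0)
print(z_a, z_b, steps_a, steps_b)
```

The stopping rule compares successive iterates, which for a contraction with factor q bounds the distance to the true fixed point by q / (1 - q) times that difference.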

Physics-Informed Neural Systems for the Simulation of EUV Electromagnetic Wave Diffraction from a Lithography Mask

Mar 17, 2026

Vasiliy A. Es'kin, Egor V. Ivanov

Abstract: Physics-informed neural networks (PINNs) and neural operators (NOs) are presented for solving the problem of diffraction of extreme ultraviolet (EUV) electromagnetic waves from contemporary lithography masks. A novel hybrid Waveguide Neural Operator (WGNO) is introduced, based on a waveguide method with its most computationally expensive components replaced by a neural network. To evaluate performance, the accuracy and inference time of PINNs and NOs are compared against modern numerical solvers for a series of problems with known exact solutions. Emphasis is placed on the solution accuracy achieved by the considered neural systems at wavelengths of 13.5 nm and 11.2 nm. Numerical experiments on realistic 2D and 3D masks demonstrate that PINNs and neural operators achieve competitive accuracy and significantly reduced prediction times, with the proposed WGNO architecture reaching state-of-the-art performance. The presented neural operator generalizes well: for unseen problem parameters it delivers a solution accuracy close to that for parameters seen in the training dataset. These results provide a highly efficient solution for accelerating the design and optimization workflows of next-generation lithography masks.

* arXiv admin note: substantial text overlap with arXiv:2507.04153

About rectified sigmoid function for enhancing the accuracy of Physics-Informed Neural Networks

Dec 30, 2024

Vasiliy A. Es'kin, Alexey O. Malkhanov, Mikhail E. Smorkalov

Abstract: The article studies neural networks with one hidden layer and a modified activation function for solving physical problems. A rectified sigmoid activation function is proposed for solving physical problems described by ODEs with neural networks. Algorithms for physics-informed data-driven initialization of a neural network and a neuron-by-neuron gradient-free fitting method are presented for neural networks with this activation function. Numerical experiments demonstrate that neural networks with a rectified sigmoid function are more accurate than neural networks with a sigmoid function on physical problems (harmonic oscillator, relativistic slingshot, and Lorenz system).

* 9 pages, 1 figure, 2 tables, 4 algorithms. arXiv admin note: substantial text overlap with arXiv:2412.19235

Are Two Hidden Layers Still Enough for the Physics-Informed Neural Networks?

Dec 26, 2024

Vasiliy A. Es'kin, Alexey O. Malkhanov, Mikhail E. Smorkalov

Abstract: The article discusses the development of methods and techniques for initializing and training neural networks with a single hidden layer, as well as for training a separable physics-informed neural network composed of single-hidden-layer networks, to solve physical problems described by ordinary differential equations (ODEs) and partial differential equations (PDEs). A method for strictly deterministic initialization of a neural network with one hidden layer for solving physical problems described by an ODE is proposed. Modifications to existing methods for weighting the loss function are given, along with new methods for training strictly deterministically initialized neural networks to solve ODEs (detaching, additional weighting based on the second derivative, predicted solution-based weighting, relative residuals). An algorithm for physics-informed data-driven initialization of a neural network with one hidden layer is proposed. A neural network with pronounced generalizing properties is presented, whose generalizing abilities can be precisely controlled by adjusting network parameters. A metric for measuring the generalization of such a neural network is introduced. A gradient-free neuron-by-neuron fitting method has been developed for adjusting the parameters of a single-hidden-layer neural network; it requires neither an optimizer nor a solver. The proposed methods have been extended to 2D problems using the separable physics-informed neural networks approach. Experiments on physical problems, such as solving various ODEs and PDEs, demonstrate that these methods for initializing and training neural networks with one or two hidden layers (SPINN) achieve competitive accuracy and, in some cases, state-of-the-art results.

* 45 pages, 36 figures, 9 tables
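
The separable approach mentioned in the abstract rests on representing a 2D field as a sum of products of one-dimensional functions, u(x, y) ≈ Σ_r f_r(x) g_r(y), so that a grid of n_x × n_y points needs only n_x + n_y evaluations per component. The sketch below illustrates that decomposition only; the helper `separable_grid` is hypothetical, and plain trigonometric functions stand in for the single-hidden-layer networks of the paper.

```python
import math


def separable_grid(fx, gy, xs, ys):
    """Evaluate u(x, y) = sum_r fx[r](x) * gy[r](y) on the grid xs x ys."""
    # Each 1D component is evaluated once per axis point, not once per grid point.
    feats_x = [[f(x) for f in fx] for x in xs]  # shape: len(xs) x rank
    feats_y = [[g(y) for g in gy] for y in ys]  # shape: len(ys) x rank
    return [
        [sum(a * b for a, b in zip(row_x, row_y)) for row_y in feats_y]
        for row_x in feats_x
    ]


# Stand-in components (assumed for illustration): rank 2, with
# sin(x) * cos(y) + cos(x) * sin(y) = sin(x + y).
fx = [math.sin, math.cos]
gy = [math.cos, math.sin]
xs = [0.0, 0.5, 1.0]
ys = [0.0, 0.25]

u = separable_grid(fx, gy, xs, ys)
print(u[1][1], math.sin(0.75))
```

The trigonometric identity gives an exact reference value, which makes the rank-2 decomposition easy to check by hand.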

Separable Physics-Informed Neural Networks for the solution of elasticity problems

Jan 24, 2024

Vasiliy A. Es'kin, Danil V. Davydov, Julia V. Gur'eva, Alexey O. Malkhanov, Mikhail E. Smorkalov

Abstract: A method for solving elasticity problems based on separable physics-informed neural networks (SPINN) in conjunction with the deep energy method (DEM) is presented. Numerical experiments on a number of problems show that this method has a significantly higher convergence rate and accuracy than vanilla physics-informed neural networks (PINNs) and even SPINN based on a system of partial differential equations (PDEs). In addition, using SPINN within the DEM framework makes it possible to solve problems of the linear theory of elasticity on complex geometries, which is unachievable with PINNs formulated in terms of partial differential equations. The problems considered are very close to industrial problems in terms of geometry, loading, and material parameters.

About optimal loss function for training physics-informed neural networks under respecting causality

Apr 05, 2023

Vasiliy A. Es'kin, Danil V. Davydov, Ekaterina D. Egorova, Alexey O. Malkhanov, Mikhail A. Akhukov, Mikhail E. Smorkalov

Abstract: A method is presented that reduces a problem described by differential equations with initial and boundary conditions to a problem described only by differential equations. The advantage of using the modified problem in the physics-informed neural networks (PINNs) methodology is that the loss function can be represented as a single term associated with the differential equations, eliminating the need to tune the scaling coefficients for the terms related to boundary and initial conditions. The weighted loss functions respecting causality are modified, and new weighted loss functions based on generalized functions are derived. Numerical experiments have been carried out for a number of problems, demonstrating the accuracy of the proposed methods.

* 25 pages, 7 figures, 6 tables
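
For context, a commonly used form of causality-respecting weighting scales the residual loss at each time step by the exponential of the negative accumulated loss at all earlier steps, so that later times contribute only once earlier times are well resolved. The sketch below shows that baseline form under stated assumptions, not the modified or generalized-function-based weights derived in the paper; the function names and the parameter `eps` are illustrative.

```python
import math


def causal_weights(residual_losses, eps=1.0):
    """Weight w_i = exp(-eps * sum of residual losses at earlier time steps)."""
    weights = []
    cumulative = 0.0
    for loss in residual_losses:
        weights.append(math.exp(-eps * cumulative))
        cumulative += loss
    return weights


def causal_loss(residual_losses, eps=1.0):
    """Mean of the per-step residual losses, each scaled by its causal weight."""
    weights = causal_weights(residual_losses, eps)
    total = sum(w * l for w, l in zip(weights, residual_losses))
    return total / len(residual_losses)


# Hypothetical per-time-step residual losses, ordered from early to late.
losses = [0.0, 2.0, 0.5]
w = causal_weights(losses, eps=1.0)
print(w, causal_loss(losses, eps=1.0))
```

A larger `eps` makes the weighting stricter: later time steps are suppressed more strongly until the earlier residuals become small.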
