Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Dean Price

Techno-economic optimization of a heat-pipe microreactor, part I: theory and cost optimization

Dec 17, 2025

Paul Seurin, Dean Price, Luis Nunez

Abstract:Microreactors, particularly heat-pipe microreactors (HPMRs), are compact, transportable, self-regulated power systems well-suited for access-challenged remote areas where costly fossil fuels dominate. However, they suffer from diseconomies of scale, and their financial viability remains unconvincing. One step in addressing this shortcoming is to design these reactors with comprehensive economic and physics analyses informing early-stage design iteration. In this work, we present a novel unifying geometric design optimization approach that accounts for techno-economic considerations. We start by generating random samples to train surrogate models, including Gaussian processes (GPs) and multi-layer perceptrons (MLPs). We then deploy these surrogates within a reinforcement learning (RL)-based optimization framework to optimize the levelized cost of electricity (LCOE), all the while imposing constraints on the fuel lifetime, shutdown margin (SDM), peak heat flux, and rod-integrated peaking factor. We study two cases: one in which the axial reflector cost is very high, and one in which it is inexpensive. We found that the operation and maintenance and capital costs are the primary contributors to the overall LCOE particularly the cost of the axial reflectors (for the first case) and the control drum materials. The optimizer cleverly changes the design parameters so as to minimize one of them while still satisfying the constraints, ultimately reducing the LCOE by more than 57% in both instances. A comprehensive integration of fuel and HP performance with multi-objective optimization is currently being pursued to fully understand the interaction between constraints and cost performance.

Via

Access Paper or Ask Questions

Multistep Criticality Search and Power Shaping in Microreactors with Reinforcement Learning

Jun 22, 2024

Majdi I. Radaideh, Leo Tunkle, Dean Price, Kamal Abdulraheem, Linyu Lin, Moutaz Elias

Figure 1 for Multistep Criticality Search and Power Shaping in Microreactors with Reinforcement Learning

Figure 2 for Multistep Criticality Search and Power Shaping in Microreactors with Reinforcement Learning

Figure 3 for Multistep Criticality Search and Power Shaping in Microreactors with Reinforcement Learning

Figure 4 for Multistep Criticality Search and Power Shaping in Microreactors with Reinforcement Learning

Abstract:Reducing operation and maintenance costs is a key objective for advanced reactors in general and microreactors in particular. To achieve this reduction, developing robust autonomous control algorithms is essential to ensure safe and autonomous reactor operation. Recently, artificial intelligence and machine learning algorithms, specifically reinforcement learning (RL) algorithms, have seen rapid increased application to control problems, such as plasma control in fusion tokamaks and building energy management. In this work, we introduce the use of RL for intelligent control in nuclear microreactors. The RL agent is trained using proximal policy optimization (PPO) and advantage actor-critic (A2C), cutting-edge deep RL techniques, based on a high-fidelity simulation of a microreactor design inspired by the Westinghouse eVinci\textsuperscript{TM} design. We utilized a Serpent model to generate data on drum positions, core criticality, and core power distribution for training a feedforward neural network surrogate model. This surrogate model was then used to guide a PPO and A2C control policies in determining the optimal drum position across various reactor burnup states, ensuring critical core conditions and symmetrical power distribution across all six core portions. The results demonstrate the excellent performance of PPO in identifying optimal drum positions, achieving a hextant power tilt ratio of approximately 1.002 (within the limit of $<$ 1.02) and maintaining criticality within a 10 pcm range. A2C did not provide as competitive of a performance as PPO in terms of performance metrics for all burnup steps considered in the cycle. Additionally, the results highlight the capability of well-trained RL control policies to quickly identify control actions, suggesting a promising approach for enabling real-time autonomous control through digital twins.

* 15 pages, 3 figures, and 2 tables

Via

Access Paper or Ask Questions