Emily Dolson

ECLIPSE: An Evolutionary Computation Library for Instrumentation Prototyping in Scientific Engineering

Jan 08, 2026

Max Foreback, Evan Imata, Vincent Ragusa, Jacob Weiler, Christina Shao, Joey Wagner, Katherine G. Skocelas, Jonathan Sy, Aman Hafez, Wolfgang Banzhaf (+10 more)

Abstract:Designing scientific instrumentation often requires exploring large, highly constrained design spaces using computationally expensive physics simulations. These simulators pose substantial challenges for integrating evolutionary computation (EC) into scientific design workflows. Evolutionary computation typically requires numerous design evaluations, making the integration of slow, low-throughput simulators particularly challenging, as they are optimized for accuracy and ease of use rather than throughput. We present ECLIPSE, an evolutionary computation framework built to interface directly with complex, domain-specific simulation tools while supporting flexible geometric and parametric representations of scientific hardware. ECLIPSE provides a modular architecture consisting of (1) Individuals, which encode hardware designs using domain-aware, physically constrained representations; (2) Evaluators, which prepare simulation inputs, invoke external simulators, and translate the simulator's outputs into fitness measures; and (3) Evolvers, which implement EC algorithms suitable for high-cost, limited-throughput environments. We demonstrate the utility of ECLIPSE across several active space-science applications, including evolved 3D antennas and spacecraft geometries optimized for drag reduction in very low Earth orbit. We further discuss the practical challenges encountered when coupling EC with scientific simulation workflows, including interoperability constraints, parallelization limits, and extreme evaluation costs, and outline ongoing efforts to combat these challenges. ECLIPSE enables interdisciplinary teams of physicists, engineers, and EC researchers to collaboratively explore unconventional designs for scientific hardware while leveraging existing domain-specific simulation software.
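
The three-part architecture described in this abstract can be illustrated with a minimal sketch. This is not the ECLIPSE API: the class names simply mirror the abstract's terms, and every method name, the bounds handling, the elitist loop, and `toy_simulator` are assumptions made for illustration. A real Evaluator would write simulator input files and invoke an external program rather than call a Python function.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Individual:
    """Hypothetical design encoding: parameters kept inside physical bounds."""
    params: List[float]
    lower: float = 0.0
    upper: float = 1.0

    def mutated(self, rng: random.Random, sigma: float = 0.1) -> "Individual":
        # Gaussian perturbation, clamped so the design never leaves its bounds.
        new = [min(self.upper, max(self.lower, p + rng.gauss(0.0, sigma)))
               for p in self.params]
        return Individual(new, self.lower, self.upper)


class Evaluator:
    """Wraps a simulator call and turns its output into a fitness value."""

    def __init__(self, simulate: Callable[[List[float]], float]):
        self.simulate = simulate
        self.calls = 0  # evaluations are the scarce resource, so count them

    def fitness(self, ind: Individual) -> float:
        self.calls += 1
        return self.simulate(ind.params)


class Evolver:
    """Simple elitist loop (an assumption), using few evaluations per generation."""

    def __init__(self, evaluator: Evaluator, offspring: int = 4, seed: int = 0):
        self.evaluator = evaluator
        self.offspring = offspring
        self.rng = random.Random(seed)

    def run(self, start: Individual, generations: int) -> Tuple[Individual, float]:
        best, best_fit = start, self.evaluator.fitness(start)
        for _ in range(generations):
            for _ in range(self.offspring):
                child = best.mutated(self.rng)
                fit = self.evaluator.fitness(child)
                if fit >= best_fit:  # keep the child only if it is no worse
                    best, best_fit = child, fit
        return best, best_fit


def toy_simulator(params: List[float]) -> float:
    # Stand-in for an expensive physics simulation; optimum at 0.5 per parameter.
    return -sum((p - 0.5) ** 2 for p in params)


if __name__ == "__main__":
    evaluator = Evaluator(toy_simulator)
    best, best_fit = Evolver(evaluator).run(Individual([0.0, 1.0, 0.2]), generations=20)
    print(best.params, best_fit, evaluator.calls)
```

Keeping the simulator behind a single `fitness` call means the search loop never needs to know how the simulation is prepared or run, which is the separation the abstract attributes to Evaluators.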


A Scalable Trie Building Algorithm for High-Throughput Phyloanalysis of Wafer-Scale Digital Evolution Experiments

Aug 20, 2025

Vivaan Singhvi, Joey Wagner, Emily Dolson, Luis Zaman, Matthew Andres Moreno

Abstract:Agent-based simulation platforms play a key role in enabling fast-to-run evolution experiments that can be precisely controlled and observed in detail. Availability of high-resolution snapshots of lineage ancestries from digital experiments, in particular, is key to investigations of evolvability and open-ended evolution, as well as in providing a validation testbed for bioinformatics method development. Ongoing advances in AI/ML hardware accelerator devices, such as the 850,000-processor Cerebras Wafer-Scale Engine (WSE), are poised to broaden the scope of evolutionary questions that can be investigated in silico. However, constraints in memory capacity and locality characteristic of these systems introduce difficulties in exhaustively tracking phylogenies at runtime. To overcome these challenges, recent work on hereditary stratigraphy algorithms has developed space-efficient genetic markers to facilitate fully decentralized estimation of relatedness among digital organisms. However, in existing work, compute time to reconstruct phylogenies from these genetic markers has proven a limiting factor in achieving large-scale phyloanalyses. Here, we detail an improved trie-building algorithm designed to produce reconstructions equivalent to existing approaches. For modestly-sized 10,000-tip trees, the proposed approach achieves a 300-fold speedup versus existing state-of-the-art. Finally, using 1 billion genome datasets drawn from WSE simulations encompassing 954 trillion replication events, we report a pair of large-scale phylogeny reconstruction trials, achieving end-to-end reconstruction times of 2.6 and 2.9 hours. In substantially improving reconstruction scaling and throughput, presented work establishes a key foundation to enable powerful high-throughput phyloanalysis techniques in large-scale digital evolution experiments.
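
For readers unfamiliar with the idea of reconstructing ancestry from inherited markers, the following is a naive sketch of trie building, not the improved algorithm the abstract reports. It assumes each genome carries an ordered sequence of integer markers and that two genomes share ancestry for as long as their marker sequences agree. The names `TrieNode`, `build_trie`, `shared_depth`, and `count_nodes`, and the sample data, are hypothetical.

```python
from typing import Dict, List, Sequence


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[int, "TrieNode"] = {}
        self.tips: List[str] = []  # genomes whose marker sequence ends here


def build_trie(markers: Dict[str, Sequence[int]]) -> TrieNode:
    """Insert each genome's marker sequence; shared prefixes share nodes."""
    root = TrieNode()
    for name, seq in markers.items():
        node = root
        for m in seq:
            node = node.children.setdefault(m, TrieNode())
        node.tips.append(name)
    return root


def shared_depth(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of leading markers two genomes have in common."""
    depth = 0
    for x, y in zip(a, b):
        if x != y:
            break
        depth += 1
    return depth


def count_nodes(node: TrieNode) -> int:
    return 1 + sum(count_nodes(child) for child in node.children.values())


if __name__ == "__main__":
    # Hypothetical marker sequences for three sampled genomes.
    sample = {"a": [7, 3, 9], "b": [7, 3, 4], "c": [7, 8, 1]}
    trie = build_trie(sample)
    print(count_nodes(trie), shared_depth(sample["a"], sample["b"]))
```

In this toy example, genomes `a` and `b` diverge later than either does from `c`, so the trie's branching order serves as an estimate of the tree's topology.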

* Accepted by ALIFE 2025


Extending a Phylogeny-based Method for Detecting Signatures of Multi-level Selection for Applications in Artificial Life

Aug 19, 2025

Matthew Andres Moreno, Sanaz Hasanzadeh Fard, Luis Zaman, Emily Dolson

Abstract:Multilevel selection occurs when short-term individual-level reproductive interests conflict with longer-term group-level fitness effects. Detecting and quantifying this phenomenon is key to understanding evolution of traits ranging from multicellularity to pathogen virulence. Multilevel selection is particularly important in artificial life research due to its connection to major evolutionary transitions, a hallmark of open-ended evolution. Bonetti Franceschi & Volz (2024) proposed to detect multilevel selection dynamics by screening for mutations that appear more often in a population than expected by chance (due to individual-level fitness benefits) but are ultimately associated with negative longer-term fitness outcomes (i.e., smaller, shorter-lived descendant clades). Here, we use agent-based modeling with known ground truth to assess the efficacy of this approach. To test these methods under challenging conditions broadly comparable to the original dataset explored by Bonetti Franceschi & Volz (2024), we use an epidemiological framework to model multilevel selection in trade-offs between within-host growth rate and between-host transmissibility. To achieve success on our in silico data, we develop an alternate normalization procedure for identifying clade-level fitness effects. We find the method to be sensitive in detecting genome sites under multilevel selection with 30% effect sizes on fitness, but do not see sensitivity to smaller 10% mutation effect sizes. To test the robustness of this methodology, we conduct additional experiments incorporating extrinsic, time-varying environmental changes and adaptive turnover in population compositions, and find that screen performance remains generally consistent with baseline conditions. This work represents a promising step towards rigorous generalizable quantification of multilevel selection effects.


The Robustness of Structural Features in Species Interaction Networks

Feb 24, 2025

Sanaz Hasanzadeh Fard, Emily Dolson

Abstract:Species interaction networks are a powerful tool for describing ecological communities; they typically contain nodes representing species, and edges representing interactions between those species. For the purposes of drawing abstract inferences about groups of similar networks, ecologists often use graph topology metrics to summarize structural features. However, gathering the data that underlies these networks is challenging, which can lead to some interactions being missed. Thus, it is important to understand how much different structural metrics are affected by missing data. To address this question, we analyzed a database of 148 real-world bipartite networks representing four different types of species interactions (pollination, host-parasite, plant-ant, and seed-dispersal). For each network, we measured six different topological properties: number of connected components, variance in node betweenness, variance in node PageRank, largest Eigenvalue, the number of non-zero Eigenvalues, and community detection as determined by four different algorithms. We then tested how these properties change as additional edges -- representing data that may have been missed -- are added to the networks. We found substantial variation in how robust different properties were to the missing data. For example, the Clauset-Newman-Moore and Louvain community detection algorithms showed much more gradual change as edges were added than the label propagation and Girvan-Newman algorithms did, suggesting that the former are more robust. Robustness also varied for some metrics based on interaction type. These results provide a foundation for selecting network properties to use when analyzing messy ecological network data.
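
The edge-addition test described in this abstract can be sketched for one of the six properties, the number of connected components. This is an illustrative reconstruction under assumptions, not the authors' code: the node numbering (left-side species first, then right-side species), the uniform sampling of absent edges, and the function names are all hypothetical.

```python
import random
from typing import List, Tuple

Edge = Tuple[int, int]


def count_components(n_nodes: int, edges: List[Edge]) -> int:
    """Count connected components with a union-find structure."""
    parent = list(range(n_nodes))

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    components = n_nodes
    for u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    return components


def add_missing_edges(n_left: int, n_right: int, edges: List[Edge],
                      k: int, rng: random.Random) -> List[Edge]:
    """Add up to k edges that are absent from a bipartite network.

    Left nodes are 0..n_left-1; right nodes are n_left..n_left+n_right-1.
    """
    present = set(edges)
    absent = [(u, n_left + v) for u in range(n_left) for v in range(n_right)
              if (u, n_left + v) not in present]
    return list(edges) + rng.sample(absent, min(k, len(absent)))


if __name__ == "__main__":
    rng = random.Random(0)
    observed = [(0, 3), (1, 4)]  # 3 left-side species, 2 right-side species
    for k in range(5):
        augmented = add_missing_edges(3, 2, observed, k, rng)
        print(k, count_components(5, augmented))
```

A metric whose value changes slowly as `k` grows would count as robust in the sense the abstract uses.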


A Guide to Tracking Phylogenies in Parallel and Distributed Agent-based Evolution Models

May 16, 2024

Matthew Andres Moreno, Anika Ranjan, Emily Dolson, Luis Zaman

Abstract: Computer simulations are an important tool for studying the mechanics of biological evolution. In particular, in silico work with agent-based models provides an opportunity to collect high-quality records of ancestry relationships among simulated agents. Such phylogenies can provide insight into evolutionary dynamics within these simulations. Existing work generally tracks lineages directly, yielding an exact phylogenetic record of evolutionary history. However, direct tracking can be inefficient for large-scale, many-processor evolutionary simulations. An alternate approach to extracting phylogenetic information from simulation that scales more favorably is post hoc estimation, akin to how bioinformaticians build phylogenies by assessing genetic similarities between organisms. Recently introduced "hereditary stratigraphy" algorithms provide means for efficient inference of phylogenetic history from non-coding annotations on simulated organisms' genomes. A number of options exist in configuring hereditary stratigraphy methodology, but no work has yet tested how they impact reconstruction quality. To address this question, we surveyed reconstruction accuracy under alternate configurations across a matrix of evolutionary conditions varying in selection pressure, spatial structure, and ecological dynamics. We synthesize results from these experiments to suggest a prescriptive system of best practices for work with hereditary stratigraphy, ultimately guiding researchers in choosing appropriate instrumentation for large-scale simulation studies.


Phylotrack: C++ and Python libraries for in silico phylogenetic tracking

May 15, 2024

Emily Dolson, Santiago Rodriguez-Papa, Matthew Andres Moreno

Abstract:In silico evolution instantiates the processes of heredity, variation, and differential reproductive success (the three "ingredients" for evolution by natural selection) within digital populations of computational agents. Consequently, these populations undergo evolution, and can be used as virtual model systems for studying evolutionary dynamics. This experimental paradigm -- used across biological modeling, artificial life, and evolutionary computation -- complements research done using in vitro and in vivo systems by enabling experiments that would be impossible in the lab or field. One key benefit is complete, exact observability. For example, it is possible to perfectly record all parent-child relationships across simulation history, yielding complete phylogenies (ancestry trees). This information reveals when traits were gained or lost, and also facilitates inference of underlying evolutionary dynamics. The Phylotrack project provides libraries for tracking and analyzing phylogenies in in silico evolution. The project is composed of 1) Phylotracklib: a header-only C++ library, developed under the umbrella of the Empirical project, and 2) Phylotrackpy: a Python wrapper around Phylotracklib, created with Pybind11. Both components supply a public-facing API to attach phylogenetic tracking to digital evolution systems, as well as a stand-alone interface for measuring a variety of popular phylogenetic topology metrics. Underlying design and C++ implementation prioritizes efficiency, allowing for fast generational turnover for agent populations numbering in the tens of thousands. Several explicit features (e.g., phylogeny pruning and abstraction, etc.) are provided for reducing the memory footprint of phylogenetic information.
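
The ideas of recording parent-child relationships and pruning to save memory can be sketched generically. The code below does not use the Phylotracklib or Phylotrackpy API. The class `PhylogenyTracker` and its methods are hypothetical, written only to show how a record of ancestry can discard extinct lineages that have no surviving descendants.

```python
from typing import Dict, List, Optional, Set


class PhylogenyTracker:
    """Generic sketch of direct lineage tracking with pruning (hypothetical API)."""

    def __init__(self) -> None:
        self.parent: Dict[int, Optional[int]] = {}  # id -> parent id (None for roots)
        self.offspring: Dict[int, int] = {}         # id -> children still in the tree
        self.alive: Set[int] = set()
        self._next_id = 0

    def add_organism(self, parent_id: Optional[int] = None) -> int:
        tid = self._next_id
        self._next_id += 1
        self.parent[tid] = parent_id
        self.offspring[tid] = 0
        self.alive.add(tid)
        if parent_id is not None:
            self.offspring[parent_id] += 1
        return tid

    def remove_organism(self, tid: int) -> None:
        self.alive.discard(tid)
        node: Optional[int] = tid
        # Walk upward, deleting nodes that are dead and have no remaining children.
        while (node is not None and node not in self.alive
               and self.offspring[node] == 0):
            par = self.parent.pop(node)
            del self.offspring[node]
            if par is not None:
                self.offspring[par] -= 1
            node = par

    def lineage(self, tid: int) -> List[int]:
        """Return ids from the given organism back to its root ancestor."""
        out = []
        node: Optional[int] = tid
        while node is not None:
            out.append(node)
            node = self.parent[node]
        return out


if __name__ == "__main__":
    tracker = PhylogenyTracker()
    root = tracker.add_organism()
    a = tracker.add_organism(root)
    b = tracker.add_organism(root)
    c = tracker.add_organism(a)
    tracker.remove_organism(a)  # kept: still has living descendant c
    tracker.remove_organism(c)  # c and then a are pruned
    print(sorted(tracker.parent), tracker.lineage(b))
```

With pruning, memory use tracks the ancestry of the living population rather than the full history of every organism that ever existed.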


Ecology, Spatial Structure, and Selection Pressure Induce Strong Signatures in Phylogenetic Structure

May 12, 2024

Matthew Andres Moreno, Santiago Rodriguez-Papa, Emily Dolson

Abstract:Evolutionary dynamics are shaped by a variety of fundamental, generic drivers, including spatial structure, ecology, and selection pressure. These drivers impact the trajectory of evolution, and have been hypothesized to influence phylogenetic structure. Here, we set out to assess (1) if spatial structure, ecology, and selection pressure leave detectable signatures in phylogenetic structure, (2) the extent, in particular, to which ecology can be detected and discerned in the presence of spatial structure, and (3) the extent to which these phylogenetic signatures generalize across evolutionary systems. To this end, we analyze phylogenies generated by manipulating spatial structure, ecology, and selection pressure within three computational models of varied scope and sophistication. We find that selection pressure, spatial structure, and ecology have characteristic effects on phylogenetic metrics, although these effects are complex and not always intuitive. Signatures have some consistency across systems when using equivalent taxonomic unit definitions (e.g., individual, genotype, species). Further, we find that sufficiently strong ecology can be detected in the presence of spatial structure. We also find that, while low-resolution phylogenetic reconstructions can bias some phylogenetic metrics, high-resolution reconstructions recapitulate them faithfully. Although our results suggest potential for evolutionary inference of spatial structure, ecology, and selection pressure through phylogenetic analysis, further methods development is needed to distinguish these drivers' phylometric signatures from each other and to appropriately normalize phylogenetic metrics. With such work, phylogenetic analysis could provide a versatile toolkit to study large-scale evolving populations.


Trackable Island-model Genetic Algorithms at Wafer Scale

May 06, 2024

Matthew Andres Moreno, Connor Yang, Emily Dolson, Luis Zaman

Abstract:Emerging ML/AI hardware accelerators, like the 850,000 processor Cerebras Wafer-Scale Engine (WSE), hold great promise to scale up the capabilities of evolutionary computation. However, challenges remain in maintaining visibility into underlying evolutionary processes while efficiently utilizing these platforms' large processor counts. Here, we focus on the problem of extracting phylogenetic information from digital evolution on the WSE platform. We present a tracking-enabled asynchronous island-based genetic algorithm (GA) framework for WSE hardware. Emulated and on-hardware GA benchmarks with a simple tracking-enabled agent model clock upwards of 1 million generations a minute for population sizes reaching 16 million. This pace enables quadrillions of evaluations a day. We validate phylogenetic reconstructions from these trials and demonstrate their suitability for inference of underlying evolutionary conditions. In particular, we demonstrate extraction of clear phylometric signals that differentiate wafer-scale runs with adaptive dynamics enabled versus disabled. Together, these benchmark and validation trials reflect strong potential for highly scalable evolutionary computation that is both efficient and observable. Kernel code implementing the island-model GA supports drop-in customization to support any fixed-length genome content and fitness criteria, allowing it to be leveraged to advance research interests across the community.

* arXiv admin note: substantial text overlap with arXiv:2404.10861


Trackable Agent-based Evolution Models at Wafer Scale

Apr 16, 2024

Matthew Andres Moreno, Connor Yang, Emily Dolson, Luis Zaman

Abstract:Continuing improvements in computing hardware are poised to transform capabilities for in silico modeling of cross-scale phenomena underlying major open questions in evolutionary biology and artificial life, such as transitions in individuality, eco-evolutionary dynamics, and rare evolutionary events. Emerging ML/AI-oriented hardware accelerators, like the 850,000 processor Cerebras Wafer Scale Engine (WSE), hold particular promise. However, practical challenges remain in conducting informative evolution experiments that efficiently utilize these platforms' large processor counts. Here, we focus on the problem of extracting phylogenetic information from agent-based evolution on the WSE platform. This goal drove significant refinements to decentralized in silico phylogenetic tracking, reported here. These improvements yield order-of-magnitude performance improvements. We also present an asynchronous island-based genetic algorithm (GA) framework for WSE hardware. Emulated and on-hardware GA benchmarks with a simple tracking-enabled agent model clock upwards of 1 million generations a minute for population sizes reaching 16 million agents. We validate phylogenetic reconstructions from these trials and demonstrate their suitability for inference of underlying evolutionary conditions. In particular, we demonstrate extraction, from wafer-scale simulation, of clear phylometric signals that differentiate runs with adaptive dynamics enabled versus disabled. Together, these benchmark and validation trials reflect strong potential for highly scalable agent-based evolution simulation that is both efficient and observable. Developed capabilities will bring entirely new classes of previously intractable research questions within reach, benefiting further explorations within the evolutionary biology and artificial life communities across a variety of emerging high-performance computing platforms.


On the Robustness of Lexicase Selection to Contradictory Objectives

Mar 11, 2024

Shakiba Shahbandegan, Emily Dolson

Abstract:Lexicase and epsilon-lexicase selection are state of the art parent selection techniques for problems featuring multiple selection criteria. Originally, lexicase selection was developed for cases where these selection criteria are unlikely to be in conflict with each other, but preliminary work suggests it is also a highly effective many-objective optimization algorithm. However, to predict whether these results generalize, we must understand lexicase selection's performance on contradictory objectives. Prior work has shown mixed results on this question. Here, we develop theory identifying circumstances under which lexicase selection will succeed or fail to find a Pareto-optimal solution. To make this analysis tractable, we restrict our investigation to a theoretical problem with maximally contradictory objectives. Ultimately, we find that lexicase and epsilon-lexicase selection each have a region of parameter space where they are incapable of optimizing contradictory objectives. Outside of this region, however, they perform well despite the presence of contradictory objectives. Based on these findings, we propose theoretically-backed guidelines for parameter choice. Additionally, we identify other properties that may affect whether a many-objective optimization problem is a good fit for lexicase or epsilon-lexicase selection.
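
As background for this abstract, here is a minimal sketch of lexicase parent selection as it is commonly described: shuffle the selection criteria, then repeatedly keep only the candidates that are best on the next criterion. The fixed `epsilon` tolerance is a simplifying assumption (epsilon-lexicase variants often derive the tolerance from the population's scores), and the function name and example scores are hypothetical. Higher scores are treated as better.

```python
import random
from typing import List, Sequence


def lexicase_select(scores: Sequence[Sequence[float]], rng: random.Random,
                    epsilon: float = 0.0) -> int:
    """Return the index of one selected parent.

    scores[i][j] is individual i's score on criterion j (higher is better).
    With epsilon > 0, candidates within epsilon of the best survive each filter.
    """
    candidates: List[int] = list(range(len(scores)))
    cases = list(range(len(scores[0])))
    rng.shuffle(cases)  # a fresh random ordering of criteria per selection event
    for case in cases:
        best = max(scores[i][case] for i in candidates)
        candidates = [i for i in candidates if scores[i][case] >= best - epsilon]
        if len(candidates) == 1:
            break
    return rng.choice(candidates)


if __name__ == "__main__":
    rng = random.Random(1)
    # Two contradictory objectives: two specialists and one compromise solution.
    population = [[1.0, 0.0], [0.0, 1.0], [0.4, 0.4]]
    picks = [lexicase_select(population, rng) for _ in range(20)]
    print(picks)
```

In this example the compromise individual is never the best on whichever criterion comes first, so with `epsilon=0.0` it is never selected. This illustrates why performance on contradictory objectives depends on parameter choice.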
