Jeffrey L. Krichmar

Policy Distillation with Selective Input Gradient Regularization for Efficient Interpretability

May 18, 2022
Jinwei Xing, Takashi Nagata, Xinyun Zou, Emre Neftci, Jeffrey L. Krichmar

Although deep Reinforcement Learning (RL) has proven successful in a wide range of tasks, interpretability remains a challenge when it is applied to real-world problems. Saliency maps are frequently used to interpret deep neural networks, but in the RL domain existing saliency map approaches are either too computationally expensive to meet the real-time requirements of real-world scenarios or fail to produce interpretable saliency maps for RL policies. In this work, we propose Distillation with selective Input Gradient Regularization (DIGR), which uses policy distillation and input gradient regularization to produce new policies that achieve both high interpretability and computational efficiency in generating saliency maps. We also find that our approach improves the robustness of RL policies to multiple adversarial attacks. We conduct experiments on three tasks, MiniGrid (Fetch Object), Atari (Breakout) and CARLA Autonomous Driving, to demonstrate the importance and effectiveness of our approach.
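
A minimal sketch of how such a distillation objective might be written in PyTorch. The mask source, loss weighting, and function names are assumptions for illustration, not the paper's implementation: the student matches the teacher's action distribution while its input gradients are suppressed outside task-relevant regions, so the surviving gradients double as a one-backward-pass saliency map.

```python
# Hypothetical sketch of selective input gradient regularization during
# policy distillation; not the paper's code.
import torch
import torch.nn.functional as F

def digr_loss(student, teacher, obs, mask, reg_coeff=1.0):
    """obs: batch of observations; mask: 1 on task-relevant pixels, 0 elsewhere."""
    obs = obs.clone().requires_grad_(True)

    # Distillation term: match the teacher's action distribution.
    with torch.no_grad():
        teacher_logits = teacher(obs)
    student_logits = student(obs)
    distill = F.kl_div(F.log_softmax(student_logits, dim=-1),
                       F.softmax(teacher_logits, dim=-1),
                       reduction="batchmean")

    # Input gradients of the student's output with respect to the observation.
    grads = torch.autograd.grad(student_logits.sum(), obs, create_graph=True)[0]

    # Selective regularization: penalize gradients only on irrelevant pixels,
    # so the remaining gradients form a cheap, interpretable saliency map.
    reg = (grads * (1 - mask)).pow(2).mean()
    return distill + reg_coeff * reg
```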

Edelman's Steps Toward a Conscious Artifact

May 25, 2021
Jeffrey L. Krichmar

In 2006, during a meeting of a working group of scientists at The Neurosciences Institute (NSI) in La Jolla, California, Gerald Edelman described a roadmap towards the creation of a Conscious Artifact. As far as I know, this roadmap was never published. However, it shaped my thinking, and that of many others, in the years since that meeting. This short paper, based on my notes taken during the meeting, describes the key steps of the roadmap. I believe it is as groundbreaking today as it was more than 15 years ago.

* 7 pages, 1 figure, 1 table 

Dynamic Reliability Management in Neuromorphic Computing

May 05, 2021
Shihao Song, Jui Hanamshet, Adarsha Balaji, Anup Das, Jeffrey L. Krichmar, Nikil D. Dutt, Nagarajan Kandasamy, Francky Catthoor

Neuromorphic computing systems use non-volatile memory (NVM) to implement high-density and low-energy synaptic storage. The elevated voltages and currents needed to operate NVMs age the CMOS-based transistors in each neuron and synapse circuit in the hardware, drifting the transistors' parameters from their nominal values. Aggressive device scaling increases power density and temperature, which accelerates this aging and challenges the reliable operation of neuromorphic systems. Existing reliability-oriented techniques periodically de-stress all neuron and synapse circuits in the hardware at fixed intervals, assuming worst-case operating conditions, without actually tracking their aging at run time. To de-stress these circuits, normal operation must be interrupted, which introduces latency in spike generation and propagation, impacting the inter-spike interval and hence performance, e.g., accuracy. We propose a new architectural technique to mitigate aging-related reliability problems in neuromorphic systems by designing an intelligent run-time manager (NCRTM), which dynamically de-stresses neuron and synapse circuits in response to the short-term aging of their CMOS transistors during the execution of machine learning workloads, with the objective of meeting a reliability target. NCRTM de-stresses these circuits only when it is absolutely necessary to do so, and otherwise reduces the performance impact by scheduling de-stress operations off the critical path. We evaluate NCRTM with state-of-the-art machine learning workloads on neuromorphic hardware. Our results demonstrate that NCRTM significantly improves the reliability of neuromorphic hardware, with marginal impact on performance.
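
To make the scheduling idea concrete, here is a toy sketch of such a run-time manager in Python. The stress proxy, budget, and method names are hypothetical; a real manager would use a calibrated aging model and hardware hooks.

```python
# Hypothetical sketch in the spirit of NCRTM; not the paper's implementation.
class RuntimeReliabilityManager:
    """De-stress circuits only when estimated aging exceeds a budget,
    preferring idle (off-critical-path) windows."""

    def __init__(self, circuits, stress_budget):
        self.stress = {c: 0.0 for c in circuits}  # short-term aging estimates
        self.budget = stress_budget
        self.pending = []  # de-stress operations deferred to idle windows

    def record_activity(self, circuit, voltage, duration):
        # Accumulate a simple voltage-time stress proxy for the circuit's
        # CMOS transistors (stand-in for a calibrated aging model).
        self.stress[circuit] += voltage * duration

    def step(self, critical_circuits):
        for circuit, stress in self.stress.items():
            if stress < self.budget:
                continue                      # still within the reliability target
            if circuit in critical_circuits:
                self._destress(circuit)       # must interrupt: reliability first
            else:
                self.pending.append(circuit)  # defer to avoid delaying spikes

    def idle_window(self):
        # Drain deferred de-stress operations when the workload allows.
        while self.pending:
            self._destress(self.pending.pop())

    def _destress(self, circuit):
        self.stress[circuit] = 0.0  # de-stressing restores nominal parameters
```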

* Accepted in ACM JETC 

Neuroevolution of a Recurrent Neural Network for Spatial and Working Memory in a Simulated Robotic Environment

Feb 25, 2021
Xinyun Zou, Eric O. Scott, Alexander B. Johnson, Kexin Chen, Douglas A. Nitz, Kenneth A. De Jong, Jeffrey L. Krichmar

Animals ranging from rats to humans can demonstrate cognitive map capabilities. We evolved the weights of a biologically plausible recurrent neural network (RNN) using an evolutionary algorithm to replicate the behavior and neural activity observed in rats during a spatial and working memory task in a triple T-maze. The rat was simulated in the Webots robot simulator and used vision, distance, and accelerometer sensors to navigate a virtual maze. After evolving the weights from the sensory inputs to the RNN, within the RNN, and from the RNN to the robot's motors, the Webots agent successfully navigated the space to reach all four reward arms with minimal repeats before time-out. Our findings suggest that the RNN dynamics are key to performance and that performance does not depend on any one sensory type, which suggests that neurons in the RNN exhibit mixed selectivity and conjunctive coding. Moreover, the RNN activity resembles the spatial and trajectory-dependent coding observed in the hippocampus. Collectively, the evolved RNN exhibits navigation skills, spatial memory, and working memory. Our method demonstrates how the dynamic activity in evolved RNNs can capture interesting and complex cognitive behavior and may be used to create RNN controllers for robotic applications.
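
A bare-bones sketch of the weight-evolution loop, assuming numpy. Population size, mutation rate, network dimensions, and the stand-in fitness are placeholders; in the paper, fitness comes from simulated Webots maze trials.

```python
import numpy as np

# Toy (mu, lambda)-style evolutionary loop over flattened RNN weights.
rng = np.random.default_rng(0)
N_PARAMS = 12 * 50 + 50 * 50 + 50 * 2   # sensors->RNN, RNN->RNN, RNN->motors
POP, ELITE, SIGMA, GENERATIONS = 64, 8, 0.1, 100

def fitness(genome):
    # Stand-in for one simulated maze trial, which would score reward arms
    # reached with minimal repeats before time-out. Here, a dummy objective.
    return -float(np.sum(genome ** 2))

pop = rng.normal(0.0, 1.0, size=(POP, N_PARAMS))
for gen in range(GENERATIONS):
    scores = np.array([fitness(g) for g in pop])
    elites = pop[np.argsort(scores)[-ELITE:]]               # keep best genomes
    parents = elites[rng.integers(0, ELITE, size=POP)]      # resample parents
    pop = parents + SIGMA * rng.standard_normal(pop.shape)  # Gaussian mutation
```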

Domain Adaptation In Reinforcement Learning Via Latent Unified State Representation

Feb 10, 2021
Jinwei Xing, Takashi Nagata, Kexin Chen, Xinyun Zou, Emre Neftci, Jeffrey L. Krichmar

Despite the recent success of deep reinforcement learning (RL), domain adaptation remains an open problem. Although the generalization ability of RL agents is critical for the real-world applicability of deep RL, zero-shot policy transfer is still challenging, since even minor visual changes can make a trained agent fail completely in a new task. To address this issue, we propose a two-stage RL agent that learns a latent unified state representation (LUSR), consistent across multiple domains, in the first stage, and then performs RL training in one source domain based on LUSR in the second stage. The cross-domain consistency of LUSR allows the policy acquired in the source domain to generalize to other target domains without extra training. We first demonstrate our approach on variants of the CarRacing game with customized manipulations, and then verify it in CARLA, an autonomous driving simulator with more complex and realistic visual observations. Our results show that this approach achieves state-of-the-art domain adaptation performance on related RL tasks and outperforms prior approaches based on latent representations and image-to-image translation.
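
A conceptual sketch of the two-stage setup, assuming PyTorch. Splitting the latent code into a domain-general part (shared across domains) and a domain-specific part follows the paper's description; the module names, sizes, and training losses are placeholders.

```python
import torch
import torch.nn as nn

class LUSREncoder(nn.Module):
    """Hypothetical encoder that separates what is common across domains
    from per-domain appearance."""

    def __init__(self, obs_channels=3, general_dim=32, specific_dim=8):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(obs_channels, 32, 4, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        self.general = nn.LazyLinear(general_dim)    # consistent across domains
        self.specific = nn.LazyLinear(specific_dim)  # per-domain appearance

    def forward(self, obs):
        h = self.backbone(obs)
        return self.general(h), self.specific(h)

# Stage 1: train the encoder on observations from all domains so the general
# code is consistent across them (e.g., with reconstruction and cross-domain
# consistency losses). Stage 2: freeze it and train any RL algorithm on the
# general code only.
encoder = LUSREncoder()
obs = torch.randn(1, 3, 64, 64)
z_general, z_specific = encoder(obs)
policy_input = z_general.detach()  # the policy never sees domain-specific bits
```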

* Accepted by AAAI 2021 

PyCARL: A PyNN Interface for Hardware-Software Co-Simulation of Spiking Neural Network

Mar 21, 2020
Adarsha Balaji, Prathyusha Adiraju, Hirak J. Kashyap, Anup Das, Jeffrey L. Krichmar, Nikil D. Dutt, Francky Catthoor

We present PyCARL, a PyNN-based common Python programming interface for hardware-software co-simulation of spiking neural networks (SNNs). Through PyCARL, we make two key contributions. First, we provide an interface from PyNN to CARLsim, a computationally efficient, GPU-accelerated, and biophysically detailed SNN simulator. PyCARL facilitates joint development of machine learning models and code sharing between CARLsim and PyNN users, promoting an integrated and larger neuromorphic community. Second, we integrate cycle-accurate models of state-of-the-art neuromorphic hardware such as TrueNorth, Loihi, and DynapSE into PyCARL, to accurately model hardware latencies that delay spikes between communicating neurons and degrade performance. PyCARL allows users to analyze and optimize the performance difference between software-only simulation and hardware-software co-simulation of their machine learning models. We show that system designers can also use PyCARL to perform design-space exploration early in the product development stage, facilitating faster time-to-deployment of neuromorphic products. We evaluate the memory usage and simulation time of PyCARL using functionality tests, synthetic SNNs, and realistic applications. Our results demonstrate that for large SNNs, PyCARL does not introduce any significant overhead compared to CARLsim. We also use PyCARL to analyze these SNNs on state-of-the-art neuromorphic hardware and demonstrate a significant performance deviation from software-only simulations. PyCARL allows users to evaluate and minimize such differences early during model development.
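
For readers unfamiliar with PyNN, here is a minimal sketch of what a co-simulation script might look like. The Population/Projection/run calls are standard PyNN API; the backend module name pyNN.carlsim and the parameter values are assumptions for illustration.

```python
import pyNN.carlsim as sim  # assumed name for a CARLsim backend of PyNN

sim.setup(timestep=0.1)  # ms

# A Poisson source driving a population of leaky integrate-and-fire neurons.
pre = sim.Population(100, sim.SpikeSourcePoisson(rate=20.0))
post = sim.Population(100, sim.IF_curr_exp())
sim.Projection(pre, post, sim.AllToAllConnector(),
               synapse_type=sim.StaticSynapse(weight=0.05, delay=1.0))

post.record("spikes")
sim.run(1000.0)  # simulate 1 s; hardware latency models would perturb delays
data = post.get_data().segments[0]
sim.end()
```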

* 10 pages, 25 figures. Accepted for publication at International Joint Conference on Neural Networks (IJCNN) 2020 

Deep Reinforcement Learning with Modulated Hebbian plus Q Network Architecture

Sep 21, 2019
Pawel Ladosz, Eseoghene Ben-Iwhiwhu, Yang Hu, Nicholas Ketz, Soheil Kolouri, Jeffrey L. Krichmar, Praveen Pilly, Andrea Soltoggio

This paper introduces the modulated Hebbian plus Q network architecture (MOHQA) for solving challenging deep reinforcement learning problems in partially observable Markov decision processes (POMDPs) with sparse rewards and confounding observations. The proposed architecture combines a deep Q-network (DQN) and a modulated Hebbian network with neural eligibility traces (MOHN). Bio-inspired neural traces are used to bridge temporal delays between actions and rewards. The purpose is to discover distal cause-effect relationships where confounding observations and sparse rewards cause standard RL algorithms to fail. Each of the two modules (DQN and MOHN) is responsible for a different aspect of learning: the DQN learns low-level features and control, while the MOHN contributes to high-level decisions by bridging rewards with past actions. The strength of the approach is to support a standard DQN framework when temporal difference errors are difficult to compute due to non-observable states. The system is tested on a set of generalized decision-making problems encoded as decision-tree graphs that deliver delayed rewards after key decision points and confounding observations. The simulations show that the proposed approach helps solve problems that are currently challenging for state-of-the-art deep reinforcement learning algorithms.
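
A toy sketch of the core MOHN mechanism, a reward-modulated Hebbian update with eligibility traces, assuming numpy. The shapes, learning rate, and decay are illustrative, not the paper's values.

```python
import numpy as np

class ModulatedHebbianLayer:
    """Illustrative reward-modulated Hebbian layer with eligibility traces."""

    def __init__(self, n_in, n_out, lr=0.01, trace_decay=0.9, seed=0):
        rng = np.random.default_rng(seed)
        self.w = 0.1 * rng.standard_normal((n_in, n_out))
        self.trace = np.zeros_like(self.w)  # neural eligibility traces
        self.lr, self.decay = lr, trace_decay

    def forward(self, x):
        y = np.tanh(x @ self.w)
        # Decay old traces and add the current pre-post coincidence, so
        # recent activity stays "eligible" for a later reward.
        self.trace = self.decay * self.trace + np.outer(x, y)
        return y

    def modulate(self, reward):
        # A (possibly delayed) reward converts eligibility into weight change,
        # bridging the temporal gap between actions and outcomes.
        self.w += self.lr * reward * self.trace
```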

Neuromodulated Patience for Robot and Self-Driving Vehicle Navigation

Sep 14, 2019
Jinwei Xing, Xinyun Zou, Jeffrey L. Krichmar

Robots and self-driving vehicles face a number of challenges when navigating through real environments. Successful navigation in dynamic environments requires prioritizing subtasks and monitoring resources. Animals are under similar constraints. It has been shown that the neuromodulator serotonin regulates impulsiveness and patience in animals. In the present paper, we take inspiration from the serotonergic system and apply it to the task of robot navigation. In a set of outdoor experiments, we show how changing the level of patience affects the amount of time the robot will spend searching for a desired location. To navigate GPS-compromised environments, we introduce a deep reinforcement learning paradigm in which the robot learns to follow sidewalks. Patience may further regulate the trade-off between a smooth, longer route and a rough, shorter one. Using patience as a parameter may be beneficial for autonomous systems under time pressure.
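
As a toy illustration of the idea, a patience parameter can gate how long a search behavior persists before the robot gives up. Everything below is a hypothetical sketch, not the paper's controller.

```python
def search_with_patience(try_step, patience, max_steps=1000):
    """Keep searching while the willingness to continue stays positive.
    A higher `patience` (serotonin-like level) tolerates longer searches.

    try_step() returns (found, estimated_value) for one search step.
    """
    for t in range(max_steps):
        found, value = try_step()
        if found:
            return True
        # Impatience grows with elapsed time; patience scales how slowly the
        # willingness to continue decays.
        willingness = value - t / patience
        if willingness <= 0:
            return False  # give up, e.g., fall back to the rough shortcut
    return False
```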

* 10 pages, 9 figures 

Mapping Spiking Neural Networks to Neuromorphic Hardware

Sep 04, 2019
Adarsha Balaji, Anup Das, Yuefeng Wu, Khanh Huynh, Francesco Dell'Anna, Giacomo Indiveri, Jeffrey L. Krichmar, Nikil Dutt, Siebren Schaafsma, Francky Catthoor

Neuromorphic hardware platforms implement biological neurons and synapses to execute spiking neural networks (SNNs) in an energy-efficient manner. We present SpiNeMap, a design methodology to map SNNs to crossbar-based neuromorphic hardware, minimizing spike latency and energy consumption. SpiNeMap operates in two steps: SpiNeCluster and SpiNePlacer. SpiNeCluster is a heuristic-based clustering technique that partitions an SNN into clusters of synapses, where intra-cluster local synapses are mapped within crossbars of the hardware and inter-cluster global synapses are mapped to the shared interconnect. SpiNeCluster minimizes the number of spikes on global synapses, which reduces spike congestion on the shared interconnect and improves application performance. SpiNePlacer then finds the best placement of local and global synapses on the hardware using a meta-heuristic approach to minimize energy consumption and spike latency. We evaluate SpiNeMap using synthetic and realistic SNNs on the DynapSE neuromorphic hardware. We show that SpiNeMap reduces average energy consumption by 45% and average spike latency by 21%, compared to state-of-the-art techniques.
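
To illustrate the clustering objective, here is a toy greedy variant of the SpiNeCluster idea, assuming numpy: pack neurons into fixed-capacity clusters so that heavily used synapses become cheap intra-cluster connections. The spike-count matrix, capacity, and greedy rule are placeholders for the paper's heuristic.

```python
import numpy as np

def greedy_cluster(spikes, capacity):
    """spikes[i, j] = spike count on synapse i -> j; returns neuron clusters."""
    n = spikes.shape[0]
    unassigned = set(range(n))
    clusters = []
    while unassigned:
        cluster = [unassigned.pop()]
        while len(cluster) < capacity and unassigned:
            # Pick the neuron exchanging the most spikes with the cluster,
            # turning those heavy synapses into local (intra-cluster) ones
            # and keeping spikes off the shared interconnect.
            best = max(unassigned,
                       key=lambda j: spikes[cluster, j].sum()
                                     + spikes[j, cluster].sum())
            cluster.append(best)
            unassigned.remove(best)
        clusters.append(cluster)
    return clusters
```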

* 14 pages, 14 images, 69 references, Accepted in IEEE Transactions on Very Large Scale Integration (VLSI) Systems 