



Abstract: The ability of a brain or a neural network to learn efficiently depends crucially on both the task structure and the learning rule. Previous works have analyzed the dynamical equations describing learning in the relatively simplified context of the perceptron under assumptions of a student-teacher framework or a linearized output. While these assumptions have facilitated theoretical understanding, they have precluded a detailed understanding of the roles of the nonlinearity and input-data distribution in determining the learning dynamics, limiting the applicability of the theories to real biological or artificial neural networks. Here, we use a stochastic-process approach to derive flow equations describing learning, applying this framework to the case of a nonlinear perceptron performing binary classification. We characterize the effects of the learning rule (supervised or reinforcement learning, SL/RL) and of the input-data distribution on the perceptron's learning curve and on the forgetting curve as subsequent tasks are learned. In particular, we find that input-data noise affects the learning speed differently under SL and RL and also determines how quickly learning of a task is overwritten by subsequent learning. Additionally, we verify our approach with real data using the MNIST dataset. This approach points a way toward analyzing learning dynamics for more complex circuit architectures.
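To make the setup concrete, here is a minimal sketch (an illustration, not the paper's derivation) contrasting a supervised gradient update with a REINFORCE-style reinforcement update for a nonlinear (sigmoidal) perceptron on noisy binary-classification data; the Gaussian input model, the noise level `sigma`, and the learning rate `eta` are placeholder choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def perceptron_output(w, x):
    """Nonlinear perceptron: probability of class 1 via a sigmoid readout."""
    return 1.0 / (1.0 + np.exp(-w @ x))

def sl_update(w, x, label, eta):
    """Supervised update: follows the gradient of the cross-entropy loss."""
    p = perceptron_output(w, x)
    return w + eta * (label - p) * x

def rl_update(w, x, label, eta):
    """REINFORCE-style update: sample a binary action, receive a scalar
    reward, and reinforce the log-probability of the chosen action."""
    p = perceptron_output(w, x)
    action = rng.random() < p                    # stochastic binary decision
    reward = 1.0 if action == bool(label) else -1.0
    grad_logp = (action - p) * x                 # d/dw log pi(action | x)
    return w + eta * reward * grad_logp

# Toy task: noisy Gaussian inputs around +/- mu, label = identity of the cluster.
d, eta, sigma = 20, 0.05, 1.0                    # placeholder dimension/rates
mu = rng.standard_normal(d) / np.sqrt(d)
w_sl = np.zeros(d)
w_rl = np.zeros(d)
for _ in range(5000):
    label = rng.integers(2)
    x = (2 * label - 1) * mu + sigma * rng.standard_normal(d) / np.sqrt(d)
    w_sl = sl_update(w_sl, x, label, eta)
    w_rl = rl_update(w_rl, x, label, eta)
```

Tracking the test accuracy of `w_sl` and `w_rl` over such a run is one way to see how input noise can affect the two learning curves differently, in the spirit of the analysis above.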




Abstract: Off-policy algorithms, in which a behavior policy that differs from the target policy is used to gain experience for learning, have proven to be of great practical value in reinforcement learning. However, even for simple convex problems such as linear value function approximation, these algorithms are not guaranteed to be stable. To address this, alternative algorithms that are provably convergent in such cases have been introduced, the most well known being gradient descent temporal difference (GTD) learning. This algorithm and others like it, however, tend to converge much more slowly than conventional temporal difference learning. In this paper we propose gradient descent temporal difference-difference (Gradient-DD) learning, which improves upon GTD2, a GTD algorithm, by introducing second-order differences in successive parameter updates. We investigate this algorithm in the framework of linear value function approximation, proving its convergence theoretically by applying the theory of stochastic approximation. Studying the model empirically on the random walk task, the Boyan-chain task, and Baird's off-policy counterexample, we find substantial improvement over GTD2 and, in several cases, even better performance than conventional TD learning.
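As a rough sketch of the idea (under assumptions, not the paper's exact formulation), the code below contrasts a standard GTD2 update for linear value function approximation with a variant that penalizes the change in the value estimate between successive parameter vectors; the specific form of that regularization term and the coefficient `kappa` are assumptions made for illustration.

```python
import numpy as np

def gtd2_step(theta, w, x, x_next, r, gamma, alpha, beta):
    """One GTD2 update with linear features x (current) and x_next (next)."""
    delta = r + gamma * theta @ x_next - theta @ x        # TD error
    theta_new = theta + alpha * (x - gamma * x_next) * (x @ w)
    w_new = w + beta * (delta - x @ w) * x
    return theta_new, w_new

def gradient_dd_step(theta, theta_prev, w, x, x_next, r,
                     gamma, alpha, beta, kappa):
    """Illustrative Gradient-DD-style step: a GTD2 update plus an assumed
    penalty on the difference between the value estimates produced by the
    current and previous parameter vectors at the visited state."""
    delta = r + gamma * theta @ x_next - theta @ x
    reg = kappa * ((theta - theta_prev) @ x) * x          # assumed penalty gradient
    theta_new = theta + alpha * ((x - gamma * x_next) * (x @ w) - reg)
    w_new = w + beta * (delta - x @ w) * x
    return theta_new, w_new
```

In a learning loop, `theta_prev` would simply be the parameter vector from the preceding update, so the extra term discourages abrupt changes in successive value estimates while leaving the GTD2 structure intact.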




Abstract: Despite extensive theoretical work on biologically plausible learning rules, it has been difficult to obtain clear evidence about whether and how such rules are implemented in the brain. We consider biologically plausible supervised- and reinforcement-learning rules and ask whether changes in network activity during learning can be used to determine which learning rule is being used. Supervised learning requires a credit-assignment model estimating the mapping from neural activity to behavior, and, in a biological organism, this model will inevitably be an imperfect approximation of the ideal mapping, leading to a bias in the direction of the weight updates relative to the true gradient. Reinforcement learning, on the other hand, requires no credit-assignment model and tends to make weight updates that follow the true gradient direction. We derive a metric to distinguish between these learning rules by observing changes in network activity during learning, given that the mapping from brain to behavior is known to the experimenter. Because brain-machine interface (BMI) experiments allow for perfect knowledge of this mapping, we focus on modeling a cursor-control BMI task using recurrent neural networks, showing that learning rules can be distinguished in simulated experiments using only observations that a neuroscience experimenter would plausibly have access to.
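As a toy illustration (not the paper's actual metric), the sketch below computes the alignment between an observed change in neural activity and the error-reducing direction implied by a known linear BMI decoder; the decoder shape, the cursor target, and the function `flow_alignment` are hypothetical placeholders.

```python
import numpy as np

def flow_alignment(delta_activity, error, decoder):
    """Toy metric: cosine similarity between the observed change in neural
    activity and the activity change that would reduce the cursor error
    under the known decoder (cursor = decoder @ activity). Alignment close
    to the true-gradient direction is what a reinforcement-learning rule
    would tend to produce; a consistent bias away from it would be the
    signature of an imperfect credit-assignment model."""
    grad = decoder.T @ error                 # gradient of squared cursor error
    ideal = -grad                            # direction that reduces the error
    return (delta_activity @ ideal) / (
        np.linalg.norm(delta_activity) * np.linalg.norm(ideal) + 1e-12)

# Hypothetical usage with made-up shapes: 100 neurons, 2-D cursor.
rng = np.random.default_rng(1)
decoder = rng.standard_normal((2, 100)) / 10             # known BMI mapping
activity_before = rng.standard_normal(100)
activity_after = activity_before + 0.1 * rng.standard_normal(100)
error = decoder @ activity_before - np.array([1.0, 0.0])  # cursor minus target
print(flow_alignment(activity_after - activity_before, error, decoder))
```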