
Christian Borgelt


ISAAC Newton: Input-based Approximate Curvature for Newton's Method

May 01, 2023
Felix Petersen, Tobias Sutter, Christian Borgelt, Dongsung Huh, Hilde Kuehne, Yuekai Sun, Oliver Deussen


We present ISAAC (Input-baSed ApproximAte Curvature), a novel method that conditions the gradient using selected second-order information and has an asymptotically vanishing computational overhead, assuming a batch size smaller than the number of neurons. We show that it is possible to compute a good conditioner based only on the input to the respective layer, without substantial computational overhead. The proposed method allows effective training even in small-batch stochastic regimes, which makes it competitive with both first-order and second-order methods.

* Published at ICLR 2023, Code @ https://github.com/Felix-Petersen/isaac, Video @ https://youtu.be/7RKRX-MdwqM 
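
The core idea can be illustrated with a short NumPy sketch (an illustration only, not the paper's actual implementation, for which see the repository above): precondition a linear layer's weight gradient with a regularized Gram matrix of the layer input, inverted cheaply via the Woodbury identity because the batch is assumed to be smaller than the number of neurons. The function name and the regularization constant below are illustrative.

import numpy as np

def precondition_grad(x, delta, lam=1e-2):
    # x     : (batch, n_in)  input to the linear layer
    # delta : (batch, n_out) gradient of the loss w.r.t. the layer output
    # lam   : illustrative regularization strength
    b = x.shape[0]
    grad = x.T @ delta                      # plain first-order gradient, (n_in, n_out)
    # Woodbury identity: invert a (batch x batch) matrix instead of an
    # (n_in x n_in) one, which is cheap whenever batch < n_in.
    small = np.linalg.inv(lam * b * np.eye(b) + x @ x.T)
    # equals inv(lam * I + x.T @ x / b) @ grad
    return (grad - x.T @ (small @ (x @ grad))) / lam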

Deep Differentiable Logic Gate Networks

Oct 15, 2022
Felix Petersen, Christian Borgelt, Hilde Kuehne, Oliver Deussen


Recently, research has increasingly focused on developing efficient neural network architectures. In this work, we explore logic gate networks for machine learning tasks by learning combinations of logic gates. These networks comprise logic gates such as "AND" and "XOR", which allow for very fast execution. The difficulty in learning logic gate networks is that they are conventionally non-differentiable and therefore do not allow training with gradient descent. Thus, to allow for effective training, we propose differentiable logic gate networks, an architecture that combines real-valued logics and a continuously parameterized relaxation of the network. The resulting discretized logic gate networks achieve fast inference speeds, e.g., beyond a million images of MNIST per second on a single CPU core.

* Published at NeurIPS 2022 
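
As a rough illustration of the relaxation (a sketch, not the library's API), each neuron receives two real-valued inputs in [0, 1] and learns a softmax mixture over relaxed two-input gates, where the relaxations follow probabilistic real-valued logic. The subset of gates and the variable names below are illustrative.

import numpy as np

# Probabilistic relaxations of some of the 16 two-input Boolean functions
# (inputs are read as probabilities of being 1).
GATES = [
    lambda a, b: np.zeros_like(a * b),   # FALSE
    lambda a, b: a * b,                  # AND
    lambda a, b: a + b - a * b,          # OR
    lambda a, b: a + b - 2 * a * b,      # XOR
    lambda a, b: 1 - a * b,              # NAND
    lambda a, b: 1 - (a + b - a * b),    # NOR
]

def soft_logic_neuron(a, b, logits):
    # logits: one learnable score per candidate gate in GATES
    w = np.exp(logits - logits.max())
    w /= w.sum()                         # softmax over gate choices
    return sum(wi * g(a, b) for wi, g in zip(w, GATES))

logits = np.zeros(len(GATES))            # untrained: uniform mixture over the gates
print(soft_logic_neuron(0.9, 0.2, logits))

After training, each neuron is discretized to its most probable gate (the argmax over its logits), which is what enables the very fast inference reported above.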

Differentiable Top-k Classification Learning

Jun 15, 2022
Felix Petersen, Hilde Kuehne, Christian Borgelt, Oliver Deussen


The top-k classification accuracy is one of the core metrics in machine learning. Here, k is conventionally a positive integer, such as 1 or 5, leading to top-1 or top-5 training objectives. In this work, we relax this assumption and optimize the model for multiple k simultaneously instead of using a single k. Leveraging recent advances in differentiable sorting and ranking, we propose a differentiable top-k cross-entropy classification loss. This allows training the network while considering not only the top-1 prediction, but also, e.g., the top-2 and top-5 predictions. We evaluate the proposed loss function for fine-tuning on state-of-the-art architectures, as well as for training from scratch. We find that relaxing k not only produces better top-5 accuracies, but also leads to top-1 accuracy improvements. When fine-tuning publicly available ImageNet models, we achieve a new state-of-the-art for these models.

* Published at ICML 2022, Code @ https://github.com/Felix-Petersen/difftopk 
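
A heavily simplified sketch of such a loss is given below. The paper builds the relaxation from differentiable sorting networks; here, a soft rank computed from pairwise sigmoids merely stands in for that machinery, and all names, temperatures, and weights are illustrative.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def soft_topk_loss(scores, target, ks=(1, 5), k_weights=(0.5, 0.5), tau=1.0):
    # scores : (n_classes,) logits for one sample; target : true class index
    # Soft rank of the target class: roughly the number of classes scoring higher.
    margins = scores - scores[target]
    soft_rank = sigmoid(margins / tau).sum() - 0.5   # remove the class's own 0.5
    # Soft probability that the target lies within the top k, for each k.
    p_topk = [sigmoid((k - 0.5 - soft_rank) / tau) for k in ks]
    p = sum(w * pk for w, pk in zip(k_weights, p_topk))
    return -np.log(p + 1e-12)

print(soft_topk_loss(np.array([2.0, 0.5, -1.0, 3.0]), target=0, ks=(1, 2)))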

GenDR: A Generalized Differentiable Renderer

Apr 29, 2022
Felix Petersen, Bastian Goldluecke, Christian Borgelt, Oliver Deussen


In this work, we present and study a generalized family of differentiable renderers. We discuss from scratch which components are necessary for differentiable rendering and formalize the requirements for each component. We instantiate our general differentiable renderer, which generalizes existing differentiable renderers like SoftRas and DIB-R, with an array of different smoothing distributions to cover a large spectrum of reasonable settings. We evaluate an array of differentiable renderer instantiations on the popular ShapeNet 3D reconstruction benchmark and analyze the implications of our results. Surprisingly, the simple uniform distribution yields the best overall results when averaged over 13 classes; in general, however, the optimal choice of distribution heavily depends on the task.

* Published at CVPR 2022, Code @ https://github.com/Felix-Petersen/gendr 
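
The common core of such renderers can be sketched as follows (an illustration, not GenDR's API): a pixel's soft coverage of a primitive is the CDF of a smoothing distribution applied to the pixel's signed distance to the primitive's boundary, divided by a temperature, and the family of renderers is obtained by swapping out the distribution. The distributions, sign convention, and temperature below are illustrative choices.

import numpy as np
from math import erf

# Smoothing distributions: each CDF maps a signed distance to a coverage in [0, 1].
def cdf_uniform(d):    # uniform distribution on [-1, 1]
    return np.clip(0.5 * (d + 1.0), 0.0, 1.0)

def cdf_logistic(d):   # logistic distribution (SoftRas-style smoothing)
    return 1.0 / (1.0 + np.exp(-d))

def cdf_gaussian(d):   # Gaussian distribution
    return 0.5 * (1.0 + np.vectorize(erf)(d / np.sqrt(2.0)))

def soft_coverage(signed_dist, cdf=cdf_uniform, tau=1e-2):
    # signed_dist: positive inside the primitive, negative outside (assumed convention);
    # as tau -> 0 the coverage approaches a hard rasterization step.
    return cdf(np.asarray(signed_dist) / tau)

print(soft_coverage(np.array([-0.02, 0.0, 0.02]), cdf=cdf_logistic))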

Monotonic Differentiable Sorting Networks

Mar 17, 2022
Felix Petersen, Christian Borgelt, Hilde Kuehne, Oliver Deussen


Differentiable sorting algorithms allow training with sorting and ranking supervision, where only the ordering or ranking of samples is known. Various methods have been proposed to address this challenge, ranging from optimal transport-based differentiable Sinkhorn sorting algorithms to making classic sorting networks differentiable. One problem of current differentiable sorting methods is that they are non-monotonic. To address this issue, we propose a novel relaxation of conditional swap operations that guarantees monotonicity in differentiable sorting networks. We introduce a family of sigmoid functions and prove that they produce differentiable sorting networks that are monotonic. Monotonicity ensures that the gradients always have the correct sign, which is an advantage in gradient-based optimization. We demonstrate that monotonic differentiable sorting networks improve upon previous differentiable sorting methods.

* Published at ICLR 2022, Code @ https://github.com/Felix-Petersen/diffsort, Video @ https://www.youtube.com/watch?v=Rl-sFaE1z4M 
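
A minimal sketch of the relaxed conditional swap that such sorting networks are built from is shown below. The Cauchy CDF is one sigmoid-shaped function with heavier tails and is used here only as an illustrative choice, as are the temperature and names.

import numpy as np

def cauchy_cdf(z):
    # sigmoid-shaped, with heavier tails than the logistic function
    return np.arctan(z) / np.pi + 0.5

def soft_cond_swap(a, b, tau=0.1, sigmoid=cauchy_cdf):
    # Relaxed conditional swap: returns soft (min, max) of a and b.
    # As tau -> 0 this approaches the hard swap of a classic sorting network.
    s = sigmoid((b - a) / tau)       # ~1 if b > a, ~0 if a > b
    soft_min = s * a + (1 - s) * b
    soft_max = (1 - s) * a + s * b
    return soft_min, soft_max

print(soft_cond_swap(3.0, 1.0, tau=0.01))   # close to (1.0, 3.0)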

Learning with Algorithmic Supervision via Continuous Relaxations

Oct 25, 2021
Felix Petersen, Christian Borgelt, Hilde Kuehne, Oliver Deussen


The integration of algorithmic components into neural architectures has recently gained increased attention, as it allows training neural networks with new forms of supervision, such as ordering constraints or silhouettes, instead of ground-truth labels. Many approaches in the field focus on the continuous relaxation of a specific task and show promising results in this context. However, the focus on single tasks also limits the applicability of the proposed concepts to a narrow range of applications. In this work, we build on those ideas and propose an approach that allows integrating algorithms into end-to-end trainable neural network architectures based on a general approximation of discrete conditions. To this end, we relax these conditions in control structures such as conditional statements, loops, and indexing, so that the resulting algorithms are smoothly differentiable. To obtain meaningful gradients, each relevant variable is perturbed via logistic distributions, and the expectation value under this perturbation is approximated. We evaluate the proposed continuous relaxation model on four challenging tasks and show that it can keep up with relaxations specifically designed for each individual task.

* Published at NeurIPS 2021, Code @ https://github.com/Felix-Petersen/algovision, Video @ https://www.youtube.com/watch?v=01ENzpkjOCE 
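
The branch relaxation at the heart of this approach can be sketched as follows (a minimal illustration with an illustrative inverse temperature, not the algovision API): perturbing the operand of a condition such as x > 0 with logistic noise and taking the expectation turns a hard if-statement into a convex combination of its two branches, weighted by the logistic CDF.

import numpy as np

def sigmoid(z):                      # logistic CDF
    return 1.0 / (1.0 + np.exp(-z))

def soft_if(cond_value, then_value, else_value, beta=10.0):
    # Relaxed "then_value if cond_value > 0 else else_value".
    p = sigmoid(beta * cond_value)   # probability that the condition holds
    return p * then_value + (1 - p) * else_value

# Example: a relaxed abs(x), written as "x if x > 0 else -x"
x = -0.3
print(soft_if(x, x, -x))             # ~0.27; approaches abs(x) = 0.3 as beta grows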

Differentiable Sorting Networks for Scalable Sorting and Ranking Supervision

May 09, 2021
Felix Petersen, Christian Borgelt, Hilde Kuehne, Oliver Deussen


Sorting and ranking supervision is a method for training neural networks end-to-end based on ordering constraints. That is, the ground-truth order of sets of samples is known, while their absolute values remain unsupervised. To this end, we propose differentiable sorting networks, obtained by relaxing the pairwise conditional swap operations of classic sorting networks. To address the problems of vanishing gradients and extensive blurring that arise with larger numbers of layers, we propose mapping activations to regions with moderate gradients. We consider odd-even as well as bitonic sorting networks, which outperform existing relaxations of the sorting operation. We show that bitonic sorting networks can achieve stable training on large input sets of up to 1024 elements.
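
As a small sketch of how the relaxed swaps compose into a full network (illustrative names, temperature, and sigmoid choice, not the paper's implementation), an odd-even transposition network simply applies the relaxed swap to alternating neighbor pairs for n layers:

import numpy as np

def soft_swap(a, b, tau=0.1):
    # Relaxed conditional swap via a logistic sigmoid (illustrative choice).
    s = 1.0 / (1.0 + np.exp(-(b - a) / tau))
    return s * a + (1 - s) * b, (1 - s) * a + s * b   # soft (min, max)

def soft_odd_even_sort(x, tau=0.1):
    # Differentiable odd-even transposition sort: every comparator of the
    # classic sorting network is replaced by the relaxed swap above.
    x = np.array(x, dtype=float)
    n = len(x)
    for layer in range(n):
        for i in range(layer % 2, n - 1, 2):   # alternate even and odd layers
            x[i], x[i + 1] = soft_swap(x[i], x[i + 1], tau)
    return x

print(soft_odd_even_sort([3.0, 1.0, 2.0], tau=0.01))   # ~[1, 2, 3]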


AlgoNet: $C^\infty$ Smooth Algorithmic Neural Networks

May 23, 2019
Felix Petersen, Christian Borgelt, Oliver Deussen


Artificial neural networks have revolutionized many areas of computer science in recent years, as they provide solutions to a number of previously unsolved problems. On the other hand, for many problems, classic algorithms exist that typically exceed the accuracy and stability of neural networks. To combine these two concepts, we present a new kind of neural network: algorithmic neural networks (AlgoNets). These networks integrate smooth versions of classic algorithms into the topology of neural networks. A forward AlgoNet incorporates algorithmic layers into existing architectures, while a backward AlgoNet can solve inverse problems with no or only weak supervision. In addition, we present the algonet package, a PyTorch-based library that includes, inter alia, a smoothly evaluated programming language, a smooth 3D mesh renderer, and smooth sorting algorithms.

* preprint, 9 pages 
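
A minimal sketch of what a smooth version of a classic algorithm can look like (an illustration only, not part of the algonet package; the names and constants are made up): replace the hard exit of a while loop by a sigmoid-weighted blend of the updated and the current state, so that the whole computation becomes infinitely differentiable in its inputs.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def smooth_while(state, condition, body, max_iter=25, beta=30.0):
    # Smooth relaxation of "while condition(state) > 0: state = body(state)".
    # Each iteration blends body(state) and state, weighted by a sigmoid of
    # the loop condition; beta is an inverse temperature.
    for _ in range(max_iter):
        w = sigmoid(beta * condition(state))   # ~1 while the loop "runs"
        state = w * body(state) + (1 - w) * state
    return state

# Example: smooth version of "while x > 1: x = x / 2", started at x = 10
x = smooth_while(10.0, condition=lambda s: s - 1.0, body=lambda s: s / 2.0)
print(x)   # ~0.625, matching the hard loop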