Sidak Pal Singh

ETH Zurich

Hallmarks of Optimization Trajectories in Neural Networks and LLMs: The Lengths, Bends, and Dead Ends

Mar 12, 2024
Sidak Pal Singh, Bobby He, Thomas Hofmann, Bernhard Schölkopf

Via arXiv

Towards Meta-Pruning via Optimal Transport

Feb 13, 2024
Alexander Theus, Olin Geimer, Friedrich Wicke, Thomas Hofmann, Sotiris Anagnostidis, Sidak Pal Singh

Via arXiv

Rethinking Attention: Exploring Shallow Feed-Forward Neural Networks as an Alternative to Attention Layers in Transformers

Nov 29, 2023
Vukasin Bozic, Danilo Dordevic, Daniele Coppola, Joseph Thommes, Sidak Pal Singh

Via arXiv

Transformer Fusion with Optimal Transport

Oct 15, 2023
Moritz Imfeld, Jacopo Graldi, Marco Giordano, Thomas Hofmann, Sotiris Anagnostidis, Sidak Pal Singh

Via arXiv

Towards guarantees for parameter isolation in continual learning

Oct 02, 2023
Giulia Lanzillotta, Sidak Pal Singh, Benjamin F. Grewe, Thomas Hofmann

Via arXiv

On the curvature of the loss landscape

Jul 10, 2023
Alison Pouplin, Hrittik Roy, Sidak Pal Singh, Georgios Arvanitidis

Via arXiv

The Hessian perspective into the Nature of Convolutional Neural Networks

May 16, 2023
Sidak Pal Singh, Thomas Hofmann, Bernhard Schölkopf

Via arXiv

Some Fundamental Aspects about Lipschitz Continuity of Neural Network Functions

Feb 21, 2023
Grigory Khromov, Sidak Pal Singh

Via arXiv

Signal Propagation in Transformers: Theoretical Perspectives and the Role of Rank Collapse

Jun 07, 2022
Lorenzo Noci, Sotiris Anagnostidis, Luca Biggio, Antonio Orvieto, Sidak Pal Singh, Aurelien Lucchi

Via arXiv