Amnon Geifman

Allan

Nemotron 3 Ultra: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

Jun 12, 2026

Nemotron 3 Super: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

Apr 14, 2026

Extending Puzzle for Mixture-of-Experts Reasoning Models with Application to GPT-OSS Acceleration

Feb 12, 2026

NVIDIA Nemotron 3: Efficient and Open Intelligence

Dec 24, 2025

Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

Dec 23, 2025

FFN Fusion: Rethinking Sequential Computation in Large Language Models

Mar 24, 2025

Puzzle: Distillation-Based NAS for Inference-Optimized LLMs

Dec 03, 2024

Controlling the Inductive Bias of Wide Neural Networks by Modifying the Kernel's Spectrum

Jul 26, 2023

A Kernel Perspective of Skip Connections in Convolutional Networks

Nov 27, 2022

On the Spectral Bias of Convolutional Neural Tangent and Gaussian Process Kernels

Mar 17, 2022