Jason Ramapuram

Scaling Properties of Continuous Diffusion Spoken Language Models

Apr 27, 2026

Path-Constrained Mixture-of-Experts

Mar 18, 2026

The Design Space of Tri-Modal Masked Diffusion Models

Feb 25, 2026

A Small-Scale System for Autoregressive Program Synthesis Enabling Controlled Experimentation

Feb 09, 2026

Completed Hyperparameter Transfer across Modules, Width, Depth, Batch and Duration

Dec 26, 2025

Learning Unmasking Policies for Diffusion Language Models

Dec 12, 2025

Revisiting the Scaling Properties of Downstream Metrics in Large Language Model Training

Dec 09, 2025

Distillation Scaling Laws

Feb 12, 2025

Theory, Analysis, and Best Practices for Sigmoid Self-Attention

Sep 06, 2024

Poly-View Contrastive Learning

Mar 08, 2024