Hossein Hajimirsadeghi

You Need Reasoning to Learn Reasoning: The Limitations of Label-Free RL in Weak Base Models

Nov 07, 2025

TabReason: A Reinforcement Learning-Enhanced Reasoning LLM for Explainable Tabular Data Prediction

May 29, 2025

Radar: Fast Long-Context Decoding for Any Transformer

Mar 13, 2025

Attention as an RNN

May 22, 2024

Tree Cross Attention

Sep 29, 2023

Constant Memory Attention Block

Jun 21, 2023

Constant Memory Attentive Neural Processes

May 23, 2023

Latent Bottlenecked Attentive Neural Processes

Nov 15, 2022

Training a Vision Transformer from scratch in less than 24 hours with 1 GPU

Nov 09, 2022

Stop Overcomplicating Selective Classification: Use Max-Logit

Jun 17, 2022