Listops


NRGPT: An Energy-based Alternative for GPT

Dec 18, 2025
Via arXiv

Context Is Not Comprehension

Jun 08, 2025
Via arXiv

Verbose ListOps (VLO): Beyond Long Context -- Unmasking LLM's Reasoning Blind Spots

Jun 05, 2025
Via arXiv

Small Models, Smarter Learning: The Power of Joint Task Training

May 23, 2025
Via arXiv

Recurrent Transformers with Dynamic Halt

Feb 01, 2024
Via arXiv

Cached Transformers: Improving Transformers with Differentiable Memory Cache

Dec 20, 2023
Via arXiv

Recursion in Recursion: Two-Level Nested Recursion for Length Generalization with Scalability

Nov 08, 2023
Via arXiv

Efficient Beam Tree Recursion

Jul 20, 2023
Via arXiv

Investigating Pre-trained Language Models on Cross-Domain Datasets, a Step Closer to General AI

Jun 21, 2023
Via arXiv

Opening the Black Box: Analyzing Attention Weights and Hidden States in Pre-trained Language Models for Non-language Tasks

Jun 21, 2023
Via arXiv