Dimitris Papailiopoulos

From Artificial Needles to Real Haystacks: Improving Retrieval Capabilities in LLMs by Finetuning on Synthetic Data

Jun 27, 2024

CHAI: Clustered Head Attention for Efficient LLM Inference

Mar 12, 2024

How Well Can Transformers Emulate In-context Newton's Method?

Mar 05, 2024

Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks

Feb 06, 2024

Looped Transformers are Better at Learning Learning Algorithms

Nov 21, 2023

Mini-Batch Optimization of Contrastive Loss

Jul 12, 2023

Predictive Pipelined Decoding: A Compute-Latency Trade-off for Exact LLM Decoding

Jul 12, 2023

Teaching Arithmetic to Small Transformers

Jul 07, 2023

Dissecting Chain-of-Thought: A Study on Compositional In-Context Learning of MLPs

May 30, 2023

Prompted LLMs as Chatbot Modules for Long Open-domain Conversation

May 08, 2023