Karl Krauth

Shammie

Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces

Jan 17, 2026

Breaking Feedback Loops in Recommender Systems with Causal Inference

Jul 15, 2022

Recommendation Systems with Distribution-Free Reliability Guarantees

Jul 04, 2022

Modeling Content Creator Incentives on Algorithm-Curated Platforms

Jun 27, 2022

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

Jun 10, 2022

On component interactions in two-stage recommender systems

Jun 28, 2021

The Stereotyping Problem in Collaboratively Filtered Recommender Systems

Jun 23, 2021

Do Offline Metrics Predict Online Performance in Recommender Systems?

Nov 07, 2020

Exploration in two-stage recommender systems

Sep 01, 2020

The Effect of Natural Distribution Shift on Question Answering Models

Apr 29, 2020