David Schlangen

Order in the Evaluation Court: A Critical Analysis of NLG Evaluation Trends

Jan 12, 2026
Via arXiv

Could the Road to Grounded, Neuro-symbolic AI be Paved with Words-as-Classifiers?

Jul 08, 2025
Via arXiv

From Templates to Natural Language: Generalization Challenges in Instruction-Tuned LLMs for Spatial Reasoning

May 20, 2025
Via arXiv

clem:todd: A Framework for the Systematic Benchmarking of LLM-Based Task-Oriented Dialogue System Realisations

May 08, 2025
Via arXiv

Playpen: An Environment for Exploring Learning Through Conversational Interaction

Apr 11, 2025
Via arXiv

Triangulating LLM Progress through Benchmarks, Games, and Cognitive Tests

Feb 20, 2025
Via arXiv

Plant in Cupboard, Orange on Table, Book on Shelf. Benchmarking Practical Reasoning and Situation Modelling in a Text-Simulated Situated Environment

Feb 17, 2025
Via arXiv

Ad-hoc Concept Forming in the Game Codenames as a Means for Evaluating Large Language Models

Feb 17, 2025
Via arXiv

Incremental Dialogue Management: Survey, Discussion, and Implications for HRI

Jan 01, 2025
Via arXiv

Towards No-Code Programming of Cobots: Experiments with Code Synthesis by Large Code Models for Conversational Programming

Sep 18, 2024
Via arXiv