Dietrich Klakow

Evaluating Grounded Reasoning by Code-Assisted Large Language Models for Mathematics

Apr 24, 2025

Agree to Disagree? A Meta-Evaluation of LLM Misgendering

Apr 23, 2025

Implementing Rational Choice Functions with LLMs and Measuring their Alignment with User Preferences

Apr 22, 2025

Aligned Probing: Relating Toxic Behavior and Model Internals

Mar 17, 2025

Unveiling the Key Factors for Distilling Chain-of-Thought Reasoning

Feb 25, 2025

AFRIDOC-MT: Document-level MT Corpus for African Languages

Jan 10, 2025

IGC: Integrating a Gated Calculator into an LLM to Solve Arithmetic Tasks Reliably and Efficiently

Jan 01, 2025

Utilizing Multimodal Data for Edge Case Robust Call-sign Recognition and Understanding

Dec 29, 2024

Evaluating the Capabilities of Large Language Models for Multi-label Emotion Understanding

Dec 17, 2024

Uhura: A Benchmark for Evaluating Scientific Question Answering and Truthfulness in Low-Resource African Languages

Dec 01, 2024