Thilo Hagendorff

Beyond Chains of Thought: Benchmarking Latent-Space Reasoning Abilities in Large Language Models

Apr 14, 2025, via arXiv

Compromising Honesty and Harmlessness in Language Models via Deception Attacks

Feb 12, 2025, via arXiv

A Looming Replication Crisis in Evaluating Behavior in Language Models? Evidence and Solutions

Sep 30, 2024, via arXiv

Mapping the Ethics of Generative AI: A Comprehensive Scoping Review

Feb 13, 2024, via arXiv

Fairness Hacking: The Malicious Practice of Shrouding Unfairness in Algorithms

Nov 12, 2023, via arXiv

Deception Abilities Emerged in Large Language Models

Jul 31, 2023, via arXiv

Human-Like Intuitive Behavior and Reasoning Biases Emerged in Language Models -- and Disappeared in GPT-4

Jun 13, 2023, via arXiv

Machine Psychology: Investigating Emergent Capabilities and Behavior in Large Language Models Using Psychological Methods

Mar 24, 2023, via arXiv

Machine intuition: Uncovering human-like intuitive decision-making in GPT-3.5

Dec 10, 2022, via arXiv

Why we need biased AI -- How including cognitive and ethical machine biases can enhance AI systems

Mar 18, 2022, via arXiv