
Iryna Gurevych

Geolocation with Real Human Gameplay Data: A Large-Scale Dataset and Human-Like Reasoning Framework

Feb 19, 2025

Is Human-Like Text Liked by Humans? Multilingual Human Detection and Preference Against AI

Feb 17, 2025

Self-Rationalization in the Wild: A Large Scale Out-of-Distribution Evaluation on NLI-related tasks

Feb 07, 2025

COVE: COntext and VEracity prediction for out-of-context images

Feb 03, 2025

Differentially Private Steering for Large Language Model Alignment

Jan 30, 2025

GenAI Content Detection Task 1: English and Multilingual Machine-Generated Text Detection: AI vs. Human

Jan 19, 2025

The Inherent Limits of Pretrained LLMs: The Unexpected Convergence of Instruction Tuning and In-Context Learning Capabilities

Jan 15, 2025

Turning Logic Against Itself: Probing Model Defenses Through Contrastive Questions

Jan 03, 2025

Libra-Leaderboard: Towards Responsible AI through a Balanced Leaderboard of Safety and Capability

Dec 24, 2024

How to Weight Multitask Finetuning? Fast Previews via Bayesian Model-Merging

Dec 11, 2024