Yarin Gal

FindingDory: A Benchmark to Evaluate Memory in Embodied Agents

Jun 18, 2025
Via arXiv

Protriever: End-to-End Differentiable Protein Homology Search for Fitness Prediction

Jun 10, 2025
Via arXiv

Attacking Multimodal OS Agents with Malicious Image Patches

Mar 13, 2025
Via arXiv

Do Multilingual LLMs Think In English?

Feb 21, 2025
Via arXiv

Fundamental Limitations in Defending LLM Finetuning APIs

Feb 20, 2025
Via arXiv

Open Problems in Machine Unlearning for AI Safety

Jan 09, 2025
Via arXiv

Detecting LLM Hallucination Through Layer-wise Information Deficiency: Analysis of Unanswerable Questions and Ambiguous Prompts

Dec 13, 2024
Via arXiv

AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents

Oct 11, 2024
Via arXiv

Temporal-Difference Variational Continual Learning

Oct 10, 2024
Via arXiv

TextCAVs: Debugging vision models using text

Aug 16, 2024
Via arXiv