Yarin Gal

Scaling Up Active Testing to Large Language Models

Aug 12, 2025
Via arXiv

Uncertainty Quantification for Surface Ozone Emulators using Deep Learning

Aug 06, 2025
Via arXiv

Leveraging Deep Learning for Physical Model Bias of Global Air Quality Estimates

Aug 06, 2025
Via arXiv

Weighted Conditional Flow Matching

Jul 29, 2025
Via arXiv

Security Challenges in AI Agent Deployment: Insights from a Large Scale Public Competition

Jul 28, 2025
Via arXiv

FindingDory: A Benchmark to Evaluate Memory in Embodied Agents

Jun 18, 2025
Via arXiv

Protriever: End-to-End Differentiable Protein Homology Search for Fitness Prediction

Jun 10, 2025
Via arXiv

Attacking Multimodal OS Agents with Malicious Image Patches

Mar 13, 2025
Via arXiv

Do Multilingual LLMs Think In English?

Feb 21, 2025
Via arXiv

Fundamental Limitations in Defending LLM Finetuning APIs

Feb 20, 2025
Via arXiv