Timothy Baldwin

Beyond the Resumé: A Rubric-Aware Automatic Interview System for Information Elicitation

Mar 02, 2026

JailNewsBench: Multi-Lingual and Regional Benchmark for Fake News Generation under Jailbreak Attacks

Mar 01, 2026

Controllable Reasoning Models Are Private Thinkers

Feb 27, 2026

Don't Ignore the Tail: Decoupling top-K Probabilities for Efficient Language Model Distillation

Feb 24, 2026

SimuScene: Training and Benchmarking Code Generation to Simulate Physical Scenarios

Feb 11, 2026

AI Security Beyond Core Domains: Resume Screening as a Case Study of Adversarial Vulnerabilities in Specialized LLM Applications

Dec 23, 2025

Reasoning with Confidence: Efficient Verification of LLM Reasoning Steps via Uncertainty Heads

Nov 11, 2025

Faithfulness-Aware Uncertainty Quantification for Fact-Checking the Output of Retrieval Augmented Generation

May 28, 2025

Uncertainty-Aware Attention Heads: Efficient Unsupervised Uncertainty Quantification for LLMs

May 26, 2025

A Head to Predict and a Head to Question: Pre-trained Uncertainty Quantification Heads for Hallucination Detection in LLM Outputs

May 13, 2025