Stephen H. Bach

Can We Predict Alignment Before Models Finish Thinking? Towards Monitoring Misaligned Reasoning Models

Jul 16, 2025
Via arXiv

The State of Multilingual LLM Safety Research: From Measuring the Language Gap to Mitigating It

May 30, 2025
Via arXiv

Crosslingual Reasoning through Test-Time Scaling

May 08, 2025
Via arXiv

Beyond Contrastive Learning: Synthetic Data Enables List-wise Training with Multiple Levels of Relevance

Mar 29, 2025
Via arXiv

K-Paths: Reasoning over Graph Paths for Drug Repurposing and Drug Interaction Prediction

Feb 18, 2025
Via arXiv

$100K or 100 Days: Trade-offs when Pre-Training with Academic Resources

Oct 30, 2024
Via arXiv

Planetarium: A Rigorous Benchmark for Translating Text to Structured Planning Languages

Jul 03, 2024
Via arXiv

Preference Tuning For Toxicity Mitigation Generalizes Across Languages

Jun 23, 2024
Via arXiv

If CLIP Could Talk: Understanding Vision-Language Model Representations Through Their Preferred Concept Descriptions

Mar 25, 2024
Via arXiv

Learning to Generate Instruction Tuning Datasets for Zero-Shot Task Adaptation

Feb 28, 2024
Via arXiv