Alex Oesterling

Inference-Time Reward Hacking in Large Language Models

Jun 24, 2025
Via arXiv

Multi-Group Proportional Representation for Text-to-Image Models

May 29, 2025
Via arXiv

Soft Best-of-n Sampling for Model Alignment

May 06, 2025
Via arXiv

All Roads Lead to Rome? Exploring Representational Similarities Between Latent Spaces of Generative Image Models

Jul 18, 2024
Via arXiv

Multi-Group Proportional Representation

Jul 11, 2024
Via arXiv

Operationalizing the Blueprint for an AI Bill of Rights: Recommendations for Practitioners, Researchers, and Policy Makers

Jul 11, 2024
Via arXiv

Interpreting CLIP with Sparse Linear Concept Embeddings (SpLiCE)

Feb 16, 2024
Via arXiv

Fair Machine Unlearning: Data Removal while Mitigating Disparities

Jul 27, 2023
Via arXiv

Distributionally Robust Group Backwards Compatibility

Dec 20, 2021
Via arXiv

Multitask Learning for Citation Purpose Classification

Jun 24, 2021
Via arXiv