Youssef Mroueh

IBM Research, USA

GP-MoLFormer-Sim: Test Time Molecular Optimization through Contextual Similarity Guidance

Jun 05, 2025
Via arXiv

Guided Speculative Inference for Efficient Test-Time Alignment of LLMs

Jun 04, 2025
Via arXiv

Revisiting Group Relative Policy Optimization: Insights into On-Policy and Off-Policy Training

May 28, 2025
Via arXiv

Reinforcement Learning with Verifiable Rewards: GRPO's Effective Loss, Dynamics, and Success Amplification

Mar 09, 2025
Via arXiv

Verify when Uncertain: Beyond Self-Consistency in Black Box Hallucination Detection

Feb 20, 2025
Via arXiv

Theoretical Analysis of KL-regularized RLHF with Multiple Reference Models

Feb 03, 2025
Via arXiv

Large Language Models can be Strong Self-Detoxifiers

Oct 04, 2024
Via arXiv

Gradient Flows and Riemannian Structure in the Gromov-Wasserstein Geometry

Jul 16, 2024
Via arXiv

Multivariate Stochastic Dominance via Optimal Transport and Applications to Models Benchmarking

Jun 10, 2024
Via arXiv

Information Theoretic Guarantees For Policy Alignment In Large Language Models

Jun 09, 2024
Via arXiv