Youssef Mroueh

IBM Research, USA

Quantum Verifiable Rewards for Post-Training Qiskit Code Assistant

Aug 28, 2025

GP-MoLFormer-Sim: Test Time Molecular Optimization through Contextual Similarity Guidance

Jun 05, 2025

Guided Speculative Inference for Efficient Test-Time Alignment of LLMs

Jun 04, 2025

Revisiting Group Relative Policy Optimization: Insights into On-Policy and Off-Policy Training

May 28, 2025

Reinforcement Learning with Verifiable Rewards: GRPO's Effective Loss, Dynamics, and Success Amplification

Mar 09, 2025

Verify when Uncertain: Beyond Self-Consistency in Black Box Hallucination Detection

Feb 20, 2025

Theoretical Analysis of KL-regularized RLHF with Multiple Reference Models

Feb 03, 2025

Large Language Models can be Strong Self-Detoxifiers

Oct 04, 2024

Gradient Flows and Riemannian Structure in the Gromov-Wasserstein Geometry

Jul 16, 2024

Multivariate Stochastic Dominance via Optimal Transport and Applications to Models Benchmarking

Jun 10, 2024