Rohan Surana

F-GRPO: Factorized Group-Relative Policy Optimization for Unified Candidate Generation and Ranking

May 13, 2026
Via arXiv

MASS-DPO: Multi-negative Active Sample Selection for Direct Policy Optimization

May 11, 2026
Via arXiv

Skill-R1: Agent Skill Evolution via Reinforcement Learning

May 10, 2026
Via arXiv

WS-GRPO: Weakly-Supervised Group-Relative Policy Optimization for Rollout-Efficient Reasoning

Feb 19, 2026
Via arXiv

AMPS: Adaptive Modality Preference Steering via Functional Entropy

Feb 13, 2026
Via arXiv

Evaluation on Entity Matching in Recommender Systems

Jan 23, 2026
Via arXiv

In-context Ranking Preference Optimization

Apr 21, 2025
Via arXiv

From Reviews to Dialogues: Active Synthesis for Zero-Shot LLM-based Conversational Recommender System

Apr 21, 2025
Via arXiv