Zhonghou Lv

Safety-Utility Conflicts Are Not Global: Surgical Alignment via Head-Level Diagnosis

Jan 07, 2026

Token-Level Policy Optimization: Linking Group-Level Rewards to Token-Level Aggregation via Markov Likelihood

Oct 10, 2025

Knowledgeable-r1: Policy Optimization for Knowledge Exploration in Retrieval-Augmented Generation

Jun 05, 2025