Yifeng Liu

Mending the Holes: Mitigating Reward Hacking in Reinforcement Learning for Multilingual Translation

Mar 13, 2026
Via arXiv

Deep Delta Learning

Jan 01, 2026
Via arXiv

Group Representational Position Encoding

Dec 08, 2025
Via arXiv

On the Design of KL-Regularized Policy Gradient Algorithms for LLM Reasoning

May 23, 2025
Via arXiv

R-PRM: Reasoning-Driven Process Reward Modeling

Mar 27, 2025
Via arXiv

A Survey of Zero-Knowledge Proof Based Verifiable Machine Learning

Feb 25, 2025
Via arXiv

Tensor Product Attention Is All You Need

Jan 11, 2025
Via arXiv

MARS: Unleashing the Power of Variance Reduction for Training Large Models

Nov 15, 2024
Via arXiv

T-Rex: Text-assisted Retrosynthesis Prediction

Jan 26, 2024
Via arXiv

UniRel: Unified Representation and Interaction for Joint Relational Triple Extraction

Nov 16, 2022
Via arXiv