Yanqiu Wu

A Survey on Progress in LLM Alignment from the Perspective of Reward Design

May 05, 2025

Radio Signal Classification by Adversarially Robust Quantum Machine Learning

Dec 13, 2023

Quantum-Inspired Machine Learning: a Survey

Sep 08, 2023

Spatio-temporal Incentives Optimization for Ride-hailing Services with Offline Deep Reinforcement Learning

Nov 06, 2022

Aggressive Q-Learning with Ensembles: Achieving Both High Sample Efficiency and High Asymptotic Performance

Nov 17, 2021

BAIL: Best-Action Imitation Learning for Batch Deep Reinforcement Learning

Oct 27, 2019

Towards Simplicity in Deep Reinforcement Learning: Streamlined Off-Policy Learning

Oct 10, 2019