Shuyang Shi

Navigating Noisy Feedback: Enhancing Reinforcement Learning with Error-Prone Language Models

Oct 22, 2024

Muhan Lin, Shuyang Shi, Yue Guo, Behdad Chalaki, Vaishnav Tadiparthi, Ehsan Moradi Pari, Simon Stepputtis, Joseph Campbell, Katia Sycara

Abstract: The correct specification of reward models is a well-known challenge in reinforcement learning. Hand-crafted reward functions often lead to inefficient or suboptimal policies and may not be aligned with user values. Reinforcement learning from human feedback is a successful technique that can mitigate such issues; however, collecting human feedback can be laborious. Recent works have solicited feedback from pre-trained large language models rather than humans to reduce or eliminate human effort, but these approaches yield poor performance in the presence of hallucination and other errors. This paper studies the advantages and limitations of reinforcement learning from large language model feedback and proposes a simple yet effective method for soliciting and applying feedback as a potential-based shaping function. We theoretically show that inconsistent rankings, which approximate ranking errors, lead to uninformative rewards with our approach. Our method empirically improves convergence speed and policy returns over commonly used baselines even with significant ranking errors, and eliminates the need for complex post-processing of reward functions.

* 13 pages, 8 figures, The 2024 Conference on Empirical Methods in Natural Language Processing
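
The abstract above describes applying feedback as a potential-based shaping function. As a rough illustration only, the sketch below uses the standard shaping form F(s, s') = gamma * phi(s') - phi(s). The way the potential phi is built from pairwise rankings here (average net wins per state) is a hypothetical stand-in chosen for brevity, not the paper's procedure, and the names `potentials_from_rankings` and `shaping_term` are invented for this example.

```python
from collections import defaultdict


def potentials_from_rankings(rankings):
    """Hypothetical potential: average net wins per state.

    `rankings` is a list of (preferred_state, other_state) pairs.
    This scoring rule is an assumption made for illustration.
    """
    score = defaultdict(float)
    count = defaultdict(int)
    for winner, loser in rankings:
        score[winner] += 1.0
        score[loser] -= 1.0
        count[winner] += 1
        count[loser] += 1
    return {s: score[s] / count[s] for s in score}


def shaping_term(phi, s, s_next, gamma=0.99):
    """Standard potential-based shaping term: gamma * phi(s') - phi(s)."""
    return gamma * phi[s_next] - phi[s]


# Consistent feedback: state "b" is always ranked above state "a".
consistent = [("b", "a")] * 4
# Inconsistent feedback: the ranking flips on every other query.
inconsistent = [("b", "a"), ("a", "b")] * 2

phi_c = potentials_from_rankings(consistent)
phi_i = potentials_from_rankings(inconsistent)

bonus_consistent = shaping_term(phi_c, "a", "b")
bonus_inconsistent = shaping_term(phi_i, "a", "b")
```

Under this toy scoring rule, consistent rankings give the transition from "a" to "b" a positive bonus, while rankings that contradict each other cancel out and leave a zero potential for both states. That mirrors, in a very simplified way, the abstract's claim that inconsistent rankings lead to uninformative rewards.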


Predicting pregnancy using large-scale data from a women's health tracking mobile application

Dec 05, 2018

Bo Liu, Shuyang Shi, Yongshang Wu, Daniel Thomas, Laura Symul, Emma Pierson, Jure Leskovec

Abstract: Predicting pregnancy has been a fundamental problem in women's health for more than 50 years. Previous datasets have been collected via carefully curated medical studies, but the recent growth of women's health tracking mobile apps offers potential for reaching a much broader population. However, the feasibility of predicting pregnancy from mobile health tracking data is unclear. Here we develop four models (a logistic regression model and three LSTM models) to predict a woman's probability of becoming pregnant using data from a women's health tracking app, Clue by BioWink GmbH. Evaluating our models on a dataset of 79 million logs from 65,276 women with ground truth pregnancy test data, we show that our predicted pregnancy probabilities meaningfully stratify women: women in the top 10% of predicted probabilities have an 88% chance of becoming pregnant over 6 menstrual cycles, as compared to a 30% chance for women in the bottom 10%. We develop an intuitive technique for extracting interpretable time trends from our deep learning models, and show these trends are consistent with previous fertility research. Our findings illustrate that women's health tracking data offers potential for predicting pregnancy on a broader population; we conclude by discussing the steps needed to fulfill this potential.

* An earlier version of this paper was presented at the 2018 NeurIPS ML4H Workshop
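
The stratification result in the abstract above (outcome rates in the top and bottom 10% of predicted probabilities) can be computed along the following lines. This is a minimal sketch on synthetic toy data, not the paper's data or evaluation code; the function name `decile_rates` and the inputs are assumptions made for illustration.

```python
def decile_rates(probs, outcomes):
    """Return (top_rate, bottom_rate): observed outcome rates among the
    highest and lowest 10% of predicted probabilities."""
    order = sorted(range(len(probs)), key=lambda i: probs[i])
    k = max(1, len(probs) // 10)  # size of one decile, at least one item
    bottom, top = order[:k], order[-k:]

    def rate(indices):
        return sum(outcomes[i] for i in indices) / len(indices)

    return rate(top), rate(bottom)


# Synthetic example: 20 predictions, with positives only in the upper half.
probs = [i / 20 for i in range(20)]
outcomes = [1 if i >= 10 else 0 for i in range(20)]

top_rate, bottom_rate = decile_rates(probs, outcomes)
```

A large gap between `top_rate` and `bottom_rate` indicates that the predicted probabilities separate the population in a meaningful way, which is the sense in which the abstract reports 88% versus 30%.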
