Hugh Zhang

Q-Probe: A Lightweight Approach to Reward Maximization for Language Models

Feb 22, 2024
Kenneth Li, Samy Jelassi, Hugh Zhang, Sham Kakade, Martin Wattenberg, David Brandfonbrener

Via arXiv

Easy as ABCs: Unifying Boltzmann Q-Learning and Counterfactual Regret Minimization

Feb 19, 2024
Luca D'Amico-Wong, Hugh Zhang, Marc Lanctot, David C. Parkes

Via arXiv

Chain-of-Thought Reasoning is a Policy Improvement Operator

Sep 15, 2023
Hugh Zhang, David C. Parkes

Via arXiv

Trading Off Diversity and Quality in Natural Language Generation

Apr 22, 2020
Hugh Zhang, Daniel Duckworth, Daphne Ippolito, Arvind Neelakantan

Via arXiv

Unifying Human and Statistical Evaluation for Natural Language Generation

Apr 04, 2019
Tatsunori B. Hashimoto, Hugh Zhang, Percy Liang

Via arXiv