Tatsunori Hashimoto

Trustless Audits without Revealing Data or Models

Apr 06, 2024
Suppakit Waiwitlikhit, Ion Stoica, Yi Sun, Tatsunori Hashimoto, Daniel Kang

Linguistic Calibration of Language Models

Mar 30, 2024
Neil Band, Xuechen Li, Tengyu Ma, Tatsunori Hashimoto

A Survey on Data Selection for Language Models

Mar 08, 2024
Alon Albalak, Yanai Elazar, Sang Michael Xie, Shayne Longpre, Nathan Lambert, Xinyi Wang, Niklas Muennighoff, Bairu Hou, Liangming Pan, Haewon Jeong, Colin Raffel, Shiyu Chang, Tatsunori Hashimoto, William Yang Wang

Language Models with Conformal Factuality Guarantees

Feb 15, 2024
Christopher Mohri, Tatsunori Hashimoto

Stochastic Amortization: A Unified Approach to Accelerate Feature and Data Attribution

Jan 29, 2024
Ian Covert, Chanwoo Kim, Su-In Lee, James Zou, Tatsunori Hashimoto

On the Learnability of Watermarks for Language Models

Dec 07, 2023
Chenchen Gu, Xiang Lisa Li, Percy Liang, Tatsunori Hashimoto

Removing RLHF Protections in GPT-4 via Fine-Tuning

Nov 10, 2023
Qiusi Zhan, Richard Fang, Rohan Bindu, Akul Gupta, Tatsunori Hashimoto, Daniel Kang

MoCa: Measuring Human-Language Model Alignment on Causal and Moral Judgment Tasks

Oct 31, 2023
Allen Nie, Yuhui Zhang, Atharva Amdekar, Chris Piech, Tatsunori Hashimoto, Tobias Gerstenberg
