Najoung Kim

Syn-QA2: Evaluating False Assumptions in Long-tail Questions with Synthetic QA Datasets

Mar 18, 2024
Ashwin Daswani, Rohan Sawant, Najoung Kim

Personas as a Way to Model Truthfulness in Language Models

Oct 30, 2023
Nitish Joshi, Javier Rando, Abulhair Saparov, Najoung Kim, He He

SLOG: A Structural Generalization Benchmark for Semantic Parsing

Oct 23, 2023
Bingzhi Li, Lucia Donatelli, Alexander Koller, Tal Linzen, Yuekun Yao, Najoung Kim

Reasoning or Reciting? Exploring the Capabilities and Limitations of Language Models Through Counterfactual Tasks

Aug 01, 2023
Zhaofeng Wu, Linlu Qiu, Alexis Ross, Ekin Akyürek, Boyuan Chen, Bailin Wang, Najoung Kim, Jacob Andreas, Yoon Kim

Inverse Scaling: When Bigger Isn't Better

Jun 15, 2023
Ian R. McKenzie, Alexander Lyzhov, Michael Pieler, Alicia Parrish, Aaron Mueller, Ameya Prabhu, Euan McLean, Aaron Kirtland, Alexis Ross, Alisa Liu, Andrew Gritsevskiy, Daniel Wurgaft, Derik Kauffman, Gabriel Recchia, Jiacheng Liu, Joe Cavanagh, Max Weiss, Sicong Huang, The Floating Droid, Tom Tseng, Tomasz Korbak, Xudong Shen, Yuhui Zhang, Zhengping Zhou, Najoung Kim, Samuel R. Bowman, Ethan Perez

BoardgameQA: A Dataset for Natural Language Reasoning with Contradictory Information

Jun 13, 2023
Mehran Kazemi, Quan Yuan, Deepti Bhatia, Najoung Kim, Xin Xu, Vaiva Imbrasaite, Deepak Ramachandran

Testing the General Deductive Reasoning Capacity of Large Language Models Using OOD Examples

May 24, 2023
Abulhair Saparov, Richard Yuanzhe Pang, Vishakh Padmakumar, Nitish Joshi, Seyed Mehran Kazemi, Najoung Kim, He He

Entity Tracking in Language Models

May 03, 2023
Najoung Kim, Sebastian Schuster

Reconstruction Probing

Dec 21, 2022
Najoung Kim, Jatin Khilnani, Alex Warstadt, Abed Qaddoumi
