Yuling Gu

OLMES: A Standard for Language Model Evaluations

Jun 12, 2024

WorldValuesBench: A Large-Scale Benchmark Dataset for Multi-Cultural Value Awareness of Language Models

Apr 25, 2024

PROC2PDDL: Open-Domain Planning Representations from Texts

Feb 29, 2024

OLMo: Accelerating the Science of Language Models

Feb 07, 2024

Digital Socrates: Evaluating LLMs through explanation critiques

Nov 16, 2023

What Makes it Ok to Set a Fire? Iterative Self-distillation of Contexts and Rationales for Disambiguating Defeasible Social and Moral Situations

Nov 01, 2023

Do language models have coherent mental models of everyday things?

Dec 20, 2022

Measure More, Question More: Experimental Studies on Transformer-based Language Models and Complement Coercion

Dec 20, 2022

One Venue, Two Conferences: The Separation of Chinese and American Citation Networks

Nov 17, 2022

Just-DREAM-about-it: Figurative Language Understanding with DREAM-FLUTE

Oct 28, 2022