Tushar Khot

DiscoveryBench: Towards Data-Driven Discovery with Large Language Models

Jul 01, 2024

DISCOVERYWORLD: A Virtual Environment for Developing and Evaluating Automated Scientific Discovery Agents

Jun 10, 2024

Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning

Jun 10, 2024

OLMo: Accelerating the Science of Language Models

Feb 07, 2024

ADaPT: As-Needed Decomposition and Planning with Language Models

Nov 08, 2023

Bias Runs Deep: Implicit Reasoning Biases in Persona-Assigned LLMs

Nov 08, 2023

How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources

Jun 07, 2023

Chain-of-Thought Hub: A Continuous Effort to Measure Large Language Models' Reasoning Performance

May 26, 2023

Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback

May 17, 2023

Specializing Smaller Language Models towards Multi-Step Reasoning

Jan 30, 2023