Graham Neubig

Carnegie Mellon University

Instruction-tuned Language Models are Better Knowledge Learners

Feb 20, 2024
Via arXiv

Everybody Prune Now: Structured Pruning of LLMs with only Forward Passes

Feb 09, 2024
Via arXiv

Can Large Language Models be Trusted for Evaluation? Scalable Meta-Evaluation of LLMs as Evaluators via Agent Debate

Jan 30, 2024
Via arXiv

VisualWebArena: Evaluating Multimodal Agents on Realistic Visual Web Tasks

Jan 24, 2024
Via arXiv

TroVE: Inducing Verifiable and Efficient Toolboxes for Solving Programmatic Tasks

Jan 23, 2024
Via arXiv

Fine-grained Hallucination Detection and Editing for Language Models

Jan 17, 2024
Via arXiv

An In-depth Look at Gemini's Language Abilities

Dec 24, 2023
Via arXiv

Alignment for Honesty

Dec 12, 2023
Via arXiv

Multitask Learning Can Improve Worst-Group Outcomes

Dec 05, 2023
Via arXiv

Program-Aided Reasoners Know What They Know

Nov 16, 2023
Via arXiv