Bernd Bohnet

Exploring and Benchmarking the Planning Capabilities of Large Language Models

Jun 18, 2024

Long-Span Question-Answering: Automatic Question Generation and QA-System Ranking via Side-by-Side Evaluation

May 31, 2024

Many-Shot In-Context Learning

Apr 17, 2024

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Mar 08, 2024

A Chain-of-Thought Is as Strong as Its Weakest Link: A Benchmark for Verifiers of Reasoning Chains

Feb 02, 2024

Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models

Dec 22, 2023

Frontier Language Models are not Robust to Adversarial Arithmetic, or "What do I need to say so you agree 2+2=5?"

Nov 15, 2023

A Comprehensive Evaluation of Tool-Assisted Generation Strategies

Oct 16, 2023

Attributed Question Answering: Evaluation and Modeling for Attributed Large Language Models

Dec 15, 2022

Coreference Resolution through a seq2seq Transition-Based System

Nov 22, 2022