Jonathan Herzig

Can Few-shot Work in Long-Context? Recycling the Context to Generate Demonstrations

Jun 19, 2024

TACT: Advancing Complex Aggregative Reasoning with Information Extraction Tools

Jun 05, 2024

Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?

May 09, 2024

Constructing Benchmarks and Interventions for Combating Hallucinations in LLMs

Apr 15, 2024

MiMiC: Minimally Modified Counterfactuals in the Representation Space

Feb 16, 2024

A Chain-of-Thought Is as Strong as Its Weakest Link: A Benchmark for Verifiers of Reasoning Chains

Feb 02, 2024

Multilingual Instruction Tuning With Just a Pinch of Multilinguality

Jan 08, 2024

A Comprehensive Evaluation of Tool-Assisted Generation Strategies

Oct 16, 2023

Evaluating and Modeling Attribution for Cross-Lingual Question Answering

May 23, 2023

What You See is What You Read? Improving Text-Image Alignment Evaluation

May 22, 2023