Linyong Nan

On Evaluating the Integration of Reasoning and Action in LLM Agents with Database Question Answering

Nov 16, 2023
Via arXiv

DocMath-Eval: Evaluating Numerical Reasoning Capabilities of LLMs in Understanding Long Documents with Tabular Data

Nov 16, 2023
Via arXiv

RobuT: A Systematic Study of Table QA Robustness Against Human-Annotated Adversarial Perturbations

Jun 25, 2023
Via arXiv

Large Language Models are Effective Table-to-Text Generators, Evaluators, and Feedback Providers

May 24, 2023
Via arXiv

QTSumm: A New Benchmark for Query-Focused Table Summarization

May 23, 2023
Via arXiv

Enhancing Few-shot Text-to-SQL Capabilities of Large Language Models: A Study on Prompt Design Strategies

May 21, 2023
Via arXiv

LoFT: Enhancing Faithfulness and Diversity for Table-to-Text Generation via Logic Form Control

Feb 06, 2023
Via arXiv

Revisiting the Gold Standard: Grounding Summarization Evaluation with Robust Human Evaluation

Dec 15, 2022
Via arXiv

FOLIO: Natural Language Reasoning with First-Order Logic

Sep 02, 2022
Via arXiv

Leveraging Locality in Abstractive Text Summarization

May 25, 2022
Via arXiv