Yew Ken Chia

Auto Arena of LLMs: Automating LLM Evaluations with Agent Peer-battles and Committee Discussions

May 30, 2024

PuzzleVQA: Diagnosing Multimodal Reasoning Challenges of Language Models with Abstract Visual Patterns

Mar 20, 2024

Contrastive Chain-of-Thought Prompting

Nov 15, 2023

Flacuna: Unleashing the Problem Solving Power of Vicuna using FLAN Fine-Tuning

Jul 05, 2023

INSTRUCTEVAL: Towards Holistic Evaluation of Instruction-Tuned Large Language Models

Jun 15, 2023

M3Exam: A Multilingual, Multimodal, Multilevel Benchmark for Examining Large Language Models

Jun 08, 2023

Domain-Expanded ASTE: Rethinking Generalization in Aspect Sentiment Triplet Extraction

May 23, 2023

Chain of Knowledge: A Framework for Grounding Large Language Models with Structured Knowledge Bases

May 22, 2023

A Dataset for Hyper-Relational Extraction and a Cube-Filling Approach

Nov 18, 2022

RelationPrompt: Leveraging Prompts to Generate Synthetic Data for Zero-Shot Relation Triplet Extraction

Mar 17, 2022