Guijin Son

The BiGGen Bench: A Principled Benchmark for Fine-grained Evaluation of Language Models with Language Models

Jun 09, 2024
Via arXiv

ESG Classification by Implicit Rule Learning via GPT-4

Mar 22, 2024
Via arXiv

Multi-Task Inference: Can Large Language Models Follow Multiple Instructions at Once?

Feb 18, 2024
Via arXiv

KMMLU: Measuring Massive Multitask Language Understanding in Korean

Feb 18, 2024
Via arXiv

HAE-RAE Bench: Evaluation of Korean Knowledge in Language Models

Sep 15, 2023
Via arXiv

Beyond Classification: Financial Reasoning in State-of-the-Art Language Models

Apr 30, 2023
Via arXiv

Removing Non-Stationary Knowledge From Pre-Trained Language Models for Entity-Level Sentiment Classification in Finance

Jan 25, 2023
Via arXiv