
Haidar Khan

Stream RAG: Instant and Accurate Spoken Dialogue Systems with Streaming Tool Usage

Oct 02, 2025
Via arXiv

ConfQA: Answer Only If You Are Confident

Jun 08, 2025
Via arXiv

ZeroSumEval: Scaling LLM Evaluation with Inter-Model Competition

Apr 17, 2025
Via arXiv

ALLaM: Large Language Models for Arabic and English

Jul 22, 2024
Via arXiv

A Systematic Survey and Critical Review on Evaluating Large Language Models: Challenges, Limitations, and Recommendations

Jul 04, 2024
Via arXiv

When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards

Feb 01, 2024
Via arXiv

Controlling the Extraction of Memorized Data from Large Language Models via Prompt-Tuning

May 19, 2023
Via arXiv

Low-Resource Compositional Semantic Parsing with Concept Pretraining

Jan 30, 2023
Via arXiv

AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model

Aug 03, 2022
Via arXiv

Alexa Teacher Model: Pretraining and Distilling Multi-Billion-Parameter Encoders for Natural Language Understanding Systems

Jun 15, 2022
Via arXiv