Nabeel Seedat

Guideline-Grounded Evidence Accumulation for High-Stakes Agent Verification

Mar 03, 2026
Via arXiv

Scales++: Compute Efficient Evaluation Subset Selection with Cognitive Scales Embeddings

Oct 30, 2025
Via arXiv

Beyond Pointwise Scores: Decomposed Criteria-Based Evaluation of LLM Responses

Sep 19, 2025
Via arXiv

Towards Human-Guided, Data-Centric LLM Co-Pilots

Jan 17, 2025
Via arXiv

Unlocking Historical Clinical Trial Data with ALIGN: A Compositional Large Language Model System for Medical Coding

Nov 20, 2024
Via arXiv

Self-Healing Machine Learning: A Framework for Autonomous Adaptation in Real-World Environments

Oct 31, 2024
Via arXiv

Matchmaker: Self-Improving Large Language Model Programs for Schema Matching

Oct 31, 2024
Via arXiv

Context-Aware Testing: A New Paradigm for Model Testing with Large Language Models

Oct 31, 2024
Via arXiv

You can't handle the (dirty) truth: Data-centric insights improve pseudo-labeling

Jun 19, 2024
Via arXiv

Relaxed Quantile Regression: Prediction Intervals for Asymmetric Noise

Jun 05, 2024
Via arXiv