Manish Bhatt

Sid

Manifold of Failure: Behavioral Attraction Basins in Language Models

Feb 25, 2026
Via arXiv

Predictive Coding and Information Bottleneck for Hallucination Detection in Large Language Models

Jan 22, 2026
Via arXiv

Large Empirical Case Study: Go-Explore adapted for AI Red Team Testing

Jan 06, 2026
Via arXiv

MAIF: Enforcing AI Trust and Provenance with an Artifact-Centric Agentic Paradigm

Nov 19, 2025
Via arXiv

CYBERSECEVAL 3: Advancing the Evaluation of Cybersecurity Risks and Capabilities in Large Language Models

Aug 02, 2024
Via arXiv

The Llama 3 Herd of Models

Jul 31, 2024
Via arXiv

CyberSecEval 2: A Wide-Ranging Cybersecurity Evaluation Suite for Large Language Models

Apr 19, 2024
Via arXiv

Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts

Feb 26, 2024
Via arXiv

Purple Llama CyberSecEval: A Secure Coding Benchmark for Language Models

Dec 07, 2023
Via arXiv

Code Llama: Open Foundation Models for Code

Aug 25, 2023
Via arXiv