Erchin Serpedin

Halluverse-M^3: A multitask multilingual benchmark for hallucination in LLMs

Feb 06, 2026

SCALAR: Quantifying Structural Hallucination, Consistency, and Reasoning Gaps in Materials Foundation Models

Jan 29, 2026

C2NP: A Benchmark for Learning Scale-Dependent Geometric Invariances in 3D Materials Generation

Jan 27, 2026

Evaluating Multilingual and Code-Switched Alignment in LLMs via Synthetic Natural Language Inference

Aug 20, 2025

EMPATHIA: Multi-Faceted Human-AI Collaboration for Refugee Integration

Aug 11, 2025

Stress-Testing Multimodal Foundation Models for Crystallographic Reasoning

Jun 16, 2025

Theorem-of-Thought: A Multi-Agent Framework for Abductive, Deductive, and Inductive Reasoning in Language Models

Jun 08, 2025

Understanding the Capabilities of Molecular Graph Neural Networks in Materials Science Through Multimodal Learning and Physical Context Encoding

May 17, 2025

HalluVerse25: Fine-grained Multilingual Benchmark Dataset for LLM Hallucinations

Mar 10, 2025

SINdex: Semantic INconsistency Index for Hallucination Detection in LLMs

Mar 07, 2025