Sophia Ananiadou

Ebisu: Benchmarking Large Language Models in Japanese Finance

Feb 01, 2026
Via arXiv

XCR-Bench: A Multi-Task Benchmark for Evaluating Cultural Reasoning in LLMs

Jan 20, 2026
Via arXiv

Locate, Steer, and Improve: A Practical Survey of Actionable Mechanistic Interpretability in Large Language Models

Jan 20, 2026
Via arXiv

MisSpans: Fine-Grained False Span Identification in Cross-Domain Fake News

Jan 08, 2026
Via arXiv

All That Glisters Is Not Gold: A Benchmark for Reference-Free Counterfactual Financial Misinformation Detection

Jan 08, 2026
Via arXiv

RAAR: Retrieval Augmented Agentic Reasoning for Cross-Domain Misinformation Detection

Jan 08, 2026
Via arXiv

Implicit Graph, Explicit Retrieval: Towards Efficient and Interpretable Long-horizon Memory for Large Language Models

Jan 06, 2026
Via arXiv

MentraSuite: Post-Training Large Language Models for Mental Health Reasoning and Assessment

Dec 16, 2025
Via arXiv

FinCriticalED: A Visual Benchmark for Financial Fact-Level OCR Evaluation

Nov 19, 2025
Via arXiv

Semantic Label Drift in Cross-Cultural Translation

Oct 29, 2025
Via arXiv