Benchmarking


They Said Memes Were Harmless-We Found the Ones That Hurt: Decoding Jokes, Symbols, and Cultural References

Feb 03, 2026

Adaptive Evidence Weighting for Audio-Spatiotemporal Fusion

Feb 03, 2026

Enhancing Imbalanced Node Classification via Curriculum-Guided Feature Learning and Three-Stage Attention Network

Feb 03, 2026

FullStack-Agent: Enhancing Agentic Full-Stack Web Coding via Development-Oriented Testing and Repository Back-Translation

Feb 03, 2026

Context Compression via Explicit Information Transmission

Feb 03, 2026

CUBO: Self-Contained Retrieval-Augmented Generation on Consumer Laptops 10 GB Corpora, 16 GB RAM, Single-Device Deployment

Feb 03, 2026

Beyond Tokens: Semantic-Aware Speculative Decoding for Efficient Inference by Probing Internal States

Feb 03, 2026

Data-Driven Graph Filters via Adaptive Spectral Shaping

Feb 03, 2026

OCRTurk: A Comprehensive OCR Benchmark for Turkish

Feb 03, 2026

Rethinking the Reranker: Boundary-Aware Evidence Selection for Robust Retrieval-Augmented Generation

Feb 03, 2026