Carlo Siebenschuh

HiPerRAG: High-Performance Retrieval Augmented Generation for Scientific Insights

May 07, 2025
Via arXiv

AdaParse: An Adaptive Parallel PDF Parsing and Resource Scaling Engine

Apr 23, 2025
Via arXiv

Connecting Large Language Model Agent to High Performance Computing Resource

Feb 17, 2025
Via arXiv

LSHBloom: Memory-efficient, Extreme-scale Document Deduplication

Nov 06, 2024
Via arXiv

WordScape: a Pipeline to extract multilingual, visually rich Documents with Layout Annotations from Web Crawl Data

Dec 15, 2023
Via arXiv