Moshe Lavee

Automatic Detection of Complex Quotation Patterns in Aggadic Literature

Dec 29, 2025

Hadar Miller, Tsvi Kuflik, Moshe Lavee

Abstract: This paper presents ACT (Allocate Connections between Texts), a novel three-stage algorithm for the automatic detection of biblical quotations in Rabbinic literature. Unlike existing text reuse frameworks that struggle with short, paraphrased, or structurally embedded quotations, ACT combines a morphology-aware alignment algorithm with a context-sensitive enrichment stage that identifies complex citation patterns such as "Wave" and "Echo" quotations. Our approach was evaluated against leading systems, including Dicta, Passim, and Text-Matcher, as well as human-annotated critical editions. We further assessed three ACT configurations to isolate the contribution of each component. Results demonstrate that the full ACT pipeline (ACT-QE) outperforms all baselines, achieving an F1 score of 0.91, with superior Recall (0.89) and Precision (0.94). Notably, ACT-2, which lacks stylistic enrichment, achieves higher Recall (0.90) but suffers in Precision, while ACT-3, using longer n-grams, offers a tradeoff between coverage and specificity. In addition to improving quotation detection, ACT's ability to classify stylistic patterns across corpora opens new avenues for genre classification and intertextual analysis. This work contributes to digital humanities and computational philology by addressing the methodological gap between exhaustive machine-based detection and human editorial judgment. ACT lays a foundation for broader applications in historical textual analysis, especially in morphologically rich and citation-dense traditions like Aggadic literature.
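As a quick consistency check on the figures reported above, the F1 score can be recomputed from the stated Precision and Recall using the standard harmonic-mean definition. This is only an illustrative sketch: the function name `f1_score` is our own and is not taken from the paper or from ACT's code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values for the full pipeline (ACT-QE) as reported in the abstract.
act_qe_f1 = f1_score(precision=0.94, recall=0.89)
print(round(act_qe_f1, 2))  # 0.91, matching the reported F1
```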

* This paper is under review at Cogent Arts & Humanities

Style Classification of Rabbinic Literature for Detection of Lost Midrash Tanhuma Material

Nov 17, 2022

Shlomo Tannor, Nachum Dershowitz, Moshe Lavee

Abstract: Midrash collections are complex rabbinic works that consist of text in multiple languages, which evolved through long processes of unstable oral and written transmission. Determining the origin of a given passage in such a compilation is not always straightforward and is often a matter of dispute among scholars, yet it is essential for scholars' understanding of the passage and its relationship to other texts in the rabbinic corpus. To help solve this problem, we propose a system for classification of rabbinic literature based on its style, leveraging recently released pretrained Transformer models for Hebrew. Additionally, we demonstrate how our method can be applied to uncover lost material from Midrash Tanhuma.
