the social life of early medieval canonical collections

Category: Digital Humanities

Workshop: Ruse of reuse. Detecting text-similarity with AI in historical sources

The SOLEMNE project is co-organising this workshop on text-reuse detection to be held at the Austrian Academy of Sciences on 5-6 March 2026.

Thanks to the many ongoing digitisation projects, the number of texts available in various corpora has increased immensely, and the success of HTR (handwritten text recognition) will only add to it. The need to make good use of this material grows accordingly. The ability to identify text reuse and measure text similarity is thus more important than ever, and it could reveal connections never seen before. AI promises to make this possible.
This small two-day workshop is a continuation of the event held in 2023, Finding Connections: Using AI and DNA Sequencing to Find Similarities and Parallels in Medieval Texts, with an extended focus on historical sources in general.
March 5th will be devoted to presentations of research in two sessions. In the afternoon, there will be an associated KIMAFO by William Mattingly, open to the general public.
March 6th will be more informal, in the true spirit of a workshop. Two morning sessions will be devoted to work-in-progress reports, plans for implementing AI in various projects, experimental approaches to the problem, and discussions. In the afternoon, Martin Roček and Gleb Schmidt will lead a practical workshop on the use of various AI approaches.

Published: “Biblicality” of Early Medieval Canon Law

Using language modeling and the information-theoretic concept of perplexity, Friederike, Gleb, and Sven explore the influence of the Bible on early medieval canon law. We demonstrate that the perplexity of a corpus of canon law under a language model trained exclusively on Scripture can serve as a reliable proxy for assessing the overall linguistic similarity (stylistic, semantic, and syntactic) of canon law to the Bible. Our paper, presented at the conference on Computational Humanities Research 2025 (Luxembourg), reports the measured “biblicality” of various canon law texts and traces its chronological development. It also examines what higher or lower “biblicality” means linguistically and stylistically, by observing correlations between perplexity under the biblical language model and various linguistic features of canon law texts, which can be extracted from rich morpho-syntactic annotation. We hypothesise that changes in the levels of “biblicality”, clearly observable across the chronological subdivisions of the corpus, suggest that the imitation of scriptural language may have been a deliberate strategy reflecting evolving views on the role and place of Scripture in legislation. Further research is needed to trace in more detail the connection between the authoritative status of canon law collections and the use of the Bible.
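
For readers unfamiliar with perplexity, the idea can be illustrated with a toy example. The sketch below is not the model used in the paper: it assumes a simple bigram language model with add-one smoothing, whitespace tokenisation, and a one-verse "training corpus" (John 1:1 in the Vulgate). The class name `BigramModel`, the helper `tokenize`, and the second test phrase (an invented, canon-law-flavoured formula) are illustrative assumptions only. Lower perplexity means the text is less "surprising" to the model, i.e. closer to the language it was trained on.

```python
import math
from collections import Counter


def tokenize(text):
    """Very rough tokenisation: lowercase and split on whitespace."""
    return text.lower().split()


class BigramModel:
    """Toy bigram language model with add-one (Laplace) smoothing."""

    def __init__(self, training_text):
        tokens = ["<s>"] + tokenize(training_text)
        self.unigrams = Counter(tokens)
        self.bigrams = Counter(zip(tokens, tokens[1:]))
        # One extra slot stands in for any word never seen in training.
        self.vocab_size = len(self.unigrams) + 1

    def prob(self, prev, word):
        """Smoothed estimate of P(word | prev)."""
        return (self.bigrams[(prev, word)] + 1) / (
            self.unigrams[prev] + self.vocab_size
        )

    def perplexity(self, text):
        """exp of the average negative log-probability per token."""
        tokens = ["<s>"] + tokenize(text)
        n = len(tokens) - 1
        if n == 0:
            raise ValueError("cannot compute perplexity of an empty text")
        log_prob = sum(
            math.log(self.prob(prev, word))
            for prev, word in zip(tokens, tokens[1:])
        )
        return math.exp(-log_prob / n)


# Hypothetical mini-corpus: a single Vulgate verse as the "biblical" model.
bible_model = BigramModel("in principio erat verbum et verbum erat apud deum")

biblical_like = "verbum erat apud deum"
# Invented formulaic phrase, used only as a non-biblical contrast.
non_biblical = "placuit sancto concilio statuere"

ppl_biblical = bible_model.perplexity(biblical_like)
ppl_other = bible_model.perplexity(non_biblical)
print(f"biblical-like: {ppl_biblical:.2f}, other: {ppl_other:.2f}")
```

In this toy setting, the phrase that reuses scriptural wording receives the lower perplexity. A real study would of course need a far larger training corpus, proper tokenisation or lemmatisation for Latin, and a stronger language model.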

The 'Maison du Savoir' of the University of Luxembourg, venue of CHR 2025


