Colin Swaelens, Similarity Detection: A Starting Point for Greek

Abstract

Before the advent of printing, ancient literature survived thanks to scribes painstakingly copying texts from one manuscript to another. Occasionally, these scribes added metrical paratexts to the manuscripts, i.e. texts standing next to the main text (Genette, 1987), which Lauxtermann (2003) introduced into Byzantine scholarship as book epigrams. Ghent University’s Database of Byzantine Book Epigrams (Ricceri et al., 2023) stores more than 12,000 such epigrams as verbatim transcriptions, precisely as they are found in the manuscripts. As a result, the Greek of these epigrams is interspersed with orthographic inconsistencies, mainly due to phonetic changes such as itacism. These verbatim transcriptions are called occurrences and are grouped under one or more so-called types; a type is a readable representation of its occurrences in standardised, classical Greek. Eventually, we aim to develop a dynamic system that groups hemistichs, verses and epigrams based on distinct similarity measures, so that scholars can find all kinds of similar texts instead of only the ones that come to mind. As with any other algorithm, evaluation is an essential part of developing those similarity measures. However, a gold standard for the evaluation of verse similarity measures does not exist. At this point, we have already conducted a pilot study on pairwise annotation of 2 verses with 10 annotators. Each verse was set alongside six pairs of verses, of which the annotator had to mark the one they considered most similar. The inter-annotator agreement (IAA) was 57.69%, which is seen as moderate agreement (Landis & Koch, 1977). This agreement score is the arithmetic mean of the agreement between each pair of annotators, as all annotators annotated the exact same set of verses. Despite the rather modest size of this pilot study, it is possible to unravel the distinct lines of reasoning of the annotators.
They did not receive detailed instructions for the annotation process, so every annotator was free to choose their own focal point. The most remarkable of those focal points was the metre. One of the annotators based their judgement on the number of syllables in a verse. The majority, however, seemed to take syntax as the decisive factor in determining the most similar verse; semantics was decisive only if the syntax of both options was identical. While the gold standard is being annotated, we have already started computing similarity between words. These similarities will, in a next stage, be used to compute similarity between (half) verses. The main goal of the experiment is to find out whether transformer embeddings take enough context into account to find identical or similar words with deviant orthography.
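
For readers who want to see how an agreement score of this kind can be computed, the following is a minimal sketch of the arithmetic mean of pairwise agreement described in the abstract. It assumes that each annotator's choices are stored as a list in the same item order; the function names and the toy data are hypothetical and do not reproduce the pilot study or the authors' own code.

```python
from itertools import combinations


def pairwise_agreement(a, b):
    """Share of items on which two annotators chose the same option."""
    if len(a) != len(b) or not a:
        raise ValueError("annotators must label the same non-empty set of items")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def mean_pairwise_agreement(annotations):
    """Arithmetic mean of the agreement over every pair of annotators."""
    pairs = list(combinations(annotations.values(), 2))
    if not pairs:
        raise ValueError("at least two annotators are required")
    return sum(pairwise_agreement(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy data (not from the pilot study):
# three annotators, four items, candidate verses labelled "A" to "C".
toy = {
    "annotator_1": ["A", "B", "C", "A"],
    "annotator_2": ["A", "B", "A", "A"],
    "annotator_3": ["B", "B", "C", "A"],
}
print(f"{mean_pairwise_agreement(toy):.2%}")  # prints 66.67%
```

In the toy data the three annotator pairs agree on 3/4, 3/4 and 2/4 of the items, so the mean is 2/3. Raw percentage agreement of this kind is not corrected for chance.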

Practical information

This lecture will be given at the ‘Computational Approaches to Ancient Greek and Latin Workshop’, organised by KU Leuven and the University of Groningen. This workshop series started in 2021 with the aim of further exploring the potential of computational approaches (Natural Language Processing) applied to Ancient Greek and Latin. The 2024 edition will be held in hybrid format on November 28th and 29th, 2024.

Date & time: Friday 29 November 2024, 13:45-14:30

Location: KU Leuven: Mgr. Sencie Instituut (Erasmusplein 2, 3000 Leuven, Belgium) & online

Register via this link. Registration for in-person attendance is no longer possible. The deadline for registration for online attendance is 27 November 2024.

More information about this conference and the full programme can be found here.

Kristoffel Demoen, Kyriaki Giannikou & Colin Swaelens, The Database of Byzantine Book Epigrams. Paratextual Poems from the Margins of Medieval Manuscripts to a Searchable Digital Corpus

This lecture will be given at the 8th International Byzantine Seminar Lecture Series (2024) on “Digital Methods for Byzantine Studies”, organised by the Institute for the History of Ancient Civilizations at the Northeast Normal University in Changchun (China), in collaboration with the Department of Byzantine and Modern Greek Studies at the University of Cologne and the Department of Historical and Classical Studies at the Norwegian University of Science and Technology.

Date & time: Thursday 21 November 2024, 11:00 am (CET)

Location: online via Zoom

Registration is free, but required. The Zoom link will be provided upon registration. To register or for more information, send an email mentioning “IBSLS Registration” to liq762@hotmail.com.

LW Research Day 2024: poster session

The fourth LW Research Day will take place on Wednesday 27 November 2024, in the Ghent University Museum (GUM). The central theme is ‘From Source to Understanding’.

What is the role of interpretation in our journey from studying source material to scientific understanding? Indeed, that journey can never be devoid of interpretation, which, in many cases, serves as the quintessential bridge between source material and understanding, whether it pertains to a historical study based on ego documents, the archaeological perspective on the material culture of the past or the anthropological view of human behaviour. Not infrequently, interpretation itself becomes the object of research. For instance, translation scholars examine translation choices that result from interpretations. Literary and art scholars investigate works that themselves provide an interpretation of the world in which they originate and the world they create. Similarly, language itself reflects a particular understanding of the world in a historical and sociological sense, which linguists further explore. In times of digital humanities, the interpretation of (big) data by AI becomes not only conceivable but even the norm. What do interpretation and hermeneutics signify for our fields today? What constitutes a successful or legitimate interpretation, and what are the pitfalls of interpretation?

The PhD students of the DBBE team will each present a poster on their research project within the framework of the Database of Byzantine Book Epigrams.

  • Kyriaki Giannikou – Dealing with Building Blocks of Expression: Formulaic Elements & their Creative Variations in Byzantine Book Epigrams
  • Eleonora Lauro – Epigrams in Context: Glimpses into Medieval Southern Italian Book Culture

More information can be found on the LW Research Day website.

Kyriaki Giannikou, Navigating Digital Frontiers: Unveiling Formulaicity in Byzantine Book Epigrams

Abstract

Byzantine book epigrams, featuring as paratexts in manuscript margins, seamlessly intertwine poetic expression with practical details, illuminating aspects such as the manuscripts’ patrons and the identities of the scribes involved in transcription. Although deeply rooted in traditional book production practices and very formulaic in nature, these epigrams present noteworthy linguistic variation. While their formulaicity has been acknowledged, a thorough exploration of the formulaic sequences present in the Database of Byzantine Book Epigrams (DBBE) or similar corpora remains a gap in current research. My research, to be conducted on the well-established DBBE corpus, acts as a bridge between linguistic research on formulas inherent in everyday speech and those studied within the context of oral poetry.

This interdisciplinary project, adopting a corpus-driven approach, seeks to combine close reading with digital methods for navigating a vast corpus of Byzantine book epigrams. It addresses the challenge of identifying formulaic constructions (i.e. pairings of form and meaning in the context of Construction Grammar) that function as “verse building blocks” and their variation within a historical linguistic corpus that combines poetic expression and practical information. However, the digital journey of pattern identification encounters challenges arising from inherent complexities of Greek – from flexible syntax to extensive morphological variety – compounded by great linguistic variation across registers, ranging from Homeric and classicizing Greek to medieval forms interwoven with vernacular elements. The absence of critical texts for numerous epigrams further complicates matters, preserving the idiosyncrasies of original scribal choices on the one hand, but impeding uniformization for digital analysis on the other.

This presentation illuminates the challenges inherent in working on Byzantine paratextual material in a Digital Humanities context. The project behind it endeavours to unravel the intricate linguistic nuances within Byzantine book epigrams and is committed to a deeper understanding of the complexities at the intersection of Byzantine literature and Digital Humanities.

Practical information

This lecture will be given at the international workshop ‘The Impact of Digital Methods and Approaches on Ancient Studies Research’ (13-14 May 2024, Berlin).

Date & time: Monday 13 May 2024, 4:40 pm

Location: Freie Universität Berlin (Hittorfstraße 18, 14195 Berlin)

More information about this workshop and the full programme can be found here.

Eleonora Lauro, Alongside the Text: Byzantine Metrical Paratexts in Gospel Manuscripts from Medieval Southern Italy

Abstract

This paper aims to investigate the relationship between Byzantine metrical paratexts, also known as book epigrams, and the biblical text found in Gospel-Books and Lectionaries from medieval Southern Italy.

In the field of New Testament textual scholarship, recent years have witnessed an increased interest in aspects of manuscripts that extend beyond their textual content. Scholars now recognize that insights gained from studying scribal corrections and paratextual features help us to understand how texts were transmitted and received in historical contexts (Lanier-Han, 2021).

Despite this considerable shift in New Testament studies, Byzantine book epigrams and their affiliation to the biblical text remain an intriguing and less-explored domain. These paratexts represent an interesting research object for philologists and historians studying the manuscript tradition of the Greek New Testament. Often copied alongside the main text, book epigrams can help to establish genealogies between manuscripts. Moreover, they offer relevant information on the communities writing and reading these books.

Specifically, my research will consider the following questions:

  1. What kind of book epigrams can be found in Gospel-Books and Lectionaries produced in medieval Southern Italy? Are they original compositions or just conventional formulas?
  2. Do the metrical paratexts reveal specific regional and cultural influences? And how do they differ from Gospel-Books and Lectionaries from other regions?
  3. Are there thematic correlations between book epigrams and biblical text?
  4. Which reading strategies do the book epigrams prescribe?
  5. What is the relation between the chain of transmission of the metrical paratexts and that of the main texts?

This study will focus on a corpus of Byzantine book epigrams found in a selected group of Gospel-Books and Lectionaries produced in Southern Italy (10th-13th century). The combination of cultural exploration and examination of textual and extratextual features presents a model for integrating various disciplines to enrich our understanding of New Testament manuscript tradition.

Practical information

This lecture will be given at the international CSNTM Text & Manuscript Conference ‘Intersection. Interdisciplinary Approaches to New Testament Text and Manuscript Studies’, organised by the Center for the Study of New Testament Manuscripts in Plano (Texas).

The “Intersection” theme aims to explore how the many disciplines of the study of ancient Christian documents (paleography, art history, exegesis, paratext, linguistics, conservation, etc.) collaborate to help us better understand their content.

Date & time: Thursday 30 May 2024, 10:55 pm

Location: The Marriott at Legacy Town Center (7121 Bishop Rd Plano, TX 75024)

More information about this conference and the full programme can be found here.

Data-driven Approaches to Ancient Languages (DAAL)

On Thursday 27 June 2024, the Database of Byzantine Book Epigrams project (DBBE) is organising a workshop on Data-driven Approaches to Ancient Languages (DAAL) in Ghent, Belgium. This workshop will follow immediately after the conference “Paratexts in Premodern Writing Cultures”.

Premodern or historically attested languages are invaluable resources for the study of both diachronic linguistics and their contemporary culture. Although these languages might belong to various language families or use different scripts, researchers face common challenges, among them illegible or lost texts (or parts of texts), non-existent gold standards and, very importantly these days, scarcity of data. Luckily, more and more texts are becoming available, but the language of those texts might be so different from its modern counterpart (should such a counterpart exist) that it considerably impacts the performance of existing tools. This workshop aims to provide a platform for a broad field of researchers engaged in digital approaches to premodern languages.

For all further information, please visit the conference website: https://www.dbbe2024.ugent.be/workshop/.
For any additional questions you may have, please contact the organisers at daal2024@ugent.be.

Maxime Deforche, Ilse De Vos, Antoon Bronselaer & Guy De Tré, An Orthographic Similarity Measure for Graph-Based Text Representations

This presentation will be given at the Dutch-Belgian DataBase Day (DBDBD), a yearly one-day workshop on database research, organized at a Belgian or Dutch university. DBDBD 2023 will be held in Ghent, Belgium.

At DBDBD 2023, junior and senior researchers from the Netherlands and Belgium can present their recent results, and meet fellow researchers in the field of data management. It is an excellent opportunity to meet up with your Belgian/Dutch colleagues, and to get informed about the (recent) database-related research performed in Belgian/Dutch universities. The workshop welcomes non-Belgian/Dutch participants (presentations are in English). DBDBD has a tradition of favouring presentations by junior researchers.

Practical information

Date & time: Thursday 21 December 2023, 10:30am

Location: Technicum (building T2) (Sint-Pietersnieuwstraat 41, 9000 Gent)

More information about this workshop and the full programme can be found here.

Crash Course in Greek Palaeography

The Leiden University Centre for the Arts in Society, Leiden University Library and the Greek department of Ghent University offer a two-day course in Greek palaeography in collaboration with the Research School OIKOS. The course is intended for MA, ResMA and doctoral students in the areas of Classics, Ancient History, Ancient Civilizations and Medieval studies with a good command of Greek. It offers a chronological introduction to Greek palaeography from the Hellenistic period until the end of the Middle Ages and is specifically aimed at acquiring practical skills for research involving literary and documentary papyri and/or manuscripts. This course offers a unique opportunity to practice reading original papyri and manuscripts from the collection of the Leiden Papyrological Institute and the special collections of the Leiden University Library.

Programme

The course is set up as an intensive two-day seminar. Five lectures by specialists in the field will give a chronological overview of the development of Greek handwriting, each followed by a practice session reading relevant extracts from papyri and manuscripts in smaller groups under the supervision of young researchers.

Monday, May 27

  • 10:00 Introduction
  • 10:15-11:15 Papyri of the Ptolemaic and Roman period (3rd cent. BCE – 3rd cent. CE) (Dr. Joanne Stolk)
  • 11:15-12:30 Practice with papyri of the Ptolemaic and Roman period
  • 12:30-13:30 Lunch break
  • 13:30-14:30 Papyri of the Byzantine period (4th-8th centuries) (Dr. Yasmine Amory)
  • 14:30-15:45 Practice with papyri of the Byzantine period
  • 15:45-16:15 Coffee break
  • 16:15-17:00 Presentation of Greek manuscripts from the Leiden University Library
  • 17:00-17:45 Presentation of Greek papyri from the Leiden Papyrological Institute
  • 19:00 Dinner

Tuesday, May 28

  • 9:00-10:00 Majuscule and early minuscule bookhands (4th-9th centuries) (Dr. Rachele Ricceri)
  • 10:00-11:15 Practice with majuscule and early minuscule bookhands
  • 11:15-11:45 Coffee break
  • 11:45-12:45 The development of minuscule script (10th-12th centuries) (Prof. dr. Floris Bernard)
  • 12:45-13:45 Lunch break
  • 13:45-15:00 Practice with minuscule script of the 10th-12th centuries
  • 15:00-15:30 Coffee break
  • 15:30-16:30 Manuscripts and scholars of the Palaeologan period (13th-15th centuries) (Prof. dr. Andrea Cuomo)
  • 16:30-17:45 Practice with manuscripts of the Palaeologan period

Practical information

The study load is the equivalent of 2 ECTS (2×28 hours). Participants will be asked to read up on secondary literature in preparation for the seminar (distributed several weeks before the course). Extra material will be handed out during the course so that you can continue to improve your reading skills afterwards.

There are no fees for participation in this course. Lunches on both days and dinner on the first day are provided free of charge. Travel costs and accommodation in Leiden are at your own expense.

Registration

Please register by sending an e-mail with a short motivation (ca. 300 words, including your background, research interests and why you would like to follow this course) to yasmine.amory@ugent.be. Priority is given to OIKOS doctoral students and those who have not previously had the opportunity to follow a course on palaeography. The final deadline for registration is February 15th, 2024. Successful applicants will be notified soon afterwards.

Rachele Ricceri, The Database of Byzantine Book Epigrams: Getting People In and Out Again

This lecture will be given at the PROSOPON Workshop ‘Entangled Prosopographies: Connecting the “Prosopographies of the Later Roman and Byzantine Worlds” Across the Eastern Mediterranean and Beyond’ (The University of Edinburgh, 8-9 December 2023). It is part of Round Table 2: ‘Archives and Manuscripts’.

The workshop brings together a large number of current prosopographical research projects with a focus from the late antique to the late Byzantine periods and is dedicated to exploring ways of going forward, connecting projects and researchers. It offers ample opportunity to discuss the methods and practices of prosopographical research, to learn from each other, and develop closer ties of cooperation.

Practical information

Date & time: Friday 8 December 2023, 1:30pm

Location: Meadows Lecture Theatre, Old Medical School, Doorway 4 (Teviot Place, Edinburgh)

More information about this conference and the full programme can be found here.

Paratexts in Premodern Writing Cultures

The Database of Byzantine Book Epigrams project (DBBE) will organise a conference on “Paratexts in Premodern Writing Cultures”, which will take place in Ghent on 24-26 June 2024. 

With this conference we aim to bring together scholars engaged in the exploration of premodern paratexts transmitted in a variety of languages (such as Arabic, Armenian, Greek, Coptic, Hebrew, Latin, Slavonic, Syriac). It is our aim to discuss the nature of paratextuality in medieval manuscripts, to reveal similarities and peculiarities of paratexts across language borders, and to understand the broader cultural and historical ramifications of paratexts. We are interested both in the textual evidence of medieval paratexts and in their material transmission.

For all further information, please visit the conference website: https://www.dbbe2024.ugent.be/.
For any additional questions you may have, please contact the organisers at dbbe@ugent.be.