This page presents current and completed doctoral projects that are being or have been carried out with the participation of the Chair of Digital Humanities.

Current dissertation projects

Ariadne Baresch: La Recherche selon Albertine Simonet : réécriture d’un temps fugitif

Marcel Proust's novel cycle À la Recherche du temps perdu (1913-1927) is primarily known for its themes of memory and time, as well as the Madeleine scene, which acts as an emblematic nexus between these two subjects. The complexity that constitutes the oeuvre and its readings is reflected in the countless studies that have dealt with the "étoffe proustienne" over the course of a century.

The plot, told from the perspective of a first-person narrator, covers a wide range of themes but foregrounds one character in particular: the narrator's lover, Albertine Simonet. She is the most frequently mentioned character in the novel, and her enigmatically drawn personality and her supposed leisure activities drive the narrator's constant, jealousy-laden reflections. The deliberately ambivalent portrayals thus invite analysis on various levels and a range of interpretations. In the arts, this is reflected in numerous adaptations across different media.

The aim of the doctoral project is to comparatively explore Proust adaptations that have not yet received sufficient attention. Genetic, intermedial and thematological comparisons will be tested in order to encompass both the heterogeneity of a multimedia and multilingual corpus and the plurality of the Albertine character.

Supervisors: Prof. Dr. Christof Schöch (Trier), Professeur Henri Garric (Dijon).

Julia Röttgermann: Affekt und Aufklärung - Automatische Erhebung literaturhistorisch relevanter Informationen aus Volltexten am Beispiel von französischen Romanen des XVIII. Jahrhunderts

The dissertation, which is part of the Mining and Modeling Text (MiMoText) research project, deals with a corpus of French novels from the period 1750-1800, which are being converted into TEI-compliant XML for the first time and published as part of the European Literary Text Collection (ELTeC). Quantitative and qualitative methods of text analysis are applied to the corpus in order to extract information that can be used in literary studies, on aspects such as themes, characters, places or motifs. In-depth analyses and evaluations are planned on the topic of affects in the 18th-century French novel. All extracted data will be modeled as linked open data in a semantic network, linked with further information from MiMoText, and made available for structured queries.

Supervisors: Prof. Dr. Christof Schöch (Trier), Prof. Dr. Matei Chihaia (Wuppertal).

Andreas Büttner: Bilingual Stylometry: A Computational Study of the Arabic-Latin Textual Tradition

The late medieval translation of scientific and philosophical texts from Arabic into Latin heavily influenced the history of European thought for many centuries. This transmission of knowledge was mediated by translators, many of whom remained anonymous, while others, e.g. Gerard of Cremona or Dominicus Gundissalinus, are appreciated as important historical figures. The ongoing digitisation of the texts facilitates innovative ways of analysis, leading to new insights into their work.

The first part of the dissertation will deal with the digitisation methods employed in building the bilingual corpus. A special focus will be placed on the problem of aligning the Arabic original with the Latin translation. I will evaluate existing methods, develop new techniques building on neural machine translation, and compare them with more traditional approaches, using the Arabic and Latin translations corpus and, for comparison, text collections in other languages.

The main part will be devoted to the problem of translator identification using stylometric methods. To compensate for the often very short treatises and the wide range of subject matter in the texts, I will employ the information gained from the bilingual alignment to filter the stylistic signature of the translator from the statistical properties of language use in the corpus.
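
As a rough illustration of what a stylometric comparison involves, the following sketch computes Burrows's Delta, a widely used distance between texts based on the z-scored frequencies of the most frequent words. It is a generic measure chosen here for illustration only and is not necessarily the method used in the dissertation; the function names, the sample word lists and the translator labels are all invented, and the bilingual filtering step described above is not modeled.

```python
from collections import Counter
from statistics import mean, pstdev


def relative_frequencies(tokens, vocabulary):
    """Relative frequency of each vocabulary word in one non-empty token list."""
    counts = Counter(tokens)
    total = len(tokens)
    return [counts[word] / total for word in vocabulary]


def burrows_delta(texts, n_features=50):
    """Build a distance function delta(a, b) over the labels in `texts`.

    `texts` maps a label to a list of tokens. The features are the
    `n_features` most frequent words of the whole collection.
    """
    pooled = Counter(token for tokens in texts.values() for token in tokens)
    vocabulary = [word for word, _ in pooled.most_common(n_features)]
    freqs = {label: relative_frequencies(tokens, vocabulary)
             for label, tokens in texts.items()}

    # z-score every feature across the collection; constant features are skipped.
    zscores = {label: [] for label in texts}
    for i in range(len(vocabulary)):
        column = [freqs[label][i] for label in texts]
        mu, sigma = mean(column), pstdev(column)
        if sigma == 0:
            continue
        for label in texts:
            zscores[label].append((freqs[label][i] - mu) / sigma)

    def delta(a, b):
        # Mean absolute difference of z-scores: smaller means more similar.
        pairs = list(zip(zscores[a], zscores[b]))
        if not pairs:
            return 0.0
        return sum(abs(x - y) for x, y in pairs) / len(pairs)

    return delta


# Invented toy samples (hypothetical labels, not real translations).
samples = {
    "translator_A_1": "et in hoc quod est et in eo quod non est".split(),
    "translator_A_2": "et in hoc quod dicitur et in eo quod est".split(),
    "translator_B_1": "sed tamen igitur sed ergo tamen igitur ergo".split(),
}
delta = burrows_delta(samples, n_features=10)
same_hand = delta("translator_A_1", "translator_A_2")
other_hand = delta("translator_A_1", "translator_B_1")
```

In this toy setup the two samples labelled A lie closer to each other than to the sample labelled B. With real material, short texts and varying subject matter make such distances much noisier, which is the difficulty the paragraph above addresses.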

The philological aim of the dissertation therefore is to gain new perspectives on the history of the Arabic-Latin translation movement. From the methodological point of view, the work will seek new strategies to analyse bilingual corpora, especially concerning techniques of stylometry.

Supervisors: Prof. Dr. Dag Nikolaus Hasse (main supervisor, Würzburg), Prof. Dr. Fotis Jannidis (Würzburg), Prof. Dr. Christof Schöch (Trier)

Completed dissertation projects

Keli Du: Evaluation von Topic Modeling in Digital Humanities (2023)

Topic modeling is an approach to the semantic indexing of texts. It extracts groups of semantically related words (topics) from a corpus, and together these topics give an overview of the themes the corpus may contain. Topic modeling has been used increasingly in the digital humanities in recent years, but research reports have presented and discussed its technical side far less than the analysis and interpretation of the topics and the model. The outcome of topic modeling can be influenced by many technical factors, yet there is still no common understanding in the digital humanities of how to deal with these factors in order to use topic modeling in an optimal way. For this reason, my dissertation aims, through systematic investigations, to better understand how these factors relate to the quality of the topic model and to the quality of the topics.

Supervisors: Prof. Dr. Fotis Jannidis (Würzburg), Prof. Dr. Christof Schöch (Trier), Prof. Dr. Andreas Hotho (Würzburg).

Dr. Keli Du is now a researcher at the Trier Center for Digital Humanities.

Tilman Schalmey: Computerlinguistische Datierung schriftsprachlicher chinesischer Texte (2022)

From the introduction: "Over the past centuries, the dating of texts has occupied researchers from various disciplines and cultural complexes. The focus is often on exegesis and authenticity research. Linguistic text dating is generally based on the observation of and prior knowledge about language change, traditionally through the observation of certain - individual - linguistic phenomena, words, word forms, signs or grammatical structures. Above all, the vocabulary of any actively used language, whether written or colloquial, is constantly changing. New words are added, while others gradually fall out of use. [...] The aim of this work is to enable the content-based chronological classification and dating of written Chinese texts using computational linguistic methods. For this purpose, [statistical] language models are to be adapted and applied to Chinese text material for the first time."
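
To make the idea of dating with statistical language models more concrete, here is a minimal sketch under strong simplifying assumptions: one character-unigram model with add-one smoothing is trained per period, and a text is assigned to the period whose model gives it the highest average log-probability. This is not the model developed in the dissertation; the function names, the two coarse period labels and the tiny training samples are purely illustrative.

```python
import math
from collections import Counter


def characters(text):
    """Split a text into characters, ignoring whitespace."""
    return [char for char in text if not char.isspace()]


def train_unigram_model(texts):
    """Count character frequencies over one period's training texts."""
    return Counter(char for text in texts for char in characters(text))


def average_log_probability(text, counts, vocabulary_size):
    """Per-character log-probability of `text` with add-one smoothing."""
    chars = characters(text)
    if not chars:
        raise ValueError("text contains no characters")
    total = sum(counts.values())
    score = sum(math.log((counts[char] + 1) / (total + vocabulary_size))
                for char in chars)
    return score / len(chars)


def date_text(text, corpus_by_period):
    """Return the best-scoring period and the scores of all periods."""
    models = {period: train_unigram_model(texts)
              for period, texts in corpus_by_period.items()}
    # Shared vocabulary: every character seen in training or in the text.
    vocabulary = set(characters(text)).union(*models.values())
    scores = {period: average_log_probability(text, model, len(vocabulary))
              for period, model in models.items()}
    return max(scores, key=scores.get), scores


# Toy training data; the period labels are hypothetical and far too coarse
# for real dating.
corpus_by_period = {
    "classical": ["學而時習之不亦說乎", "有朋自遠方來不亦樂乎"],
    "modern": ["我們今天去學校上課", "他們在電腦上看電影"],
}
best, scores = date_text("人不知而不慍不亦君子乎", corpus_by_period)
```

A realistic system would need far more training text per period, finer period boundaries, and typically higher-order n-grams or word-level models rather than single characters.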

Supervisors: Prof. Dr. Christian Soffel (main supervisor), Prof. Dr. Christof Schöch (second supervisor)

Ulrike Henny-Krahmer: Genre Analysis and Corpus Design: 19th Century Spanish American Novels, 1830-1910 (2021)

This dissertation in the field of Digital Literary Stylistics deals with theoretical questions of literary genre, with the design of a corpus of 19th-century Spanish American novels, and with the empirical analysis of that corpus in terms of subgenres of the novel. The digital text corpus consists of 256 Argentine, Cuban, and Mexican novels from the period between 1830 and 1910. It has been created with the goal of analyzing, by means of computational text categorization methods, thematic subgenres and literary currents that were represented in numerous novels in the 19th century. The texts have been gathered from different sources, encoded in the standard of the Text Encoding Initiative (TEI), and enriched with detailed bibliographic and subgenre-related metadata, as well as with structural information.

To categorize the texts, statistical classification and a family resemblance analysis relying on network analysis are used with the aim of examining how the subgenres, which are understood as communicative, conventional phenomena, can be captured on the stylistic, textual level of the novels that participate in them. The result is that both thematic subgenres and literary currents are textually coherent to degrees of 70-90%, depending on the individual subgenre constellation.

Besides the empirical focus, the dissertation also aims to relate literary theoretical genre concepts to the ones used in Digital Genre Stylistics as a subfield of Digital Humanities. It is argued that literary text types, conventional literary genres, and textual literary genres should be distinguished on a theoretical level to improve the conceptualization of genre for digital text analysis.

Supervisors: Prof. Dr. Christof Schöch (Univ. Trier), Prof. Dr. Fotis Jannidis (Würzburg), Prof. Dr. Hanno Ehrlicher (Tübingen)

JProf. Dr. Ulrike Henny-Krahmer is now a Junior Professor at Rostock University.

Ekaterina Kamlovskaya: Computer-assisted analysis of postcolonial features in the key discourses of a corpus of Australian Indigenous life writing (2021)

Supervisors: Prof. Dr. Christoph Schommer (Univ. Luxembourg), Prof. Dr. Nina Tahmasebi (external supervisor, Univ. of Gothenburg), Prof. Dr. Christof Schöch (external supervisor, Univ. Trier).


José Calvo Tello: The Novel in the Spanish Silver Age: A Digital Analysis of Genre through Machine Learning (2020)

Between the end of the 19th century and the end of the Spanish Civil War in 1939, the so-called Silver Age (Edad de Plata) emerged in Spanish art. An aesthetic generational change took place in the literature of this era: many of the representational devices of the realist and naturalist novel were abandoned and new forms of expression were discovered. How literary historians classify and describe the works in question is highly controversial. The subject of this study is the body of works that literary history describes as literary prose texts written in Spain by Spanish authors and published between 1880 and 1939. In total, the collection of texts includes around 200 works. The aim of the study is to use computer-aided methods to answer the following questions: How did the subgenres of the Spanish novel and shorter narrative prose develop between 1880 and 1939? Which stylistic features and which linguistic levels (morphology, syntax, lexis, semantics or text) prove useful for the study of the literary prose genres of this period?

Supervisors: Prof. Dr. Christof Schöch, Prof. Dr. Fotis Jannidis (Würzburg), Prof. Dr. Angela Schrott (Kassel)

See the monograph: José Calvo Tello, The Novel in the Spanish Silver Age. A Digital Analysis of Genre Using Machine Learning (Bielefeld: Bielefeld University Press / transcript, 2021).

Dr. Calvo Tello is now a subject librarian for Digital Humanities and Romance Studies at the State and University Library Göttingen.


Matthias Bremm: Teil-überwachtes und aktives Lernen mit unterschiedlichen annotierenden Personen zur Informationsextraktion in Texten (2020)

Located in the field of information extraction, this work combines several machine learning methods. It presents a new algorithm that combines semi-supervised learning with active learning. The starting point is the division of the data into several views, which separates the input from different people. For each view, the algorithm uses classifiers to generate a separate model built from the labels of the individual person. Crowdsourcing is used to obtain the required amount of data, as it makes it possible to reach a large number of people, who are given the task of annotating texts. This is first done for a historical text corpus, and the steps required to offer and carry out the annotation task on crowdsourcing portals are listed. In addition, a current data set of short messages is used. The algorithm is applied to these sample data sets, experiments are conducted to determine the optimal parameter selection, and the results are compared with those of previous algorithms.

Supervisors: Prof. Dr. Reinhard Köhler (main supervisor), Prof. Dr. Carolin Sporleder (until 2019), Prof. Dr. Christof Schöch (from 2019).

Dr. Bremm is a researcher at the Trier Center for Digital Humanities.

Winfried Höhn: Semiautomatische Annotation von Orten in digitalisierten Altkarten (2018)

Supervisors: Prof. Dr. Christoph Schommer (main supervisor, Univ. Luxembourg), Prof. Dr. Christof Schöch (Dissertation Defence Committee Member), Prof. Dr. Andreas Fickers (Dissertation Defence Committee Member).