Study programme
The CLDH department offers innovative courses on current topics of digitalisation, especially text, literature and media analysis, based on modern methods of artificial intelligence.
Information for prospective students
The automatic processing of digital media in linguistic form (e.g. literature, news articles or social media contributions) has become indispensable in view of the constantly growing flood of information and the ever-increasing quantities of textually documented knowledge in archives, databases, the Internet and numerous other electronic media.
The associated tasks include both the user-facing areas of finding and retrieving information (information retrieval, data mining or text mining, search engines, digital assistants, chatbots, etc.) and the numerous activities that are less obvious, such as the creation, processing, curation, filtering, summarising, annotating, structuring, storage and publication of information by companies, authorities and institutions of all kinds. For some time now, these tasks have been inconceivable without computer-aided processes. In more and more areas, machine learning methods such as deep learning are delivering a quality comparable to that of humans.
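To give a concrete impression of what "finding information" means technically, the following is a minimal sketch of the core data structure behind most search engines, an inverted index. The document collection and all identifiers are invented for illustration; real retrieval systems add ranking, normalisation and much more.

```python
from collections import defaultdict

# Toy document collection (invented for illustration)
docs = {
    "d1": "archives store textually documented knowledge",
    "d2": "search engines retrieve documents from the internet",
    "d3": "digital assistants answer questions from knowledge bases",
}

def build_index(docs):
    """Build an inverted index: word -> set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search(index, query):
    """Return the ids of documents containing every query word (Boolean AND)."""
    words = query.lower().split()
    if not words:
        return set()
    result = index.get(words[0], set()).copy()
    for word in words[1:]:
        result &= index.get(word, set())
    return result

index = build_index(docs)
print(search(index, "knowledge"))  # documents d1 and d3
```

Even this toy version shows the key idea: instead of scanning every document per query, the collection is indexed once and queries become fast set operations.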
For high school graduates who want to deal with these future-oriented topics, the Department of Computational Linguistics and Digital Humanities at Trier University offers modern degree programmes with a high degree of flexibility that are international, interdisciplinary and practice-oriented.
In addition to our study advice service, the CoDiPho student council is also happy to help with any questions. It also regularly organises events for our students to get to know each other, exchange ideas or study together.
What is Computational Linguistics?
Computational linguistics (also known as Natural Language Processing) deals with the technical processing of human language. Its aim is to make human communication and human knowledge understandable and processable for machines and, through text generation, accessible to humans in turn. With the help of artificial intelligence methods, especially machine-learned large language models such as ChatGPT, enormous progress has been made in recent years in the performance of language processing and generation systems, opening up a broad field of applications and thus new growth markets. This dynamic situation has led to a large number of application- and basic-research-oriented tasks being addressed:
the creation of (software) tools that facilitate human-machine communication through the processing and generation of linguistic data (e.g. natural language dialogue systems, voice-controlled computer and assistance systems, question-answer systems, etc.);
machine tools to make text content accessible in natural language (e.g. translation systems, information retrieval, content analysis, document management, text summarisation, etc.), an area that is becoming increasingly important in the course of digitisation;
the analysis of large amounts of language data (e.g. text corpora, online media, social networks), especially with the help of machine learning methods, in order to create models of language and communication that are as accurate as possible, e.g. for author identification for forensic purposes, the detection of fake news and hate comments, or the tracking of rumours, the spread of information and influences on interpretation in social and media networks, etc.
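As a small taste of the machine learning methods mentioned above, here is a minimal Naive Bayes text classifier of the kind that underlies simple hate-comment or spam detection. The training sentences and labels are invented toy data; production systems use far larger corpora and modern neural models.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(samples):
    """Count class and word frequencies from (text, label) training pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab

def classify(text, class_counts, word_counts, vocab):
    """Return the most probable label using Laplace-smoothed log probabilities."""
    total = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label in class_counts:
        score = math.log(class_counts[label] / total)
        n_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word in vocab:
                score += math.log((word_counts[label][word] + 1) / (n_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Toy training data (invented for illustration only)
samples = [
    ("you are stupid and awful", "offensive"),
    ("I hate you idiot", "offensive"),
    ("what a lovely day", "neutral"),
    ("thanks for the helpful answer", "neutral"),
]
model = train_naive_bayes(samples)
print(classify("you stupid idiot", *model))        # offensive
print(classify("a helpful lovely answer", *model))  # neutral
```

The classifier simply learns which words are typical of each class from examples, rather than relying on hand-written rules; that shift from rules to data is the core of the machine learning approach taught in the degree programmes.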
What are Digital Humanities?
Like computational linguistics, digital humanities or e-humanities is a relatively new discipline and is also located at the interface between the humanities and computer science. However, the focus is not exclusively on the philologies (linguistics and literary studies), but on the humanities and cultural studies in general. In the course of the ongoing digitisation of humanities and cultural studies data, both through retro-digitisation and the growth of born-digital data, the digital humanities have become increasingly important in recent years. Important tasks of the subject are:
- Digitisation: Textual data can be digitised and made 'computer-readable' using optical character recognition (OCR) processes. However, this only works relatively well with printed texts, good print quality of the original and modern language. Poor print quality (e.g. yellowed or incomplete pages) and old or difficult-to-read fonts (e.g. handwriting, narrow Fraktur or non-Latin alphabets) mean that letters and punctuation marks are no longer recognised well. In addition, statistical models of probable and improbable letter sequences no longer fit well when they are applied to older language levels without standardised spelling. Finally, handwritten texts can only be digitised to a very limited extent using OCR processes and must be transcribed by hand (double-keying process) or processed using newer handwritten text recognition processes. Even with multimodal data, e.g. archaeological artefacts, paintings or old audio or video recordings, digitisation is not always trivial.
Examples in this area are the projects Digitisation of Works of Historical Projection Art, a collaboration between Media Studies and the Competence Centre for Electronic Indexing and Publication Methods in the Humanities (Trier Center for Digital Humanities), and the St. Matthias Virtual Scriptorium, including the follow-up project eCodicology.
- Archiving: While non-digital humanities data such as stone tablets, papyri and old manuscripts are relatively easy to archive and can be preserved for a long time under favourable conditions, the long-term archiving of digital data remains a major challenge. One problem is the relatively rapid material fatigue of many digital data carriers; another is the constant development of software and hardware. There are not many computers left that still have a floppy disc drive, and probably even fewer that run software capable of reading a file created 20 years ago with a contemporary word-processing program. Consequently, questions of long-term archiving and long-term availability are currently the subject of intense debate. One of the Trier-based solutions is the virtual data repository.
- Representation: Once humanities data has been digitised, the question arises as to how it can best be represented and made accessible to experts and/or laypersons. This includes both technical aspects, e.g. character encoding, selection of a suitable mark-up language or a high-performance and reliable database, as well as those that lie more in the functional-aesthetic area. Typical application examples range from the creation of a digital edition of a literary work to the realisation of a cultural-historical, multimedia database and the development of digital research databases. Examples of projects that have been implemented with the involvement of the Trier Centre for Digital Humanities include the European History Online portal and the Heinrich Heine Portal.
- Visualisation: Visualisation plays an important role between data representation and analysis. This includes, for example, the question of how complex humanities data can be presented in such a way that it is particularly accessible to specialised researchers (and/or laypersons). Linked to this is the possibility of creating an additional exploration component through suitable visual presentation. One example is the project Der Digitale Peters, in which Arno Peters' Synchronoptische Weltgeschichte was digitally edited. The digital medium offers the advantage that historical events and their context can also be presented graphically and linked to each other. Another example is the Networked Correspondences project, in which social, spatial, temporal and thematic networks are visualised in letter corpora.
- Analysis: Digital data has the advantage that it is (theoretically) not only easier to access than non-digital data, but that it also offers the possibility of (partially) automatic analysis and evaluation. As a result, it is often possible, in collaboration with humanities scholars, to obtain information that would be very difficult to obtain in the traditional way. Text mining methods play a particularly important role here, as they can be used to analyse and identify trends and interdependencies. Examples from Trier include the SeNeReko project, in which ancient Egyptian and ancient Indian texts are automatically analysed semantically in order to gain new insights into religious contacts between the two cultures. Another project, Asymmetrical Encounters, aims to apply text mining methods to historical newspaper corpora in order to learn something about how different national cultures influenced each other. Automatic analyses also include the deciphering of unknown scripts or encrypted historical documents, as well as the automatic evaluation of Twitter data in order to trace, for example, the spread of new linguistic creations or topic trends.
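A very simple text mining measure of the kind used in the newspaper-corpus projects above is tracking a term's relative frequency over time. The following sketch does this for a toy diachronic corpus; the years and snippets are invented for illustration, and real studies work on millions of digitised pages.

```python
# Toy diachronic corpus: year -> text snippet (invented for illustration)
corpus = {
    1850: "the telegraph carries news across the nation",
    1900: "the telephone and telegraph connect distant cities",
    1950: "television broadcasts news into every home",
    2000: "the internet spreads news and rumours instantly",
}

def term_trend(corpus, term):
    """Relative frequency of a term per year -- a minimal text mining measure."""
    trend = {}
    for year, text in sorted(corpus.items()):
        tokens = text.lower().split()
        trend[year] = tokens.count(term) / len(tokens)
    return trend

print(term_trend(corpus, "telegraph"))
```

Plotting such trends for many terms is enough to make the rise and fall of topics, media or loanwords visible, which is exactly the kind of question the analysis task addresses.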
For whom are the degree programmes in Computational Linguistics and Digital Humanities suitable?
The degree programmes in Computational Linguistics and Digital Humanities are particularly suitable for all those who are interested in the effects of digitalisation, especially in the fields of modern media, linguistics and the humanities, and who want to combine questions of linguistics, media and text studies with those of information and communication technology.
- Are you interested in language, media and communication in the age of digitalisation?
- Do you want to access, analyse, understand and make use of content?
- Do you want to use the latest digital methods from artificial intelligence, machine learning and language models such as ChatGPT?
- Do you want a degree programme that teaches practical skills and involves working with real data?
- Do you want to have the opportunity to gradually set your own specialisations during your studies?
- Do you want to acquire expertise that will give you access to a broad spectrum of highly sought-after professional fields?
While the BA programme focuses on teaching practical skills and application-oriented competencies, the aim of the MA programmes is to qualify students for work in research and development-oriented professional fields.
What knowledge is required at the start of the programme?
In the Bachelor's degree programme in Language, Technology, Media (Sprache, Technologie, Medien), the modules have varying degrees of computer science/mathematics orientation depending on the specialisation. Previous knowledge equivalent to a basic maths course at upper secondary school is sufficient. Previous knowledge of digital media and methods, in particular programming skills, makes it easier to get started, but is not required. Supervised exercises provide a practical introduction to the subject matter right from the start.
For the Master's degree programmes (Natural Language Processing, Digital Humanities), the admission requirement is proof of a suitable Bachelor's degree in the humanities or computer science.