Legal LOD - Concept-based indexing of multilingual European legal texts

TP6 (Digital jurisprudence): Legal LOD - Concept-based indexing of multilingual European legal texts

The legal sub-project develops a multilingual corpus of European legal texts relating to digitization, such as the Digital Services Act of 19 October 2022 (Regulation (EU) 2022/2065). Texts of this type are available in the 24 official languages of the EU. All language versions are equally binding and are therefore, in principle, presumed to have the same content, so that EU law applies equally in all Member States. However, the complex drafting, coordination and translation processes repeatedly lead to discrepancies in detail that cannot be found and clarified using the simple synoptic display on the EUR-Lex platform.

The sub-project aims to solve this problem by: (a) automatically aligning the legal texts sentence by sentence; (b) identifying key legal terms and other concepts and making them available as an LOD-enabled ontology; (c) annotating these concepts across the translations (first manually, then automatically); and finally (d) enabling a concept-led search for usages and definitions of relevant terms across the different language versions, so that passages that differ in detail can be identified and their significance for national case law across Europe can be assessed. The project benefits from the fact that the texts are freely available in semi-structured HTML format.
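To illustrate how steps (a) to (d) could fit together, the following minimal Python sketch looks up one concept across sentence-aligned language versions and reports the alignment units in which its label appears in some languages but not in others. It is only a sketch under stated assumptions and not the project's implementation: the `Concept` class, the function `find_concept_gaps`, the identifier `ex:intermediary-service`, and the plain substring matching are all hypothetical, and the sentences are toy data, not quotations from the Regulation.

```python
from dataclasses import dataclass


@dataclass
class Concept:
    """Hypothetical concept record: one surface label per language code."""
    concept_id: str
    labels: dict  # e.g. {"en": "...", "de": "..."}


def find_concept_gaps(concept, aligned_units):
    """Return (unit index, missing languages) for every aligned unit in which
    the concept's label occurs in at least one, but not every, language version.

    Matching is a naive case-insensitive substring test (an assumption made
    for brevity; real annotation would need lemmatization or manual tagging).
    """
    gaps = []
    for index, unit in enumerate(aligned_units):
        hits = {
            lang: concept.labels[lang].lower() in text.lower()
            for lang, text in unit.items()
            if lang in concept.labels
        }
        if any(hits.values()) and not all(hits.values()):
            missing = sorted(lang for lang, hit in hits.items() if not hit)
            gaps.append((index, missing))
    return gaps


# Toy data: each dict is one sentence-aligned unit (step a); the labels stand
# in for an ontology entry (step b). None of this is taken from the Regulation.
concept = Concept(
    "ex:intermediary-service",
    {"en": "intermediary service", "de": "Vermittlungsdienst"},
)
aligned_units = [
    {"en": "Providers of an intermediary service shall act.",
     "de": "Anbieter eines Vermittlungsdienstes handeln."},
    {"en": "The intermediary service shall be notified.",
     "de": "Der Dienst wird benachrichtigt."},
]

gaps = find_concept_gaps(concept, aligned_units)
print(gaps)
```

In this toy example the second unit is flagged because the German sentence does not contain the concept's label. A flagged unit is only a candidate for closer legal reading, not evidence of a substantive divergence.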

Regulation (EU) 2022/2065 of the European Parliament and of the Council of 19 October 2022 on a Single Market For Digital Services and amending Directive 2000/31/EC (Digital Services Act), data.europa.eu/eli/reg/2022/2065/oj.

Team

  • Dr. Thomas Burch
  • Prof. Dr. Benjamin Raue
  • Veronika Wassermayr