Curation and integration of historical text resources of the 15th-19th century into the CLARIN infrastructure (WG 1)

The curation project was approved by the CLARIN-D steering committee in June 2012. The curated resources will be available through the infrastructure of the three institutions involved: HAB, BBAW, and IDS. BBAW and IDS will make the data available through their CLARIN service centres. HAB will also take care of long-term archiving of the data and guarantees its long-term availability through persistent identifiers. The curation project is coordinated by the Deutsches Textarchiv (DTA) at BBAW.

For more information and a list of the resources integrated so far, please visit the project's website (content in German).

Project content

The curation project aims to improve the current state of German corpora of the 15th to 19th centuries. To this end, digital full-text resources (and the corresponding scans of the original prints) are to be identified, catalogued, characterized, and evaluated against selected quality criteria. The resources are then to be gradually edited and integrated into a mirrored repository structure at BBAW, HAB, and IDS. Publishing the resources at a relatively early stage will allow researchers to use and comment on them early on. Subsequently, IDS and BBAW will gradually integrate the resources into the CLARIN infrastructure.

The project is expected to contribute substantially to a corpus of Early New High German and historical New High German (15th-19th century). This structured and lasting corpus will fundamentally improve the situation for research, e.g. in historical linguistics.

In addition, the implementation of a sustainable infrastructure for integrating new text resources might initiate a major shift in the research community: the project aims to establish a culture of sharing and collaborative work on text resources. The infrastructure will also serve as a long-term central repository for historical text resources.

Duration

  • 01.09.2012 – 28.02.2014

Applicants

  • Dr. Alexander Geyken (BBAW, Berlin)
  • Prof. Dr. Thomas Gloning (JLU, Gießen)
  • Dr. Thomas Stäcker (HAB, Wolfenbüttel)
  • Dr. Andreas Witt, Dr. Marc Kupietz (IDS, Mannheim)

Responsible institution

Executive Staff

  • Christian Thomas (BBAW, DTA (50%); Coordination of curation project (33%))
  • Hannah Sophia Glaum (student assistant, Universität Gießen)
  • Marc Kuse (student assistant, Universität Gießen)
  • Melanie Henß (student assistant, Universität Gießen)
  • Marcus Baumgarten (research assistant, HAB)
  • Frederike Neuber (student assistant, BBAW)
  • Elena Kirillova (student assistant, BBAW)
  • Jonathan West (contract to produce a work)
  • Jurgita Baranauskaite (as member of CLARIN-D FAG 1 »Deutsche Philologie«)
  • Stefanie Seim (as member of CLARIN-D FAG 1 »Deutsche Philologie«)

References

Curated resources are available through the project's website http://www.deutschestextarchiv.de/clarin_kupro (content in German).

  • Christian Thomas, Frank Wiegand: "Making great work even better: Appraisal and Digital Curation of widely dispersed Electronic Textual Resources (c. 15th–19th cent.) in CLARIN-D." [= thoroughly revised version of the original full paper by the same title, to appear in:] Proceedings of the International Conference “Historical Corpora 2012”, December 6–9, 2012; Goethe University, Frankfurt, Germany. [within the series "Corpus Linguistics and Interdisciplinary Perspectives on Language (CLIP)"] Tuebingen: Narr Publishing House.
  • F. Neuber, C. Thomas: Kurz vor dem Ziel: Das Kurationsprojekt 1 der F-AG 1 »Deutsche Philologie«. In: Clarin-D-Newsletter 5, 2013, http://www.clarin-d.de/images/newsletter/CLARIN-D-Newsletter-2013-5.pdf, pp. 29–32.
  • Christian Thomas: Integration von Wikisource-Texten in die Korpora des Deutschen Textarchivs und in CLARIN-D. Project presentation at the 1st Wikisource working meeting (Wikisource-Arbeitstreffen), 18.–20.10.2013, Leipzig.
  • The main outlines of the curation project were introduced during a CLARIN-D panel organized by DTA/BBAW, HAB and Universität Gießen at DH2012 in Hamburg, Germany: "Compiling large historical reference corpora of German: Quality Assurance, Interoperability and Collaboration in the Process of Publication of Digitized Historical Prints". See abstract (html, pdf) and a/v documentation of the panel.
  • Christian Thomas, Frank Wiegand: Making great work even better. Appraisal and Digital Curation of widely dispersed Electronic Textual Resources (c. 15th–19th cent.) in CLARIN-D. Full Paper [accepted] for the International Conference “Historical Corpora 2012”, December 6–9, 2012; Goethe University, Frankfurt, Germany. [urn:nbn:de:kobv:b4-opus-23081 – online-Version 2012-10-31]
  • Frederike Neuber, Christian Thomas: Vorstellung des Kurationsprojekts 1 der Clarin-D-FAG 1 »Deutsche Philologie«. In: Clarin-D-Newsletter, no. 3, November 2012, pp. 11–13. [http://www.clarin-d.de/images/newsletter/CLARIN-D-Newsletter-2012-3.pdf]
  • Alexander Geyken, Thomas Gloning: A living text archive of 15th–19th c. German. Corpus strategies, technology, organization. International Conference “Historical Corpora 2012”, 6.–9.12.2012, Johann Wolfgang Goethe-Universität, Frankfurt (Main). (Abstract)
  • Christian Thomas: Making great work even better: Appraisal and Digital Curation of widely dispersed Electronic Textual Resources (c. 15th–19th cent.) in CLARIN-D. International Conference “Historical Corpora 2012”, 6.–9.12.2012, Johann Wolfgang Goethe-Universität, Frankfurt (Main). [see Full Paper by Thomas/Wiegand above]
  • Frederike Neuber: Curation and integration of historical text resources of the 15th-19th century into the CLARIN-D Infrastructure. Poster-/Demosession at the CLARIN-D "M24 workshop", 27.-28.6.2013, concert hall de Vereeniging, Nijmegen (NL) [Poster].
  • Frank Wiegand: Integration of Wikisource data sets into the CLARIN-D Infrastructure. Showcase at the CLARIN-D "M24 workshop", 27.-28.6.2013, concert hall de Vereeniging, Nijmegen (NL) [Abstract].

The curation project in the institutions