Archives

Below we present the language data archives held by the CLARIN-D centers.

A language data archive is a thematically heterogeneous collection of language data. Its primary goal is to conserve these data on a persistent medium; its secondary goal is to make them accessible to the public.


Bayerisches Archiv für Sprachsignale, München:

The Bavarian Archive for Speech Signals (Bayerisches Archiv für Sprachsignale, BAS) was founded as a public institution in January 1995 and is hosted by the University of Munich, presently at the Institut für Phonetik und Sprachverarbeitung (IPS). BAS is dedicated to making databases of spoken German accessible in a well-structured form to the speech science community as well as to speech engineering.

Aside from a comprehensive catalogue of BAS resources, the BAS maintains a CLARIN-certified repository of free speech corpora for academic users, and a number of web-based and stand-alone tools for speech processing.


Institut für Deutsche Sprache, Mannheim:

The Archive for Spoken German (Archiv für Gesprochenes Deutsch, AGD) stores German spoken-language corpora created in language surveys and research projects and makes them available to the research community for further scientific evaluation. The Database for Spoken German (Datenbank Gesprochenes Deutsch, DGD) contains a large part of the corpus documentation. It also enables users to search transcripts for complex word combinations rather than single words, and to listen to the search results instantly, because text and sound are aligned.


Max-Planck-Institut für Psycholinguistik, Nijmegen:

The Language Archive at the Max Planck Institute for Psycholinguistics contains audiovisual recordings and written material from more than 250 different languages. The DoBeS archive of endangered languages is part of the collection, but the archive also contains material from, e.g., language acquisition studies, studies on sign languages and multilingualism studies.

The archive employs a graded access system that lets depositors define different access permissions for different parts of their collection. Resources can be freely accessible, accessible for registered users of the archive, accessible on request, or closed. Digging into Data users can access the first two categories of files by registering at http://corpus1.mpi.nl. To request permission from the depositor, right-click on a node in the archive hierarchy, select “request resource access” from the menu, and fill in the form that appears. A Code of Conduct agreement needs to be signed as part of the registration procedure.
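
The four access categories described above can be pictured as a small decision rule. The following Python sketch is purely illustrative and is not the archive's actual implementation: the names `AccessLevel` and `can_access` are hypothetical, and it assumes that freely accessible resources need no account and that a request for access presupposes a registered account.

```python
from enum import IntEnum


class AccessLevel(IntEnum):
    """The four access categories named in the text, most open first."""
    OPEN = 1        # freely accessible
    REGISTERED = 2  # accessible for registered users of the archive
    ON_REQUEST = 3  # accessible on request, once the depositor agrees
    CLOSED = 4      # closed


def can_access(level, registered=False, permission_granted=False):
    """Return True if a user may open a resource at the given level.

    Hypothetical rule set for illustration only.
    """
    if level == AccessLevel.OPEN:
        # Assumption: open resources require no account.
        return True
    if level == AccessLevel.REGISTERED:
        return registered
    if level == AccessLevel.ON_REQUEST:
        # Assumption: only registered users can be granted permission.
        return registered and permission_granted
    # CLOSED resources are never accessible through this route.
    return False
```

Under these assumptions, a registered user without any depositor permission can open resources in the first two categories only, which matches the description above.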

Contact: For further information about the archive or about getting access to the material, please contact Paul.Trilsbeek@mpi.nl

The Language Archive's metadata catalog browser can be accessed through http://corpus1.mpi.nl.