Usage samples of the CLARIN-D infrastructure

WebMAUS-Basic: Automatic phonetic labelling & segmentation of a single German recording with text

 

Interviews and conversations are often recorded and later transcribed. The web service WebMAUS Basic, available in the CLARIN infrastructure, automatically combines audio recordings and text transcriptions so that the phones, words and audio signals are time-aligned.

Especially relevant for

everybody who has audio signals and transcriptions, for example researchers working in:

  • linguistics
  • phonetics
  • anthropology
  • ethnology
  • media studies
  • educational research
  • conversation analysis
  • speech pathology
  • political science
  • speech technology

Further information and example >>

WebMAUS-General: long video interviews with interlocutor speech, noise, long silence intervals etc. 

Very long recordings (typical in video interviews: several hours) are difficult to time-align. Therefore the BAS offers a web service that automatically splits long recordings into so-called chunks, segments them individually, and combines the results into a common file, as demonstrated in this use case.
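The combination step described above can be illustrated with a minimal Python sketch. It is not the BAS implementation: the `Segment` class, the `merge_chunks` function, and the input layout (a start offset per chunk plus segments timed relative to that chunk) are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One labelled interval; times are in seconds (hypothetical structure)."""
    label: str
    start: float
    end: float


def merge_chunks(chunks):
    """Combine per-chunk segmentations into one common timeline.

    `chunks` is a list of (chunk_offset_seconds, segments) pairs, where the
    segment times are relative to the start of their own chunk.
    """
    merged = []
    # Process chunks in the order in which they occur in the recording.
    for offset, segments in sorted(chunks, key=lambda chunk: chunk[0]):
        for seg in segments:
            # Shift chunk-relative times to recording-relative times.
            merged.append(Segment(seg.label, seg.start + offset, seg.end + offset))
    return merged


chunks = [
    (10.0, [Segment("b", 0.5, 1.0)]),
    (0.0, [Segment("a", 0.0, 0.5)]),
]
merged = merge_chunks(chunks)
```

Each chunk is segmented on its own, so only the chunk offset is needed to place its results in the common file.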

Especially relevant for

  • Linguistics
  • Phonetics
  • Phonology
  • Speech Technology

Further information and example >>

 

From text to phonological pronunciation

Orthographic text in many languages does not encode the precise pronunciation of the corresponding spoken utterance. It is therefore useful to be able to automatically transform a text into a phonological encoding (e.g. for speech synthesis). The CLARIN web service G2P provides such a tool for a multitude of languages.

Especially relevant for

  • Linguistics
  • Phonetics
  • Anthropology
  • Ethnology

Further information and example >>

WebMAUS-Multiple: Automatic phonetic labelling and segmentation of multiple recordings

Interviews and conversations are often recorded and later transcribed. The web service WebMAUS, available in the CLARIN infrastructure, provides tools to combine audio recordings and transcriptions so that the words and audio signals are time-aligned.

Especially relevant for

  • Linguistics
  • Phonetics
  • Speech Technology
  • Anthropology
  • Ethnology
  • Media Science

Further information and example >>

 

Cross-corpus search and download of recordings of the BAS CLARIN repository 

 

Searching across several corpora is usually only possible by downloading the corpora, normalizing their structure, and importing them into a query tool. For some resources, downloading whole corpora and combining them may not be possible due to licensing issues.

Especially relevant for

  • Humanities scholars interested in empirical speech data
  • Developers in speech technology

Further information and example >>

Support of Enhanced Publications in CLARIN: Citation, Archiving and Access to research data 

Repositories contain research data that is available under certain conditions. As repositories are permanent archiving installations, the data they hold can be cited and hence made visible. This allows reusing data, attributing the resource to the creator, and reproducing research results. Access to research data differs from repository to repository.

Especially relevant for

  • Humanities scholars working with empirical speech data
  • Developers from speech technology

Further information and example >>

 

DiaCollo: collocation analysis in diachronic perspective 

The meaning of a word can be revealed by the context in which it appears. Changes in a word's meaning will therefore often be directly associated with changes in its characteristic combinations (the set of words with which it typically occurs together, its collocates). DiaCollo is a software tool for the discovery, comparison, and interactive visualization of the typical word combinations for a user-specified target term.
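The idea of tracking a word's collocates over time can be sketched in a few lines of Python. This is not how DiaCollo itself works: the function name `collocates_by_decade`, the input format of (year, text) pairs, the use of whole texts as the co-occurrence context, and the raw counts in place of an association score are all simplifying assumptions.

```python
from collections import Counter, defaultdict


def collocates_by_decade(documents, target):
    """Count words that co-occur with `target`, grouped by decade.

    `documents` is an iterable of (year, text) pairs. Two words count as
    co-occurring when they appear in the same text (a simplification).
    """
    counts = defaultdict(Counter)
    for year, text in documents:
        tokens = text.lower().split()
        if target in tokens:
            decade = year // 10 * 10
            # Count every other token in the text as a collocate candidate.
            counts[decade].update(t for t in tokens if t != target)
    return dict(counts)


documents = [
    (1912, "crisis war"),
    (1918, "crisis war peace"),
    (1975, "crisis oil"),
]
by_decade = collocates_by_decade(documents, "crisis")
```

Comparing the counters of two decades then shows which collocates appear, disappear, or change in frequency.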

Especially relevant for

  • Historians
  • Political scientists
  • Philologists
  • Linguists

Further information and example >>

Using automatic annotation tools for transcription files 

The EXMARaLDA Partitur-Editor enables access to the web services provided by WebLicht and the CLARIN-D infrastructure. WebLicht as a Service workflows can be defined once and later be run with a single button click.

Especially relevant for

everybody who works with the EXMARaLDA Partitur-Editor and wants to automatically annotate their files, for example researchers working in:

  • linguistics
  • anthropology
  • political science, especially those working with video and audio files

Further information and example >>

 

Where do you say 

Many linguistic resources contain geographic information, for example the location of a recording or the birthplace of a speaker. The tool Wo sagt man (German for "where do you say") uses external data from the database of spoken German (Datenbank für Gesprochenes Deutsch, DGD) and highlights on a map the areas in which an expression is used, based on where the recordings were made.

Especially relevant for

  • Dialectologists
  • Historians interested in specific regions
  • Philologists

Further information and example >>

 

Word-level comparative text analysis

Many questions in the humanities that relate to specific text resources can be reduced to the analysis of vocabulary. The comparison of such vocabularies is of particular interest. The aim is to demonstrate to scholars how to answer their own research questions with the help of comparative text analysis within CLARIN.
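A vocabulary comparison of this kind can be sketched in plain Python. This is only an illustration, not the CLARIN workflow: the function names `vocabulary` and `compare_vocabularies` are hypothetical, and tokenization is reduced to lowercasing and a simple regular expression.

```python
import re
from collections import Counter


def vocabulary(text):
    """Return word frequencies for a text (naive tokenization)."""
    return Counter(re.findall(r"\w+", text.lower()))


def compare_vocabularies(text_a, text_b):
    """Split the vocabularies of two texts into shared and exclusive words."""
    vocab_a, vocab_b = vocabulary(text_a), vocabulary(text_b)
    shared = set(vocab_a) & set(vocab_b)
    return {
        "only_a": sorted(set(vocab_a) - shared),
        "only_b": sorted(set(vocab_b) - shared),
        "shared": sorted(shared),
    }


result = compare_vocabularies(
    "The king rules the land",
    "The people rule the land",
)
```

Because no lemmatization is applied here, "rules" and "rule" are treated as different words; a real analysis would normally normalize word forms first.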

Especially relevant for

All scholars who compare texts or vocabulary, including:

  • Scholars from the historical sciences
  • Scholars from the political sciences
  • Scholars from all philologies

Further information and example >>

Content analysis of biographical data supported by computational linguistics

Our web service "Textuelle Emigrationsanalyse" (German for textual emigration analysis) offers an example of how facts about emigration that were extracted from large textual corpora using computational linguistic techniques within the CLARIN infrastructure can be explored. The results can be viewed in tabular form, on a map with geographical information, or in a person-centered view.

Especially relevant for

  • Historians
  • Political scientists
  • Literary scholars

Further information and example >>

 

Automatic markup of personal and place names in textual sources 

Books, articles, and manuscripts often contain information about people, geographical locations, and organizations. With this tool, such names can be marked and categorized automatically. The names it marks are commonly called named entities.

Especially relevant for

  • Historians
  • Political scientists
  • Literary scholars

Further information and example >>