Curation project: Linguistic Annotation of Non-standard Varieties – Guidelines and Best Practices (WG 7)

Project content

The curation project for the evaluation of annotation schemes for non-standard varieties was granted in June 2012.

Current schemes and guidelines for linguistic annotation have been developed predominantly for the description of newspaper language, and automatic annotation tools likewise continue to be evaluated mainly on newspaper language.

This curation project aims to annotate data from different domains of so-called “non-standard varieties”. Such data comprise a variety of linguistic structures and phenomena that are not covered by current guidelines.

In a pilot study, the curation project will evaluate established annotation schemes for three annotation layers (dependency analysis, named entity recognition, and coreference) and extend them where necessary.

To this end, a test corpus of non-standard varieties will be compiled and annotated, with the goal of producing guidelines and best practices for the annotation of those varieties.


  • 01.09.2012 – 30.09.2013


Responsible Institution

  • Institut für deutsche Sprache und Linguistik, Humboldt-Universität zu Berlin

  • Sprachwissenschaftliches Institut, Ruhr-Universität Bochum

Executive Staff