Automatic markup of personal and place names in textual sources


Books, articles, and manuscripts often contain information about people, geographical locations, and organizations. Anyone interested in this information can use an automatic named-entity recognizer to create an overview of a text. With this tool, names can be marked and categorized automatically. The tool is especially useful when working with long texts, and it frequently marks named entities that a human annotator has overlooked.

Especially relevant for

  • Historians
  • Political scientists
  • Literary scholars

Starting point:

German text

Task:

Marking personal, place, and organization names and displaying them in the text

Solution:

Using the named-entity recognizer in the CLARIN-D WebLicht working environment

Related CLARIN-D tools and services

Short guide on how to find people and locations in a text source with WebLicht

  1. Go to WebLicht.
  2. Click on the button "Start WebLicht".
  3. The window 'Sign in via the CLARIN Service Provider Federation' will pop up. Log in through your home institution:
    1. Locate your research facility.
    2. If your research facility is not listed, choose 'clarin.eu website account'.
    3. The website will redirect you to the login page of the chosen institution.
    4. Log in with your credentials, usually your university account.
    5. You will be redirected to the WebLicht working environment.
  4. Click on the button "Start".
  5. In the following tab there are three different input options:
  6. Click on OK.
  1. Click on the button Advanced Mode.
  2. Set up the processing chain by choosing services from the upper area, either by double-clicking them or by dragging them to the lower area; for example, choose the following webservices in this order:
    1. SFS: To TCF Convertor
    2. SFS: Tokenizer/Sentences (use the standard values)
    3. SFS: POS Tagger - OpenNLP
    4. SFS: German Named Entity (use the standard values)
  3. Click on Run Tools.
  4. Watch the progress of the processing chain:
    • Under each webservice you can see a progress bar; the bar changes its color as processing advances.
    • When a webservice is finished, the following icons will appear in the webservice box: a down-pointing arrow (for downloading the webservice output) and a line drawing (a "tree").
    • When the last webservice is finished, a tree will appear in the SFS: German Named Entity box.
  5. Click on the tree in the SFS: German Named Entity box.
  6. A new browser tab will open. Go to the new tab.
  7. Choose namedEntities under Highlighted view on the left side.
  8. You can see personal names (PER), locations (LOC), organizations (ORG), and others (MISC) marked in different colors.
  9. Use the arrows below to navigate through the results.
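
If you would rather work with the results outside the browser, the output downloaded via the down-pointing arrow can also be read with a short script. The following is only a minimal sketch in Python. It assumes that the downloaded file is TCF XML in which each token is a `token` element with an `ID` attribute and each named entity is an `entity` element with `class` and `tokenIDs` attributes. The sample document and the helper names `local_name` and `extract_entities` are hypothetical and written by hand for illustration; check them against your actual download before relying on them.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hand-written sample in the layout ASSUMED for the TCF output.
# It is not real WebLicht output; replace it with the content of your download.
SAMPLE_TCF = """<D-Spin xmlns="http://www.dspin.de/data" version="0.4">
  <TextCorpus xmlns="http://www.dspin.de/data/textcorpus" lang="de">
    <tokens>
      <token ID="t_0">Karl</token>
      <token ID="t_1">Marx</token>
      <token ID="t_2">wurde</token>
      <token ID="t_3">in</token>
      <token ID="t_4">Trier</token>
      <token ID="t_5">geboren</token>
    </tokens>
    <namedEntities type="CoNLL2002">
      <entity class="PER" tokenIDs="t_0 t_1"/>
      <entity class="LOC" tokenIDs="t_4"/>
    </namedEntities>
  </TextCorpus>
</D-Spin>
"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def extract_entities(tcf_xml):
    """Return a dict that maps each entity class (e.g. 'PER') to a list of names."""
    root = ET.fromstring(tcf_xml)

    # First pass: remember the surface form of every token by its ID.
    tokens = {}
    for elem in root.iter():
        if local_name(elem.tag) == "token":
            tokens[elem.get("ID")] = elem.text or ""

    # Second pass: resolve the token IDs of each entity to readable text.
    entities = defaultdict(list)
    for elem in root.iter():
        if local_name(elem.tag) == "entity":
            ids = elem.get("tokenIDs", "").split()
            text = " ".join(tokens[i] for i in ids if i in tokens)
            entities[elem.get("class")].append(text)
    return dict(entities)


if __name__ == "__main__":
    for entity_class, names in extract_entities(SAMPLE_TCF).items():
        print(entity_class, names)
```

Matching on local tag names rather than on a fixed namespace keeps the sketch usable even if the namespace in your file differs from the one assumed in the sample. For a real download, read the file content into a string (or use `ET.parse` on the file path) and pass it through the same two passes.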