Manual annotation

CATMA’s Annotate component enables you to annotate texts for the purpose of analysis. Annotation is done by highlighting parts of your text and then assigning Tags to them—so you don’t have to physically insert a Tag into the text. You can either choose Tags from an existing Tagset or create new Tags while annotating. Every assignment of a Tag allows you to choose individual values for its Properties. An assignment will be visualized as an underlining of the annotated text using the Tag’s color:


Screenshot with detail view of the CATMA Annotate module. The word “Snoopy” was annotated as ‘ANIMAL’, a Tag whose Property can take one of two values: ‘dog’ or ‘cat’.

To understand how annotation works in CATMA, four key concepts are important. We illustrate them with a simple example sentence:

Snoopy had lunch, and Tigger had breakfast.

The four key concepts are:

  • Annotation: Suppose you want to make explicit that the word Snoopy in this particular sentence refers to an animal: you can assign it an individual <Animal> annotation. Tigger is an animal as well: so he’ll get the second <Animal> annotation.
  • Tag: We’ve now used the Tag <Animal> twice—once as the annotation attached to the word Snoopy, and a second time as the annotation attached to the word Tigger.
  • Tag Property: Both Snoopy and Tigger are words for animals—but we also want to make explicit that the former is a dog, and the latter a cat. We can do so by specifying our Tags through individual Property values, such as “dog” and “cat.”
  • Tagset: As we interpret the sentence, both Snoopy and Tigger had a meal. To make this reading explicit through annotation, we’ll define another Tag called <Meal> and assign it to both the words lunch and breakfast (let’s keep it at the general level this time: we won’t assign Properties). Our two Tags <Animal> and <Meal> form a Tagset which is linked to our example text. We can extend, save and reuse a Tagset as part of our annotation vocabulary across texts.
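
To make the relationship between the four concepts more tangible, here is a minimal Python sketch of the example. It is not CATMA’s actual data model or API: the class names, the annotate helper, the Tagset name and the Property name “species” are all assumptions made for illustration. What it does reflect is the idea described above: the text stays untouched, and each annotation only records which Tag applies to which stretch of text, plus its Property values.

```python
from dataclasses import dataclass, field

TEXT = "Snoopy had lunch, and Tigger had breakfast."


@dataclass(frozen=True)
class Tag:
    """A reusable category such as <Animal>, with the Properties it allows."""
    name: str
    property_names: tuple = ()  # hypothetical: names of allowed Properties


@dataclass
class Tagset:
    """A named collection of Tags that can be reused across texts."""
    name: str
    tags: dict = field(default_factory=dict)

    def add(self, tag):
        self.tags[tag.name] = tag
        return tag


@dataclass
class Annotation:
    """One assignment of a Tag to a stretch of text, stored as character offsets."""
    tag: Tag
    start: int
    end: int
    properties: dict = field(default_factory=dict)

    def covered_text(self, text):
        return text[self.start:self.end]


def annotate(text, phrase, tag, **properties):
    """Annotate the first occurrence of `phrase` without modifying the text."""
    unknown = set(properties) - set(tag.property_names)
    if unknown:
        raise ValueError(f"Tag {tag.name} has no Properties {sorted(unknown)}")
    start = text.index(phrase)
    return Annotation(tag, start, start + len(phrase), properties)


# Two Tags form the Tagset; only <Animal> carries a Property.
tagset = Tagset("Example Tagset")
animal = tagset.add(Tag("Animal", property_names=("species",)))
meal = tagset.add(Tag("Meal"))

# Four annotations: each Tag is used twice.
annotations = [
    annotate(TEXT, "Snoopy", animal, species="dog"),
    annotate(TEXT, "Tigger", animal, species="cat"),
    annotate(TEXT, "lunch", meal),
    annotate(TEXT, "breakfast", meal),
]

for a in annotations:
    print(a.tag.name, repr(a.covered_text(TEXT)), a.properties)
```

Note that the Tag <Animal> exists only once, while the two annotations that use it each carry their own Property value.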

CATMA supports the full range of annotation varieties, including:

  • low-level markup for text structuring elements such as paragraphs and linguistic categories, e.g. morphemes;
  • “hermeneutic markup” (Piez, 2010) for higher-level semantic phenomena;
  • free form text comments.

In addition, CATMA allows for overlapping and multi-layered markup created by one or more annotators. Annotations are not restricted to word boundaries and can be applied to text chunks of any size; even discontinuous annotations are possible. CATMA does not make any assumptions about the nature of the annotations, so it is possible and perfectly reasonable even for single annotators to contradict themselves, e.g. to express different readings of a text.

Moreover, you can assign more than one annotation to a selected text segment. Suppose you wanted to annotate a chosen text segment like Tigger in our example sentence with the Tag <Animal>: in CATMA you can then also add a second (third, nth …) annotation to that very segment, or to any part of it, using other Tags like “ally,” “opponent,” “fictitious character,” etc. In other words: CATMA places no limit on how many annotations a segment can carry. You, or somebody else, might even decide that Snoopy is in fact NOT an animal, but a human being. In CATMA you can do this and preserve both annotation variants. That’s why we call CATMA “undogmatic”!
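
The following Python sketch illustrates how overlapping, discontinuous and contradicting annotations can coexist when they are stored separately from the text as character ranges. It is an illustration under stated assumptions, not CATMA’s internal format: the Annotation class, the helper functions, the annotator labels and the Tag “Meal event” are hypothetical.

```python
from dataclasses import dataclass

TEXT = "Snoopy had lunch, and Tigger had breakfast."


@dataclass(frozen=True)
class Annotation:
    """A Tag name applied to one or more (start, end) character ranges."""
    tag: str
    ranges: tuple          # several ranges make the annotation discontinuous
    annotator: str = "unknown"

    def covered_text(self, text):
        return " ... ".join(text[s:e] for s, e in self.ranges)


def span(text, phrase):
    """Return the (start, end) offsets of the first occurrence of `phrase`."""
    start = text.index(phrase)
    return (start, start + len(phrase))


def annotations_at(annotations, offset):
    """All annotations covering a given character offset, in insertion order."""
    return [a for a in annotations
            if any(s <= offset < e for s, e in a.ranges)]


annotations = [
    # Two contradicting readings of the same word are both preserved.
    Annotation("Animal", (span(TEXT, "Snoopy"),), "annotator A"),
    Annotation("Human", (span(TEXT, "Snoopy"),), "annotator B"),
    # Several Tags on the very same segment.
    Annotation("Animal", (span(TEXT, "Tigger"),), "annotator A"),
    Annotation("fictitious character", (span(TEXT, "Tigger"),), "annotator A"),
    # A discontinuous annotation that overlaps the ones on "Snoopy".
    Annotation("Meal event", (span(TEXT, "Snoopy"), span(TEXT, "lunch")),
               "annotator A"),
]

for a in annotations_at(annotations, TEXT.index("Snoopy")):
    print(a.tag, "|", a.covered_text(TEXT), "|", a.annotator)
```

Because no annotation is written into the text itself, adding a new reading never requires changing or removing an existing one.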

Here’s an example of a more profound text excerpt: William Faulkner’s A Rose for Emily, which was annotated by Lena Schüch (2012) using narratological categories to investigate the complicated time structure of the story:


Rich CATMA annotation of “A Rose for Emily” using tags, sub-tags and properties



  • Piez, Wendell (2010). “Towards Hermeneutic Markup: an Architectural Outline.” Digital Humanities 2010. Conference Abstracts. London: Office for Humanities Communication, Centre for Computing in the Humanities, King’s College London, pp. 202–205.
  • Schüch, Lena (2012). “›Tagging in a huge meadow of time‹ – Computergestützte Analyse der Zeit in literarischen Texten mit Hilfe des Programms CATMA.” Journal of Literary Theory, Conference Proceedings. URL = [last seen: 19.12.2016]