The following standards, which serve the general aim of increasing the searchability and recoverability of information, guide the design and compilation of ParCorOE.
Standard 1: Alignment
An aligned Old English-English parallel corpus consists of a parallel text, that is to say, an Old English text placed alongside its translation into Present-Day English, with alignment at text, sentence and word level, in such a way that each source language segment is paired with a target language segment. Word, sentence, and text alignment requires tokenisation at these three structural levels. Alignment pairings should be marked by highlighting the source and the target segment.
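As an illustration of word-level alignment within one sentence pair, the following minimal sketch pairs source and target tokens by index. The class `AlignedSentence`, the function `tokenise`, the whitespace-based tokenisation and the example sentence are all assumptions made for illustration; they do not describe the actual implementation of ParCorOE.

```python
from dataclasses import dataclass, field


def tokenise(sentence: str) -> list[str]:
    """Split on whitespace and strip surrounding punctuation (a simplification)."""
    stripped = (word.strip(".,;:!?") for word in sentence.split())
    return [word for word in stripped if word]


@dataclass
class AlignedSentence:
    """One sentence pair with word-level links (hypothetical structure)."""
    source_tokens: list[str]  # Old English words
    target_tokens: list[str]  # Present-Day English words
    # Each link is (index into source_tokens, index into target_tokens).
    word_links: list[tuple[int, int]] = field(default_factory=list)

    def pairs(self) -> list[tuple[str, str]]:
        """Return the aligned (source word, target word) pairs."""
        return [
            (self.source_tokens[s], self.target_tokens[t])
            for s, t in self.word_links
        ]


# Illustrative sentence pair; the links are entered by hand here.
example = AlignedSentence(
    source_tokens=tokenise("Se cyning rad."),
    target_tokens=tokenise("The king rode."),
    word_links=[(0, 0), (1, 1), (2, 2)],
)
print(example.pairs())
```

Storing links as index pairs rather than as strings allows one source word to be linked to several target words, or to none, without changing the structure.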
Standard 2: Annotation
Three types of annotation must be distinguished: markup at text level, and syntactic annotation and morphological tagging at sentence/word level. Fragments (tokens) consist of at least one sentence or one syntactically independent period and are identified by means of a text number.
Standard 3: Lemmatisation
The corpus must be fully lemmatised, so that all the textual attestations are grouped under the relevant lemma, and each lemma is provided with all its inflections.
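The grouping of attestations under lemmas can be sketched as follows. The attestation rows, the text numbers and the function `group_by_lemma` are hypothetical; the sketch only shows the general idea of collecting every attested inflection under its lemma.

```python
from collections import defaultdict

# Hypothetical attestations: (inflected form, lemma, text number).
attestations = [
    ("cyning", "cyning", "T1"),
    ("cyninges", "cyning", "T1"),
    ("cyninge", "cyning", "T2"),
    ("rad", "ridan", "T1"),
]


def group_by_lemma(rows):
    """Collect the distinct inflected forms attested for each lemma."""
    grouped = defaultdict(set)
    for form, lemma, _text in rows:
        grouped[lemma].add(form)
    # Sort the forms so that the listing is stable across runs.
    return {lemma: sorted(forms) for lemma, forms in grouped.items()}


lemma_index = group_by_lemma(attestations)
print(lemma_index["cyning"])
```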
Standard 4: Automation
Within the limits imposed by the available written standards and the variation that they present, the annotation of the parallel corpus must be automatic. This includes not only syntactic annotation and morphological tagging, but also the necessary lemmatisation. Lemmas and inflections must be listed dynamically.
Standard 5: Feeding
The corpus must be fed with the information available from The Knowledge Base of Old English (OEKB). The parallel corpus may retrieve information from the relational databases in OEKB in order to maximise the automation of the tasks of tagging, annotation and lemmatisation.
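A minimal sketch of how a tagging step might retrieve a lemma and a morphological tag from a relational database is given below. The table name `inflections`, its columns, the tag labels and the sample rows are assumptions made for illustration only; they do not reproduce the schema of the OEKB. Python's standard `sqlite3` module stands in for whatever database software is actually used.

```python
import sqlite3

# In-memory database standing in for a relational table of inflections.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inflections (form TEXT, lemma TEXT, tag TEXT)")
conn.executemany(
    "INSERT INTO inflections VALUES (?, ?, ?)",
    [
        ("cyning", "cyning", "N"),
        ("cyninges", "cyning", "N"),
        ("rad", "ridan", "V"),
    ],
)


def lookup(form):
    """Return every (lemma, tag) pair recorded for an inflected form."""
    cursor = conn.execute(
        "SELECT lemma, tag FROM inflections WHERE form = ?", (form,)
    )
    return cursor.fetchall()


print(lookup("cyninges"))
```

A form that is absent from the table returns an empty list, which a tagging routine could treat as a case for manual revision.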
Standard 6: Searchability
The corpus must be searchable by text, fragment and word, as well as by morphological tag and syntactic annotation. Combined searches by inflectional form and lemma are also required. The corpus must be based on a concordance and an index, so that the main layouts are interconnected.
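A combined search by inflectional form and lemma over a concordance with supporting indexes can be sketched as follows. The row layout, the tag labels and the names `build_index` and `search` are hypothetical and serve only to illustrate how an index interconnects with the concordance.

```python
from collections import defaultdict

# Hypothetical concordance rows: (text, fragment, form, lemma, tag).
concordance = [
    ("T1", 1, "se", "se", "DET"),
    ("T1", 1, "cyning", "cyning", "N"),
    ("T1", 1, "rad", "ridan", "V"),
    ("T2", 4, "cyninges", "cyning", "N"),
]

FORM, LEMMA = 2, 3  # column positions within a concordance row


def build_index(rows, column):
    """Map each value found in a column to the positions of its rows."""
    index = defaultdict(set)
    for position, row in enumerate(rows):
        index[row[column]].add(position)
    return index


by_form = build_index(concordance, FORM)
by_lemma = build_index(concordance, LEMMA)


def search(form=None, lemma=None):
    """Return the concordance rows that match every criterion supplied."""
    hits = set(range(len(concordance)))
    if form is not None:
        hits &= by_form.get(form, set())
    if lemma is not None:
        hits &= by_lemma.get(lemma, set())
    return [concordance[i] for i in sorted(hits)]


print(search(lemma="cyning"))
print(search(form="cyninges", lemma="cyning"))
```

Intersecting the sets of row positions means that further criteria, such as morphological tag or text number, could be added by building one more index over the relevant column.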
Standard 7: Dissemination
The corpus must be available online in open access and must be searchable with an Internet browser. Users should not need training or previous experience with database software in order to search ParCorOE.