anotator module

anotator.dataset_NER_prepocess(dataset)

Preprocess a dataset before training NER.

Assumes that a clean dataset of entities should not contain verbs, adverbs, adjectives, or random symbols.

Parameters: dataset (list) – list of strings for NER training
Returns: processed dataset if successful, None otherwise
Return type: list
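The filtering assumption above can be illustrated with a minimal sketch. It is not the module's implementation: `preprocess_sketch`, `toy_tagger`, and the `pos_tagger` argument are hypothetical names introduced here, and the tagger is assumed to return (token, POS) pairs using Universal POS tag names such as VERB, ADV, ADJ, and SYM.

```python
import re

# Coarse POS tags assumed to be unwanted inside an entity string.
UNWANTED_POS = {"VERB", "ADV", "ADJ", "SYM"}


def preprocess_sketch(dataset, pos_tagger):
    """Hypothetical stand-in for dataset_NER_prepocess (not the real code).

    pos_tagger is an assumed callable: str -> list of (token, pos) pairs.
    Returns the cleaned list, or None if the input is not a list.
    """
    if not isinstance(dataset, list):
        return None
    cleaned = []
    for text in dataset:
        kept = [
            token
            for token, pos in pos_tagger(text)
            # Drop unwanted parts of speech and tokens made of stray symbols.
            if pos not in UNWANTED_POS and re.fullmatch(r"[\w'-]+", token)
        ]
        if kept:  # skip strings that end up empty
            cleaned.append(" ".join(kept))
    return cleaned


def toy_tagger(text):
    """Tiny lookup-based tagger, for demonstration only."""
    lexicon = {"quickly": "ADV", "red": "ADJ", "runs": "VERB"}
    return [(tok, lexicon.get(tok.lower(), "NOUN")) for tok in text.split()]


result = preprocess_sketch(["red Acme Corp", "runs quickly", "Berlin @@"], toy_tagger)
```

In practice the toy tagger would be replaced by a real part-of-speech tagger; the sketch only shows the shape of the filtering step.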
anotator.dataset_to_spacy(db, entity_label)

Convert a dataset into a format that spaCy can train on.

Parameters:
  • db (list) – list of strings for NER training
  • entity_label (str) – designated label for the Entity
Returns: the spaCy training-ready list if successful, None otherwise
Return type: list
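For orientation, the sketch below shows one plausible shape of the output. It assumes the spaCy v2-style training format, a list of (text, {"entities": [(start, end, label)]}) tuples, and further assumes that each input string is a single entity spanning the whole string. `to_spacy_sketch` is a hypothetical name, not the module's function, and the real implementation may differ.

```python
def to_spacy_sketch(db, entity_label):
    """Hypothetical stand-in for dataset_to_spacy (not the real code).

    Assumes every string in db is one whole entity, so the entity span
    covers character offsets 0..len(text).
    Returns None on invalid input, mirroring the documented behaviour.
    """
    if not isinstance(db, list) or not isinstance(entity_label, str):
        return None
    training_data = []
    for text in db:
        text = text.strip()
        if not text:
            continue  # skip empty entries
        annotations = {"entities": [(0, len(text), entity_label)]}
        training_data.append((text, annotations))
    return training_data


examples = to_spacy_sketch(["Acme Corp", "  Berlin "], "ORG")
```

Each tuple pairs the raw text with its character-offset annotations, which is the structure older spaCy training loops consume directly.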