A PRE-ANNOTATION TOOL FOR EVENT EXTRACTION IN CARDIOLOGY AMBULATORY TEXTS IN BRAZILIAN PORTUGUESE
Abstract
Clinical texts contain patient information that is not recorded elsewhere, so tools that automatically extract this information can support better and more personalized patient care. Building such tools demands advanced machine learning techniques that require annotated data, and producing that data is a burdensome manual process. To reduce the burden of manually annotating every mention in the texts, we developed a dictionary-based pre-annotation tool to support our event annotation of cardiology ambulatory texts. The tool relies on a dictionary built during the annotators' training phase and the four rounds of the annotation process. Three annotators from the medical degree program annotated 126 texts. We evaluated pre-annotation performance using the inter-annotator agreement, the annotation time, the annotation speed, and the pre-annotation coverage (the number of correct pre-annotations present in the gold standard). We conclude that refining the dictionary benefited pre-annotation: it raised the pre-annotation coverage without reducing the inter-annotator agreement. Annotation time also decreased over the rounds, which is expected as the annotators became familiar with the annotation guideline and the annotation tool.
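As an illustration of the two ideas named above, dictionary-based pre-annotation and pre-annotation coverage, the following is a minimal Python sketch and not the tool described in this work. The function names `pre_annotate` and `count_correct`, the label `Problem`, the sample sentence, and the matching policy (case-insensitive, whole-word, longest term first, exact span match against the gold standard) are all assumptions made for the example.

```python
import re


def pre_annotate(text, dictionary):
    """Return sorted (start, end, label) spans for dictionary terms found in text.

    `dictionary` maps a surface term to an event label (hypothetical format).
    """
    spans = []
    # Try longer terms first so multi-word entries win over their substrings.
    for term in sorted(dictionary, key=len, reverse=True):
        # Whole-word match: the term must not be glued to other word characters.
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            start, end = match.span()
            # Keep the match only if it does not overlap an accepted span.
            if all(end <= s or start >= e for s, e, _ in spans):
                spans.append((start, end, dictionary[term]))
    return sorted(spans)


def count_correct(pre_annotations, gold):
    """Count pre-annotations that appear unchanged in the gold standard.

    Assumes "correct" means identical offsets and label.
    """
    return len(set(pre_annotations) & set(gold))


# Hypothetical dictionary and sentence, for illustration only.
dictionary = {
    "dor": "Problem",
    "dor torácica": "Problem",
    "hipertensão arterial": "Problem",
}
text = "Paciente com dor torácica e hipertensão arterial."
spans = pre_annotate(text, dictionary)
print([text[s:e] for s, e, _ in spans])
```

Under these assumptions, the shorter entry `dor` is suppressed wherever the longer entry `dor torácica` already matched. Dividing the result of `count_correct` by the number of gold-standard annotations would turn the count into a ratio; whether coverage was reported as a count or a ratio is not stated here.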