
Table 5 Overview of all used models, including important advantages and disadvantages

From: Diagnosis extraction from unstructured Dutch echocardiogram reports using span- and document-level characteristic classification

| Model | Advantages | Disadvantages |
| --- | --- | --- |
| Span classification | | |
| Approximate list lookup | Transparent, flexible, fast, easy to implement | Time-consuming, operator-dependent, cannot generalize beyond the provided list |
| MedCAT (biLSTM) | Can extract medical concepts and their relationships, leveraging knowledge from existing ontologies | Can have limited adaptability to new terms |
| SpanCategorizer | Uses pooling to make the model more robust, optimized for span classification | More complex model design may require tuning for optimal results |
| Document classification | | |
| Bag-of-words | Works well with sparse data, simple, easy to implement | Ignores word order and context, which can lead to loss of information |
| Span classifier heuristic | Allows span-level analysis of results | Suboptimal performance due to the increased complexity of span classification |
| SetFit | Effective learning from limited data due to few-shot learning | May require hyperparameter tuning to yield optimal results |
| MedRoBERTa.nl | Pre-trained on Dutch medical text, provides a strong starting point, capable of capturing context | Requires significant computational resources, may need further domain-specific adaptations |
| Bidirectional GRU | Captures context backward and forward | May require extensive training to avoid overfitting |
| Bidirectional CNN | Effective at extracting local patterns and features | May struggle with long-range dependencies |

Abbreviations: CNN convolutional neural network, GRU gated recurrent unit, LSTM long short-term memory, MedCAT medical concept annotation tool
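
Two of the disadvantages listed in Table 5 can be made concrete with a small sketch. It is not the authors' implementation: the function names `approximate_lookup` and `bag_of_words`, the 0.8 similarity threshold, and the example terms are all assumptions chosen for illustration. The lookup uses Python's standard `difflib.SequenceMatcher` as a stand-in for whatever approximate matching the study used.

```python
from collections import Counter
from difflib import SequenceMatcher


def approximate_lookup(token, terms, threshold=0.8):
    """Return the listed term most similar to `token`, or None.

    Hypothetical helper: the threshold of 0.8 is an assumed value.
    """
    best_term, best_score = None, 0.0
    for term in terms:
        # ratio() is a similarity score between 0.0 and 1.0
        score = SequenceMatcher(None, token.lower(), term.lower()).ratio()
        if score > best_score:
            best_term, best_score = term, score
    return best_term if best_score >= threshold else None


def bag_of_words(text):
    """Count whitespace-separated tokens, discarding their order."""
    return Counter(text.lower().split())


# Illustrative term list (assumed, not taken from the study).
terms = ["aortastenose", "mitralisinsufficientie"]

# A misspelling is still matched to the listed term ...
typo_match = approximate_lookup("aortastenoze", terms)

# ... but a synonym that is absent from the list is not found.
synonym_match = approximate_lookup("klepvernauwing", terms)

# Two sentences with opposite meaning yield the same bag-of-words.
same_bag = (
    bag_of_words("no stenosis but regurgitation")
    == bag_of_words("no regurgitation but stenosis")
)

print(typo_match, synonym_match, same_bag)
```

The misspelled token resolves to a listed term because its character overlap is high, whereas the unlisted synonym returns `None`, which is the "cannot generalize beyond the provided list" limitation. The final comparison shows why ignoring word order can lose information: both sentences contain the same tokens but describe different findings.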