Model | Advantages | Disadvantages |
---|---|---|
Span classification | ||
Approximate list lookup | Transparent, flexible, fast, easy to implement | Time-consuming, operator-dependent, cannot generalize beyond provided list |
MedCAT (biLSTM) | Can extract medical concepts and their relationships, leveraging knowledge from existing ontologies | Can have limited adaptability to new terms |
SpanCategorizer | Uses pooling to make the model more robust, optimized for span classification | More complex model design may require tuning for optimal results |
Document classification | ||
Bag-of-words | Works well with sparse data, simple, easy to implement | Ignores word order and context, which can lead to loss of information |
Span classifier heuristic | Allows span-level analysis of results | Suboptimal performance due to the increased complexity of span classification |
SetFit | Learns effectively from limited data through few-shot learning | May require hyperparameter tuning to yield optimal results |
MedRoBERTa.nl | Pre-trained on Dutch medical text, providing a strong starting point; capable of capturing context | Requires significant computational resources, may need further domain-specific adaptations |
Bidirectional GRU | Captures context backward and forward | May require extensive training to avoid overfitting |
Bidirectional CNN | Effective at extracting local patterns and features | May struggle with long-range dependencies |
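
Two of the trade-offs listed in the table can be illustrated with a minimal sketch: approximate list lookup tolerates spelling variation but cannot generalize beyond the provided list, and a bag-of-words representation ignores word order. The term list, the similarity threshold of 0.85, and the function names below are assumptions made for illustration only; they are not taken from the implementations compared here.

```python
from collections import Counter
from difflib import SequenceMatcher

# Hypothetical term list; in practice it would be curated by a domain expert.
TERMS = ["hypertension", "diabetes mellitus", "pneumonia"]


def approximate_lookup(text, terms=TERMS, threshold=0.85):
    """Return (candidate, term) pairs whose string similarity meets the threshold."""
    tokens = text.lower().split()
    matches = []
    for term in terms:
        n = len(term.split())  # compare windows with the same number of tokens
        for i in range(len(tokens) - n + 1):
            candidate = " ".join(tokens[i:i + n])
            if SequenceMatcher(None, candidate, term).ratio() >= threshold:
                matches.append((candidate, term))
    return matches


def bag_of_words(text):
    """Count tokens, discarding their order."""
    return Counter(text.lower().split())


# A misspelling is still matched, because the lookup is approximate.
misspelled = approximate_lookup("patient has hypertention and pneumonia")

# A synonym that is not on the list is missed entirely.
synonym = approximate_lookup("patient has high blood pressure")

# Two sentences with different word order get the same representation.
same_bag = bag_of_words("no fever but cough") == bag_of_words("no cough but fever")
```

In this sketch the misspelled "hypertention" is matched to "hypertension", the unlisted phrase "high blood pressure" yields no match, and the two reordered sentences produce identical bags of words.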