Overview
3.X is a new vision for the scrubber.
As we approach diminishing returns from improving the regex and dictionary approach, we are moving towards statistical methods that learn from large bodies of medical information, such as publications and UMLS dictionaries.
3.X Diagram In Progress
Use Case: Tagging Noun Phrases and UMLS concepts
Precondition:
- Training Data: GENIA, Penn Treebank, Mayo Source
- Software: cTAKES, using its POS tagger and UMLS CUI extractor features
...
- Input document (either a medical note or a publication) will be POS-tagged and annotated with UMLS CUIs.
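To make the expected output of this use case more concrete, here is a minimal sketch of what a POS-tagged, concept-annotated document could look like and how noun phrases might be grouped from it. This is not the cTAKES API: the `Token` class, the `noun_phrases` helper, and the `CUI-PLACEHOLDER` value are all hypothetical, and the grouping rule (runs of determiner, adjective, and noun tags in Penn Treebank style) is a deliberate simplification of real noun phrase chunking.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Token:
    """Hypothetical token record; not the cTAKES data model."""
    text: str
    pos: str                    # Penn Treebank style POS tag, e.g. "NN", "JJ"
    cui: Optional[str] = None   # UMLS concept identifier, if one was matched


def _has_noun(tokens: List[Token]) -> bool:
    return any(t.pos.startswith("NN") for t in tokens)


def noun_phrases(tokens: List[Token]) -> List[List[Token]]:
    """Group maximal runs of DT/JJ/NN* tokens that contain at least one noun."""
    phrases: List[List[Token]] = []
    current: List[Token] = []
    for tok in tokens:
        if tok.pos in {"DT", "JJ"} or tok.pos.startswith("NN"):
            current.append(tok)
        else:
            if _has_noun(current):
                phrases.append(current)
            current = []
    if _has_noun(current):
        phrases.append(current)
    return phrases


# Hand-tagged example input; the concept identifier is a placeholder, not a real CUI.
doc = [
    Token("The", "DT"),
    Token("patient", "NN"),
    Token("reports", "VBZ"),
    Token("severe", "JJ"),
    Token("headache", "NN", cui="CUI-PLACEHOLDER"),
    Token(".", "."),
]

for phrase in noun_phrases(doc):
    words = " ".join(t.text for t in phrase)
    concepts = [t.cui for t in phrase if t.cui]
    print(words, concepts)
```

In a real pipeline the tags and concept identifiers would come from the tagger and extractor named in the preconditions rather than being written by hand.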
Use Case: Meta-analysis of text
Precondition:
- Tagging Noun Phrases
- Scrubber configured (with or without local dictionary/regex mods)
...
- Text is processed by more than one algorithm ("ham vs spam")
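One possible reading of this use case is that the same text is passed through several independent detectors and their verdicts are compared token by token. The sketch below illustrates that idea only. Both detectors, the vocabulary, and the `compare` helper are hypothetical stand-ins, not the scrubber's actual rules, and the "ham vs spam" labelling is assumed to mean keep versus flag.

```python
import re
from typing import Callable, Dict, List, Tuple

Detector = Callable[[str], bool]

# Hypothetical rule: tokens shaped like a numeric date.
DATE_LIKE = re.compile(r"^\d{1,2}/\d{1,2}/\d{2,4}$")

# Hypothetical stand-in for a medical vocabulary lookup.
KNOWN_MEDICAL_TERMS = {"headache", "aspirin", "fever"}


def regex_detector(token: str) -> bool:
    """Flag tokens that look like dates."""
    return bool(DATE_LIKE.match(token))


def capitalized_unknown_detector(token: str) -> bool:
    """Flag capitalized tokens that are not in the known vocabulary."""
    return token[:1].isupper() and token.lower() not in KNOWN_MEDICAL_TERMS


DETECTORS: Dict[str, Detector] = {
    "regex": regex_detector,
    "capitalized_unknown": capitalized_unknown_detector,
}


def compare(text: str, detectors: Dict[str, Detector]) -> List[Tuple[str, List[str]]]:
    """Return each whitespace token with the names of the detectors that flagged it."""
    return [
        (token, [name for name, detect in detectors.items() if detect(token)])
        for token in text.split()
    ]


for token, flagged_by in compare("Seen on 01/02/2013 for headache", DETECTORS):
    print(token, flagged_by)
```

Tokens on which the detectors disagree are the ones worth reviewing when comparing algorithms; how the verdicts should be combined is left open here.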
Proof of Principle: Demo
Office Excel