...
Code Block |
---|
McMurry* AJ, Fitch* B, Savova G, Kohane IS, Reis BY. “Improved de-identification of physician notes through integrative modeling of both identifying and non-identifying medical text”, BMC Medical Informatics and Decision Making. Accepted with minor revisions, Jan 2013. |
...
Venn Diagram
Example text block -> Feature Set
Info |
---|
3.X is a new vision for the scrubber. As we approached diminishing returns from improving regular expressions and whitelists/blacklists, we shifted towards a machine learning approach that learns from large bodies of medical text drawn from publications and UMLS dictionaries. |
...
Venn Diagram
System Use Cases
...
Use Case: Tagging Noun Phrases and UMLS concepts
...
Precondition:
- Training Data
- Software: cTakes, using its part-of-speech tagger and UMLS CUID extractor
Steps:
- Block of text is sent to cTakes
- cTakes processing returns, for each token:
  - start & end position of all POS tags
  - part of speech
- Nouns are of most interest because of PHI
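
The steps above can be sketched as follows. This is a minimal illustration, not the cTakes API: the `TaggedToken` record, the `tag_offsets` and `noun_spans` helpers, and the hand-written Penn Treebank tags are all assumptions standing in for real tagger output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedToken:
    """Hypothetical record for one POS-tagged token (not a cTakes type)."""
    start: int  # character offset where the token begins
    end: int    # character offset just past the token
    pos: str    # part-of-speech tag (Penn Treebank style assumed)


def tag_offsets(text, tagged_words):
    """Attach start/end offsets to (word, pos) pairs by scanning left to right."""
    tokens = []
    cursor = 0
    for word, pos in tagged_words:
        start = text.index(word, cursor)
        end = start + len(word)
        tokens.append(TaggedToken(start, end, pos))
        cursor = end
    return tokens


def noun_spans(tokens):
    """Keep only noun tokens (tags beginning with NN), the likeliest PHI carriers."""
    return [(t.start, t.end) for t in tokens if t.pos.startswith("NN")]


# Hand-tagged stand-in for tagger output; the name is fictional.
text = "Patient John Smith visited Boston clinic"
tagged = [
    ("Patient", "NN"), ("John", "NNP"), ("Smith", "NNP"),
    ("visited", "VBD"), ("Boston", "NNP"), ("clinic", "NN"),
]
tokens = tag_offsets(text, tagged)
nouns = [text[s:e] for s, e in noun_spans(tokens)]
```

Keeping offsets rather than token strings lets later scrubber stages refer back to exact positions in the original note.
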
Post-condition:
...
Use Case: Meta-analysis of text
...
Precondition:
- Tagging Noun Phrases
- Scrubber configured (with or without local dictionary/regex modifications)
Steps:
- Each "scrubber" implementation produces Recorder output
  - Passthrough Impl
  - Regex
  - Word lists
  - cTakes Impl (OpenNLP)
    - Noun phrases
    - UMLS CUIDs
- Performance evaluation (ROC)
  - Scrubber standalone
  - Scrubber word lists limited by detected noun phrases
  - Scrubber word lists limited by detected noun phrases and non-UMLS concepts
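
The three evaluation configurations can be compared as points in ROC space. The sketch below assumes each configuration yields a set of token indices flagged as PHI; the `roc_point` helper and all of the sets are made-up illustrations, not measured results.

```python
def roc_point(predicted, actual, universe):
    """Return (false positive rate, true positive rate) for one configuration."""
    tp = len(predicted & actual)
    fp = len(predicted - actual)
    fn = len(actual - predicted)
    tn = len(universe - predicted - actual)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return fpr, tpr


# Hypothetical toy data: ten tokens, three of which are truly PHI.
universe = set(range(10))
actual_phi = {1, 2, 4}
word_list_hits = {1, 2, 4, 5, 7}   # tokens flagged by word lists alone
noun_tokens = {0, 1, 2, 4, 5}      # tokens inside detected noun phrases
umls_tokens = {5}                  # tokens that map to a UMLS concept

configurations = {
    "standalone": word_list_hits,
    "noun_phrases": word_list_hits & noun_tokens,
    "noun_phrases_non_umls": (word_list_hits & noun_tokens) - umls_tokens,
}
points = {
    name: roc_point(flagged, actual_phi, universe)
    for name, flagged in configurations.items()
}
```

In this toy data each added restriction removes false positives without losing true positives; real notes would have to be evaluated to see whether that holds.
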
Post-condition:
- Text is processed by more than one algorithm ("ham vs. spam")
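
One way to picture several algorithms processing the same text is to merge each implementation's output into one feature row per token. This is only a sketch: the function names, the date pattern, and the name list are hypothetical and do not reflect the scrubber's actual Recorder format.

```python
import re

# Hypothetical pattern and word list, for illustration only.
DATE_RE = re.compile(r"^\d{1,2}/\d{1,2}/\d{4}$")
NAME_LIST = {"smith", "jones"}


def passthrough(token):
    """Record the token unchanged."""
    return {"token": token}


def regex_impl(token):
    """Flag tokens that look like a date."""
    return {"regex_date": bool(DATE_RE.match(token))}


def word_list_impl(token):
    """Flag tokens found in the name list."""
    return {"in_name_list": token.lower() in NAME_LIST}


IMPLEMENTATIONS = [passthrough, regex_impl, word_list_impl]


def feature_set(text):
    """Run every implementation on every whitespace token and merge the results."""
    rows = []
    for token in text.split():
        row = {}
        for impl in IMPLEMENTATIONS:
            row.update(impl(token))
        rows.append(row)
    return rows


rows = feature_set("Smith seen 01/02/2013")
```

Each row can then serve as the input to a classifier that decides PHI versus non-PHI, in the same spirit as a ham-versus-spam filter.
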
Example text block -> Feature Set
...
. |
...