Table of Contents

Intended usages

Default configuration

Info
We recommend starting with the default properties and the prebuilt train/test models. The train and test models are anonymized feature sets (not text) generated by the Scrubber runtime.

Developers can use Scrubber 3.X in "default mode" with the same settings as the provided train and test model files. Input and output settings are managed in scrubber.properties (file paths, database settings, method implementations).

Info
scrubber.properties lists all supported configuration options and features in one place. The Apache UIMA, Apache cTAKES, and WEKA distribution jars are loaded dynamically.
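
As a rough illustration of how KEY = VALUE settings of this kind can be read, here is a minimal sketch using the standard java.util.Properties class. It is not taken from the Scrubber source: the class name ScrubberConfigSketch and the keys example.input.dir and example.output.dir are hypothetical placeholders, so check scrubber.properties itself for the real key names.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Properties;
import java.util.TreeSet;

public class ScrubberConfigSketch {

    // Parse KEY = VALUE pairs from any character source.
    static Properties load(Reader reader) throws IOException {
        Properties props = new Properties();
        props.load(reader);
        return props;
    }

    public static void main(String[] args) throws IOException {
        Properties props;
        if (args.length > 0) {
            // Read the properties file whose path is given as the first argument.
            try (Reader r = Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.UTF_8)) {
                props = load(r);
            }
        } else {
            // Hypothetical keys, used only so the sketch runs without a file.
            String sample = "example.input.dir = data/notes\n"
                          + "example.output.dir = data/redacted\n";
            props = load(new StringReader(sample));
        }
        // Print the settings in sorted key order.
        for (String key : new TreeSet<>(props.stringPropertyNames())) {
            System.out.println(key + " = " + props.getProperty(key));
        }
    }
}
```

Passing the path to a local scrubber.properties as the first argument lists every setting it defines, which can be a quick way to confirm which file paths and database settings are in effect.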

...

Annotate word tokens and redact PHI from physician notes
cTAKES lexical parsing and medical dictionary annotation
WEKA multi-class decision tree classifier (plugin default)
Protege UI support for human expert curators (reads output)
Generate feature sets containing lexical properties, medical concept codes, and human defined rules

Models
Prebuilt train and test models can be imported into WEKA (default), MATLAB, or R
(default) Test your local physician notes without retraining
(optional) Retrain model using local physician note samples, publications, and medical dictionaries.

Classification
Distinguish (classify) private patient data from coded medical concepts and commonly used words

Compare Text
Compare lexical properties and distributions of public and private text sources
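
To make the idea of comparing word distributions concrete, the sketch below computes relative word frequencies for two text samples and the overlap of their vocabularies. It is an illustration only and not Scrubber's own comparison method: the class TextCompareSketch, its method names, the tokenization rule, and the sample sentences are all assumptions made for this example.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class TextCompareSketch {

    // Relative frequency of each lowercased word token in the text.
    static Map<String, Double> relativeFrequencies(String text) {
        Map<String, Integer> counts = new HashMap<>();
        int total = 0;
        // Split on anything that is not a letter or a digit (an assumed, simple tokenizer).
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (token.isEmpty()) {
                continue;
            }
            counts.merge(token, 1, Integer::sum);
            total++;
        }
        Map<String, Double> freqs = new HashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            freqs.put(e.getKey(), e.getValue() / (double) total);
        }
        return freqs;
    }

    // Jaccard overlap of two vocabularies: shared words divided by all distinct words.
    static double vocabularyOverlap(Set<String> a, Set<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 0.0;
        }
        Set<String> shared = new HashSet<>(a);
        shared.retainAll(b);
        Set<String> all = new HashSet<>(a);
        all.addAll(b);
        return shared.size() / (double) all.size();
    }

    public static void main(String[] args) {
        // Made-up sample sentences standing in for a private and a public source.
        Map<String, Double> note = relativeFrequencies("The patient has fever");
        Map<String, Double> article = relativeFrequencies("The study reports fever");
        System.out.println("overlap = " + vocabularyOverlap(note.keySet(), article.keySet()));
    }
}
```

A low overlap or sharply different frequencies for the same word hint that the two sources use language differently, which is the kind of signal a comparison of public and private text looks for.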

How To

Install / Train / Test / Scrub

Runtime guide (Office Word): scrubber-3.x-runtime-guide.doc
Scrubber Property KEY = VALUE

...
