
 *Title:SPIN Scrubber User Guide* 
 Author:Andrew McMurry 
 Contact:amcmurry@genetics.med.harvard.edu
\\
\\
!worddavbb43921d9d4a629f7d8d62dbaa286e20.png|height=115,width=575!
\\
\\
*Intended Audience:* 
Technical staff of all levels should be able to configure this module for use with their records. 
Programming experience is NOT required, though a basic understanding of XML is needed to edit the configuration file. 
\\
*Scrubber Overview:* 
This scrubber utility removes confidential identifiers from structured XML or plain text by comparing phrases in the input text against a list of known identifiers (names, states, etc.) and a series of regular expressions. 
\\
While typically used to make confidential reports compliant with [HIPAA|http://www.hhs.gov/ocr/hipaa/] standards, this utility is practical for any organization that wants to protect the privacy of its records, whether or not those records are used for medical purposes. 
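
The two matching strategies named in the overview (lookup against a list of known identifiers, plus regular expressions) can be illustrated with a minimal sketch. This is not the scrubber's actual implementation: the class name, the sample identifier list, the two patterns, and the REMOVED placeholder are all assumptions chosen for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch only: every name below is hypothetical, not the SPIN Scrubber API.
class ScrubSketch {
    // Hypothetical list of known identifiers, stored lower-cased for case-insensitive lookup.
    static final Set<String> KNOWN_IDENTIFIERS =
            new HashSet<String>(Arrays.asList("smith", "boston", "massachusetts"));

    // Hypothetical patterns: US-style social security and phone numbers.
    static final Pattern[] PATTERNS = {
        Pattern.compile("\\b\\d{3}-\\d{2}-\\d{4}\\b"),
        Pattern.compile("\\b\\d{3}-\\d{3}-\\d{4}\\b")
    };

    static final Pattern WORD = Pattern.compile("[A-Za-z]+");

    // Assumed placeholder; the real utility may mark removals differently.
    static final String PLACEHOLDER = "[REMOVED]";

    static String scrub(String text) {
        // Pass 1: replace anything matching one of the regular expressions.
        String result = text;
        for (Pattern p : PATTERNS) {
            result = p.matcher(result).replaceAll(Matcher.quoteReplacement(PLACEHOLDER));
        }

        // Pass 2: replace each word that appears in the known-identifier list.
        Matcher words = WORD.matcher(result);
        StringBuffer out = new StringBuffer();
        while (words.find()) {
            String word = words.group();
            boolean known = KNOWN_IDENTIFIERS.contains(word.toLowerCase(Locale.ENGLISH));
            words.appendReplacement(out, Matcher.quoteReplacement(known ? PLACEHOLDER : word));
        }
        words.appendTail(out);
        return out.toString();
    }
}
```

Under these assumptions, calling ScrubSketch.scrub on a sentence that mentions Smith, Boston, and a number such as 123-45-6789 replaces all three with the placeholder and leaves the remaining words untouched.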
\\
*Required Software:* 
Java 1.5 
\\
*Running the Scrubber Program:* 
\\
+Usage:+ 
scrubber _InputFile(s) \[ConfigurationFile\]_  
\\
+Where+ 

  • The InputFile(s) argument is either a single file or a directory.
  • The ConfigurationFile argument is optional. By default, scrubber searches for ScrubberConfiguration.xml in the same directory as the program's jar file.
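
The argument handling described above could look roughly like the following sketch. It is an assumption about behavior, not the scrubber's source: the class and method names are hypothetical, the jar directory is passed in explicitly rather than discovered, and directories are expanded without recursion because the guide does not say whether subdirectories are scanned.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: names are hypothetical, not the SPIN Scrubber API.
class ScrubberArgsSketch {
    // Default file name taken from the usage notes.
    static final String DEFAULT_CONFIG_NAME = "ScrubberConfiguration.xml";

    // Returns the explicit second argument if present,
    // otherwise the default file inside the jar's directory.
    static File resolveConfig(String[] args, File jarDir) {
        if (args.length >= 2) {
            return new File(args[1]);
        }
        return new File(jarDir, DEFAULT_CONFIG_NAME);
    }

    // A directory yields the regular files directly inside it (non-recursive: an assumption);
    // anything else is treated as a single input file.
    static List<File> resolveInputs(File input) {
        List<File> files = new ArrayList<File>();
        if (input.isDirectory()) {
            File[] children = input.listFiles();
            if (children != null) {
                for (File child : children) {
                    if (child.isFile()) {
                        files.add(child);
                    }
                }
            }
        } else {
            files.add(input);
        }
        return files;
    }
}
```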

...

  • The Scrubber Implementation setting specifies whether the scrubber should expect XML files or plain-text files. Valid settings are DEFAULT, XML_TEXT, or TEXT.
  • TextProcessingRules defines the rules for scrubbing text.
  • XMLTextProcessingRules defines the rules for scrubbing XML files. These rules extend TextProcessingRules.
  • The Recorder captures the identifier matches.
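
As a small illustration of the three valid Scrubber Implementation settings, the sketch below validates a setting string against them. The class and method names are hypothetical, and the exact-match, case-sensitive comparison is an assumption; the guide does not say how the scrubber itself parses this value or what DEFAULT selects.

```java
// Illustrative sketch only: names are hypothetical, not the SPIN Scrubber API.
class ImplementationSettingSketch {
    // The three values the guide lists as valid.
    enum Implementation { DEFAULT, XML_TEXT, TEXT }

    // Parses a setting string, rejecting anything outside the documented values.
    // Matching is exact apart from surrounding whitespace (an assumption).
    static Implementation parse(String raw) {
        if (raw == null) {
            throw new IllegalArgumentException("Scrubber Implementation setting is missing");
        }
        try {
            return Implementation.valueOf(raw.trim());
        } catch (IllegalArgumentException e) {
            throw new IllegalArgumentException(
                    "Invalid Scrubber Implementation '" + raw
                    + "'; expected DEFAULT, XML_TEXT, or TEXT");
        }
    }
}
```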

...