This project is actively sponsored by Cancer.gov to aid in sharing human biospecimens that meet select diagnostic and treatment criteria.
Status:
2005: Approved for use at four Harvard-affiliated teaching hospitals.
2006: Initial open source release for Pathology Diagnoses (Linux.com article).
2007: Completely rewritten API to improve performance, reproducibility, and hospital-specific customizations.
2008: Extended to support scrubbing other kinds of notes such as patient discharge summaries.
2009: Approved for use at two large HMO sites.
2010: Machine learning work begins, using millions of peer-reviewed publications to learn to distinguish "ham" (medical concepts) from "spam" (patient identifiers).
2011 Roadmap:
- Statistical evaluation of scrubber performance is underway for upcoming publications.
- Active development on De-ID improvements using corpus data.
- Active development on a new Concept Extraction module for Scrubber.
- Planning and use case documents for version 3.0.