The Shared Pathology Informatics Network (SPIN) was originally funded by the National Cancer Institute to link the vast collections of human specimens that are infrequently shared for cancer research. SPIN sets forth the institutional agreements and distributed database architecture needed to preserve institutional autonomy and protect patient privacy in accordance with HIPAA regulations. SPIN has successfully completed a feasibility study involving seven independent medical centers sharing millions of human specimens.

Using a peer-to-peer architecture, institutions become SPIN members (nodes) by securing institutional review board (IRB) approvals and deploying the SPIN software. At any time, an institution can withdraw from the network without leaving its data behind or disabling the network. A SPIN node can serve as a peer, which queries a local database, or as a supernode, which queries a network of child nodes.
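The peer and supernode roles described above can be sketched as a small tree of nodes. This is a minimal illustration, not the SPIN software: the class names `Peer` and `Supernode`, the `query` and `withdraw` methods, and the sample records are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Union

Predicate = Callable[[dict], bool]


@dataclass
class Peer:
    """Hypothetical leaf node: answers queries from its own local database."""
    name: str
    records: List[dict]

    def query(self, predicate: Predicate) -> Dict[str, int]:
        # Only a count leaves the site; the records stay local.
        return {self.name: sum(1 for r in self.records if predicate(r))}


@dataclass
class Supernode:
    """Hypothetical parent node: forwards a query to each of its child nodes."""
    name: str
    children: List[Union[Peer, "Supernode"]] = field(default_factory=list)

    def query(self, predicate: Predicate) -> Dict[str, int]:
        counts: Dict[str, int] = {}
        for child in self.children:
            counts.update(child.query(predicate))
        return counts

    def withdraw(self, name: str) -> None:
        # A withdrawing site simply stops being queried; no data was copied here.
        self.children = [c for c in self.children if c.name != name]


# Illustrative records only.
site_a = Peer("site_a", [{"dx": "melanoma"}, {"dx": "glioma"}])
site_b = Peer("site_b", [{"dx": "melanoma"}])
site_c = Peer("site_c", [{"dx": "melanoma"}, {"dx": "melanoma"}])
regional = Supernode("regional", [site_b, site_c])
root = Supernode("root", [site_a, regional])


def is_melanoma(record: dict) -> bool:
    return record["dx"] == "melanoma"


before = root.query(is_melanoma)
regional.withdraw("site_c")
after = root.query(is_melanoma)
```

Because each peer holds its own records and returns only results, removing `site_c` leaves the rest of the tree queryable and leaves nothing of `site_c` behind.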

SPIN allows institutions to expose de-identified pathology reports while keeping corresponding reports containing Protected Health Information (PHI) disconnected from the Internet. A randomly generated unique identifier is assigned to both the PHI and de-identified reports in a locally controlled codebook. The machine storing the codebook is disconnected from the Internet and protected according to each participating site's policies. The resulting solution is flexible and compliant with HIPAA regulations.
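As a rough sketch of the codebook idea, the following assumes an in-memory mapping from a random identifier to the PHI report. The `Codebook` class, its method names, and the sample report text are hypothetical; in practice the codebook would live on the offline machine, and de-identification would be performed by dedicated tools rather than supplied by hand.

```python
import uuid
from typing import Dict, Tuple


class Codebook:
    """Hypothetical locally controlled codebook, kept on an offline machine."""

    def __init__(self) -> None:
        self._phi_by_id: Dict[str, str] = {}

    def register(self, phi_report: str, deidentified_report: str) -> Tuple[str, str]:
        # uuid4 is random, so the identifier carries no patient information.
        link_id = uuid.uuid4().hex
        self._phi_by_id[link_id] = phi_report
        # Only the identifier and the de-identified text are exposed to the network.
        return link_id, deidentified_report

    def lookup_phi(self, link_id: str) -> str:
        # Re-identification is possible only where the codebook is stored.
        return self._phi_by_id[link_id]


codebook = Codebook()
# Made-up sample text, for illustration only.
link_id, public_report = codebook.register(
    phi_report="Jane Doe, MRN 000000: invasive ductal carcinoma",
    deidentified_report="[NAME], MRN [ID]: invasive ductal carcinoma",
)
```

The identifier is the only link between the two copies of a report, so the exposed copy cannot be traced back to a patient without access to the offline codebook.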

SPIN provides three levels of increasing access commensurate with investigator credentials and IRB approvals. First, feasibility studies are conducted using a statistical-level query that returns only aggregated results. Second, individual de-identified cases are selected by investigators certified by one of the participating institutions. Third, investigators can request specimens and clinical data, and each request must be approved by the institution storing the requested data.
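The three access levels can be pictured as progressively richer query responses. The sketch below is an assumption-laden illustration: the `AccessLevel` names, the `run_query` function, the response fields, and the sample cases are hypothetical and do not reflect the actual SPIN interfaces.

```python
from enum import IntEnum
from typing import Callable, Dict, List


class AccessLevel(IntEnum):
    STATISTICAL = 1         # aggregated results only
    DEIDENTIFIED_CASES = 2  # individual de-identified cases
    SPECIMEN_REQUEST = 3    # requests that the holding institution must approve


def run_query(level: AccessLevel, cases: List[dict],
              predicate: Callable[[dict], bool]) -> Dict[str, object]:
    matches = [c for c in cases if predicate(c)]
    # Every level receives the aggregate count.
    result: Dict[str, object] = {"count": len(matches)}
    if level >= AccessLevel.DEIDENTIFIED_CASES:
        result["cases"] = matches
    if level >= AccessLevel.SPECIMEN_REQUEST:
        # Nothing is released here; each request waits for institutional approval.
        result["requests"] = [
            {"case_id": c["id"], "status": "pending institutional approval"}
            for c in matches
        ]
    return result


# Illustrative de-identified cases only.
cases = [
    {"id": "a1", "dx": "melanoma"},
    {"id": "b2", "dx": "glioma"},
    {"id": "c3", "dx": "melanoma"},
]


def is_melanoma(case: dict) -> bool:
    return case["dx"] == "melanoma"


level1 = run_query(AccessLevel.STATISTICAL, cases, is_melanoma)
level2 = run_query(AccessLevel.DEIDENTIFIED_CASES, cases, is_melanoma)
level3 = run_query(AccessLevel.SPECIMEN_REQUEST, cases, is_melanoma)
```

An investigator running a feasibility study would see only the count in `level1`; the case list and the pending requests appear only at the higher levels.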

Joining the Network

Our Extract, Transform, Load (ETL) toolkit is designed to minimize the programming and expertise needed to get new types of health data flowing in large quantities. Because health data standards have not been widely adopted, SPIN favors small, simplified schemas that can support large investigations with robust sample sizes. Under the control of each SPIN peer, the ETL toolkit also provides a set of anonymization and autocoding tools that map medical free text to standard vocabularies meaningful for research and public health investigations. This approach favors early adoption and the "low-hanging fruit": the first steps toward timely realization of a National Health Information Network. We have also published a set of decentralized IRB agreements that institutions, such as the research centers created by the NCI or CTSA, can use as a guide.

...