Whether you decide to perform de-identification internally or procure a service provider, it is important that the technology to be used fits your organization’s needs and goals, and has the capacity to support mature de-identification practices.
Organizations seeking to implement de-identification have often begun by purchasing software, only to encounter problems using it. Common difficulties are interpreting risk metrics, making decisions about data releases, and integrating de-identification with business processes. The purpose of our Implementing De-identification series has been to describe the assessment and planning steps that will enable you to procure and implement the de-identification technology that will best fit your organization’s needs and goals. Depending on your needs and existing capacity, it may be most appropriate to purchase software to be used by your staff, or to procure an external service provider to perform de-identification. Your earlier work to define service needs, use cases, and maturity goals will help you to discern the technological capabilities crucial to meeting your service goals and procure suitable software or services.
As procurement proceeds, you can continue to prepare for implementation by developing data mapping templates, workflow diagrams, and toolkits. These documents and templates will form a reference library that aids the adoption and effective use of de-identification technology.
Technical criteria for de-identification software
We suggest considering these four technical criteria before buying de-identification software:
- Does the de-identification tool work at the interface, network, or database layer? Interface tools, which are usually the most affordable, simply mask data on the end user’s computer screen. These are generally only appropriate when data remains within a secure network. Network tools, installed in a router, intercept traffic and de-identify data on the fly, before it leaves a secure network. These tools use reversible identifiers, meaning that users can edit either the original or de-identified data and changes will be translated to the other version of the database. Database tools, on the other hand, de-identify data before it leaves the source computer, making them the most secure option, though also the most expensive. These include both tools that are integrated within the original database, such as the Oracle Data Masking and Subsetting Pack, and those that run as a separate engine, creating a new, de-identified database version.
- Does the de-identification tool integrate into the software development life cycle or is it a separate program? Some tools are standalone programs that are configured to import raw data and export it into a copy database. De-identification libraries or development toolkits, on the other hand, provide software code that allows in-house developers to write de-identification functions into the original database. Standalone programs may be the easier solution for organizations that use de-identification infrequently. However, for those that need to de-identify data on a regular basis, integrating de-identification into the source database averts the difficulty of managing multiple copies of the database, as well as the hassle of managing updates and licenses for another new program.
- Does the tool provide an easily understood risk level? Statistical risk measurements tend to be poorly understood by managers and legal advisors who want clear guidelines about acceptable risk levels. Tools that use simple risk classifications can help to make the decision-making process easier. The goal, in all cases, should be to reduce privacy risk to a near-zero level.
- Is the tool plug and play? For users who need to be able to de-identify and release single files such as diagnostic images or biometric data on demand, there are some tools available that can rapidly de-identify data to be released on a USB device or CD.
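To make the idea of reversible identifiers from the first criterion more concrete, here is a minimal Python sketch. It assumes a simple in-memory lookup table; the names `PseudonymVault`, `pseudonymize`, `reidentify`, and `deidentify_record` are hypothetical and do not come from any particular product. Commercial tools will differ in how they store and protect the mapping.

```python
import secrets


class PseudonymVault:
    """Hypothetical in-memory lookup table for reversible identifiers."""

    def __init__(self):
        self._to_token = {}   # original value -> pseudonym
        self._to_value = {}   # pseudonym -> original value

    def pseudonymize(self, value: str) -> str:
        # Reuse an existing token so the same value always maps to the same pseudonym.
        if value not in self._to_token:
            token = "P-" + secrets.token_hex(8)
            while token in self._to_value:  # extremely unlikely collision; retry
                token = "P-" + secrets.token_hex(8)
            self._to_token[value] = token
            self._to_value[token] = value
        return self._to_token[value]

    def reidentify(self, token: str) -> str:
        # Reversal requires access to the vault, which is assumed to stay
        # inside the secure network.
        return self._to_value[token]


def deidentify_record(record: dict, direct_identifiers: set, vault: PseudonymVault) -> dict:
    """Replace direct identifiers with pseudonyms and leave other fields unchanged."""
    return {
        key: vault.pseudonymize(str(val)) if key in direct_identifiers else val
        for key, val in record.items()
    }


if __name__ == "__main__":
    vault = PseudonymVault()
    record = {"health_card": "1234-567-890", "diagnosis": "asthma"}
    masked = deidentify_record(record, {"health_card"}, vault)
    print(masked)                                   # pseudonym differs on every run
    print(vault.reidentify(masked["health_card"]))  # original value restored
```

Because the mapping is kept, an edit made against the de-identified copy can be traced back to the original record. The same property means the vault itself must be protected as carefully as the source data.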
Your choice of de-identification software should be governed by its capacity to implement mature de-identification practices: not only masking direct identifiers, but also providing easily understood risk metrics to guide the effective de-identification of indirect identifiers. Equally important, if de-identification is to be performed by your organization’s staff, the software must be easy to use and be perceived by staff as useful. A suitable tool or service will give the best return on investment by increasing the volume and quality of de-identified data available to clients.
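As an illustration of what an easily understood risk metric for indirect identifiers might look like, the Python sketch below groups records by their indirect identifiers, takes the smallest group size, and converts the resulting worst-case re-identification risk into a plain label. This is one common approach, not necessarily the one a given tool uses. The function names and the `low` and `high` thresholds are assumptions chosen for illustration, not regulatory guidance.

```python
from collections import Counter


def max_reidentification_risk(records, quasi_identifiers):
    """Return 1 / (size of the smallest group of records that share the
    same values for every indirect identifier)."""
    if not records:
        raise ValueError("no records to assess")
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return 1 / min(groups.values())


def classify_risk(risk, low=0.1, high=0.33):
    """Map a numeric risk to a simple label.
    The thresholds are illustrative assumptions only."""
    if risk <= low:
        return "low"
    if risk <= high:
        return "medium"
    return "high"


if __name__ == "__main__":
    records = (
        [{"age_band": "30-39", "region": "K1A"}] * 3
        + [{"age_band": "40-49", "region": "K1A"}]
    )
    risk = max_reidentification_risk(records, ["age_band", "region"])
    print(risk, classify_risk(risk))
```

In this sample, one record is unique on age band and region, so the worst-case risk is 1.0 and the label is "high". Generalizing or suppressing that record's indirect identifiers would be the next step before release.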
Implementing De-identification Series
Watch for our final post in the Implementing De-identification series, which will outline our approach to service rollout.