In this first article of our Implementing De-identification series, we discuss how to define use cases as a means of establishing who will receive de-identified data and what types of data they will be able to access. Use cases guide the development of a de-identification service that meets clients’ needs and evaluates risk in a systematic way.

De-identification is increasingly being used by public and private sector organizations as a means to use and share personal data holdings for research and analytical purposes without putting individual privacy at risk. As we have discussed, implementing de-identification effectively requires more than purchasing a software tool and training staff to use it. Business architecture defines the goals, scope, roles, and process for de-identification, ensuring that it is used with clear purposes and rules, integrates smoothly into the organization’s business processes, and is actually useful to clients.

The first step towards implementing de-identification services is to establish use cases: that is, who will potentially receive de-identified data, what types of data they need, and how data will be delivered. For example, a major bank’s data client types might include internal users, credit bureaux, investment and insurance companies, the Bank of Canada, the Ministry of Finance, and the public. The type of data these clients request will vary in terms of subject matter (e.g., income, debt, credit, investments) and level of granularity (ranging from aggregate statistics to individual-level bank records). Depending on clients’ needs, data may be accessed through an internal computer system, data feeds, online portals, or data collection tools. Establishing use cases will help you to determine the broad goals and structure of your de-identification service.

Three factors drive decisions to release data: the client’s needs, the client’s risk level, and the privacy risk of the data. Client risk is evaluated by assessing the quality of the client’s privacy and security practices, while data risk is the likelihood that individuals could be identified from the data disclosed.

De-identification use cases

Based on their needs and risk levels, clients may be allowed access to higher- or lower-risk data. Defining use cases based on client type, needs, and risks helps to establish business rules: an objective framework for deciding on clients’ levels of data access.

Let’s return to the example of a bank. Information on individual banking customers (high-risk data) would be available only to trusted, low-risk clients such as internal personal financial advisors and external credit bureaux. Aggregate-level information posing a medium privacy risk, such as average incomes and debt ratios by postal code, might be made available to investment planners within the bank, external investment and insurance companies, and financial reporting agencies. General statistics posing a very low privacy risk, such as regional average income and consumer debt, could be released to the public.
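To make the bank example concrete, here is a minimal sketch of how such business rules might be written down as a lookup table. Everything in it is illustrative: the tier names, the client and dataset identifiers, and the `may_access` function are hypothetical, not part of any particular de-identification tool. It also assumes that a client approved for a given risk tier may access any lower tier as well, and that unknown clients or datasets are denied by default pending a manual review.

```python
from enum import IntEnum


class Risk(IntEnum):
    """Privacy risk tiers from the bank example (names are illustrative)."""
    VERY_LOW = 1
    MEDIUM = 2
    HIGH = 3


# Hypothetical data products and the privacy risk each one carries.
DATA_RISK = {
    "individual_customer_records": Risk.HIGH,
    "aggregates_by_postal_code": Risk.MEDIUM,
    "regional_statistics": Risk.VERY_LOW,
}

# Highest data risk tier each client type is pre-approved to receive.
# Trusted, low-risk clients get the highest ceiling.
MAX_DATA_RISK = {
    "internal_financial_advisor": Risk.HIGH,
    "credit_bureau": Risk.HIGH,
    "internal_investment_planner": Risk.MEDIUM,
    "investment_or_insurance_company": Risk.MEDIUM,
    "financial_reporting_agency": Risk.MEDIUM,
    "public": Risk.VERY_LOW,
}


def may_access(client_type: str, dataset: str) -> bool:
    """Return True if the client type is pre-approved for the dataset's tier."""
    ceiling = MAX_DATA_RISK.get(client_type)
    risk = DATA_RISK.get(dataset)
    if ceiling is None or risk is None:
        # Assumption: anything outside the defined use cases is denied
        # here and would go to a case-by-case risk analysis instead.
        return False
    return risk <= ceiling


if __name__ == "__main__":
    for client in ("credit_bureau", "financial_reporting_agency", "public"):
        allowed = [d for d in DATA_RISK if may_access(client, d)]
        print(client, "->", allowed)
```

In a sketch like this, the table encodes the decisions made when the use cases were defined, so a routine request reduces to a lookup rather than a fresh assessment.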

Implementing business rules such as these not only contributes to effective risk management; it also dramatically streamlines the data release process. Use cases and associated business rules enable organizations to establish in advance what types of data each client will be able to access, making it unnecessary to conduct a risk analysis of each data request.
