9 Reasons Smart Data Scientists Don’t Touch Personal Data

The massive volume of data produced by the ongoing ‘Big Data’ revolution has transformed data analysis. The availability of analysis tools and falling storage costs, together with a drive by business to combine these datasets with purchased and publicly available data, promise new insight and a way to monetize this new resource. This has led to an unprecedented amount of data about the personal attributes of individuals being collected, stored, and lost. This data is valuable for analysis of large populations, but it carries a considerable number of drawbacks that data scientists and developers need to consider in order to use it ethically.

Here are just a few considerations to take into account before ripping open the predictive toolsets from your cloud provider: 

1. Contextual Integrity

Data is collected in different contexts, each with its own reasons and permissions for capture. Ensure that the data you capture is valid for its context and cannot be misused for other purposes. Mixing public and personal data can have unintended side effects. One example is disclosing location data to other parties without consent: there are numerous examples of stalkers using applications to track others.
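One way to make the collection context explicit in code is to tag each value with the purposes it was captured for and check that tag before every use. The following is only a minimal sketch under stated assumptions: the `Record` class, the `use_for` function, and the purpose names are hypothetical illustrations, not part of any particular library or of the original article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A value tagged with the purposes it was collected for (hypothetical)."""
    value: str
    allowed_purposes: frozenset


def use_for(record: Record, purpose: str) -> str:
    """Return the value only if the purpose matches the collection context."""
    if purpose not in record.allowed_purposes:
        raise PermissionError(
            f"'{purpose}' is outside the context this data was collected for"
        )
    return record.value


# Location captured for navigation only (illustrative purpose names).
location = Record(value="51.5072,-0.1276",
                  allowed_purposes=frozenset({"navigation"}))

print(use_for(location, "navigation"))

try:
    use_for(location, "share_with_third_party")
except PermissionError as err:
    print(f"Blocked: {err}")
```

A check like this does not replace consent management or legal review, but it forces every new use of the data to name its purpose, which makes out-of-context reuse visible rather than silent.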

2. History Aggregation

History is an important part of many efforts to defining …

Read More on Datafloq
