Even with its pitfalls, big data can benefit the economy without imperilling vulnerable groups

What you need to know:

  • A few weeks ago, stakeholders gave their views on the upcoming Data Protection Bill that will be headed to Parliament for debate.
  • In particular, the principles of fairness, transparency, purpose limitation and data minimisation seem to go against the normal practices that big data companies take for granted.
  • What may be viewed as “unnecessary” data can enable data analytics companies to discover useful patterns between different datasets that were previously hidden.
  • Data collectors or controllers have the power to make positive or negative impacts on society based on their data analytics capabilities.
  • Big data analytics has obvious challenges and blind corners, which is why there must be frameworks in place to ensure that the economy benefits from it while not discriminating against segments of the population.

A few weeks ago, stakeholders gave their views on the upcoming Data Protection Bill that will be headed to Parliament for debate. There was obvious tension between some of the data protection principles and the big data practices that are commonly employed by private enterprises.

In particular, the principles of fairness, transparency, purpose limitation and data minimisation seem to go against the normal practices that big data companies take for granted.

Fairness and transparency require that those collecting personal data seek consent from the data subjects or citizens while educating them on how they would use their data.

Purpose limitation ensures that data is processed only for the purpose that was defined before the data was collected, while data minimisation limits the scope of what is collected to that defined purpose.

In other words, if you are registering at a hospital as a patient, the hospital has no business asking for your tribe, county or your academic qualifications since such data may not add value for the purpose at hand – getting medical care.
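As a rough sketch of how purpose limitation and data minimisation might be enforced at collection time, the snippet below keeps only the fields permitted for a declared purpose; the purposes, field names and sample record are hypothetical illustrations, not part of the Bill.

```python
# Minimal sketch of purpose limitation and data minimisation at collection time.
# The purposes, field names and sample record below are hypothetical.

ALLOWED_FIELDS = {
    "medical_care": {"name", "date_of_birth", "insurance_number", "symptoms"},
    "billing": {"name", "insurance_number", "payment_method"},
}

def minimise(record: dict, purpose: str) -> dict:
    """Keep only the fields permitted for the declared purpose; drop the rest."""
    allowed = ALLOWED_FIELDS[purpose]
    return {field: value for field, value in record.items() if field in allowed}

submitted = {
    "name": "Jane Doe",
    "date_of_birth": "1990-01-01",
    "symptoms": "fever",
    "tribe": "not needed for treatment",
    "county": "not needed for treatment",
    "academic_qualifications": "not needed for treatment",
}

print(minimise(submitted, purpose="medical_care"))
# Only name, date_of_birth and symptoms survive; the extra fields are dropped.
```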

USEFUL PATTERNS

However, from a big-data perspective, it is such “unnecessary” data that enables data analytics companies to discover useful patterns between different datasets that were previously hidden.

Big data is often described in terms of its four dimensions, the four V’s – volume, variety, velocity and veracity.

Data volume implies that the more data collected, the better, and preferably from a variety of sources, including social media or CCTV cameras. Velocity means that data is always changing or may arrive in real time, while veracity implies that some of the data may be unreliable or inaccurate.

With all these factors in mind, data companies would collect and curate all these data points about many data subjects and draw valuable insights that could be used for good or for ill.

In the hypothetical example of patient data, tribal or county-related data about a group of patients may be mined to reveal insights such as a particular population segment being more prone to malaria, cancer, fluorosis or HIV.
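As a toy illustration of that kind of mining, the aggregation below groups invented patient records by county to surface a diagnosis pattern that would be invisible without the "unnecessary" demographic field; every record in it is fabricated for the example.

```python
# Toy illustration of how an "unnecessary" demographic field enables pattern
# discovery; all records below are invented for the example.
from collections import defaultdict

patients = [
    {"county": "A", "diagnosis": "malaria"},
    {"county": "A", "diagnosis": "malaria"},
    {"county": "A", "diagnosis": "fluorosis"},
    {"county": "B", "diagnosis": "malaria"},
    {"county": "B", "diagnosis": "hypertension"},
    {"county": "B", "diagnosis": "hypertension"},
]

counts = defaultdict(lambda: defaultdict(int))
for p in patients:
    counts[p["county"]][p["diagnosis"]] += 1

for county, by_diagnosis in counts.items():
    total = sum(by_diagnosis.values())
    for diagnosis, n in by_diagnosis.items():
        print(f"County {county}: {diagnosis} in {n}/{total} patients")
```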

POTENTIAL FOR ABUSE

Such insights would obviously lead to better targeting of medical interventions for each of the vulnerable population segments.

However, this very insight could be abused by, for example, ensuring that the vulnerable population is deliberately further exposed to conditions that would worsen their ailment.

In other words, data collectors or controllers have the power to make positive or negative impacts on society based on their data analytics capabilities.

Sometimes the negative impacts are not deliberate but accidental, since the analytics algorithms tend to reinforce inherent biases that already exist in society.

DISCRIMINATORY POLICIES

For example, employee recruitment algorithms may prefer male candidates for CEO roles based on big-data sources suggesting that a majority of successful CEOs are male.

The algorithm would not be sensitive to the fact that, whereas that data may be true, it reflects not any inherent male superiority, but rather centuries of discriminatory policies against women, who entered the formal workforce only seventy or so years ago.
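A minimal sketch of that mechanism: a naive scorer that ranks candidates by their group's historical "success rate" simply inherits whatever imbalance produced that history. The records below are fabricated purely to illustrate the point.

```python
# Sketch of how a naive model reproduces historical bias; the "historical"
# records below are fabricated solely to illustrate the mechanism.

historical_hires = [
    {"gender": "male", "successful": True},
    {"gender": "male", "successful": True},
    {"gender": "male", "successful": True},
    {"gender": "female", "successful": True},
    {"gender": "male", "successful": False},
    {"gender": "female", "successful": False},
]

def success_rate(gender: str) -> float:
    """Fraction of past hires of this gender labelled 'successful'."""
    group = [h for h in historical_hires if h["gender"] == gender]
    return sum(h["successful"] for h in group) / len(group)

# Ranking candidates by their group's historical success rate penalises the
# group that was under-represented in the past, regardless of individual merit.
for candidate in ({"name": "A", "gender": "female"}, {"name": "B", "gender": "male"}):
    print(candidate["name"], round(success_rate(candidate["gender"]), 2))
```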

In one country, women were finally allowed to drive just a year ago. Data on female drivers would be very limited in such populations to the extent that the algorithms would conclude that women are not proficient or good candidates to drive vehicles.

These are some of the challenges and blind corners of big data analytics. It is for these reasons that there must be frameworks in place to ensure that the economy does benefit from big data while not discriminating against segments of the population.

It is indeed possible and desirable to have both big data analytics and data protection principles working hand in hand.

Mr Walubengo is a lecturer at Multimedia University of Kenya, Faculty of Computing and IT. Email: [email protected], Twitter: @Jwalu