Creating value with open data, without compromising anonymity

Aleksandra Kovacevic, Senior Engineering Manager at Here Technologies, reflects on the need for careful consideration when encouraging the sharing of data for improved mobility services.

When it comes to data privacy, location data is among the most complex categories. The potential physical safety implications of a privacy breach, together with the ability of devices to track their users constantly, bring particular sensitivities.

Vehicles, mobile operators and smartphone apps are the most likely to track users’ location, which could make some feel uncomfortable. Consumers should take comfort, however, from knowing that most companies rely on location data only to create safer and smarter services. It is in their interest, therefore, to promote responsible data capture. That incentive has only grown with the arrival of privacy laws around the world, including Europe’s GDPR.

However, to protect the privacy of location data effectively, it is not enough simply to comply with GDPR. If the services of the future are to bring together different stakeholders and draw on information from multiple data sources, it is vital that companies open their data and make it available for collaboration.

Depersonalisation does not guarantee anonymity

In the area of location data privacy, users of applications and services may think that deleting personal information preserves anonymity. Depersonalisation, however, does not constitute anonymisation, as it is still possible to trace the data back to the user. When a user travels, their device does not generate isolated data points. Travelling from one place to another produces a whole sequence of locations and timestamps that come together to chart a path on a map. This sequence, called a trajectory, can be particularly revealing – it is what makes this category of confidential data more complex to manage than others.

Although a company can delete all personal information from its data points at any time, anyone – including third parties – can still combine these trajectories with other publicly available data and use the combination to identify people. For example, one Australian student was able to locate military bases in the Middle East using anonymised data from a fitness app. In fact, MIT researchers have long known that it is possible to identify individuals using as few as four location data points.
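The effect described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the pseudonyms, the grid cells ("A", "B", ...), the hourly timestamps and the helper names `matching_users` and `min_points_to_single_out` are hypothetical and do not come from any real dataset or product. It only shows that, even with names removed, a couple of known place-and-time points can be enough to single out one trajectory.

```python
from itertools import combinations

# Hypothetical depersonalised trajectories: pseudonym -> set of (cell, hour) points.
trajectories = {
    "u1": {("A", 8), ("B", 9), ("C", 18), ("D", 19)},
    "u2": {("A", 8), ("B", 9), ("E", 18), ("F", 19)},
    "u3": {("A", 8), ("G", 9), ("C", 18), ("D", 19)},
}

def matching_users(known_points, data):
    """Return the pseudonyms whose trajectory contains every known point."""
    known = set(known_points)
    return sorted(user for user, points in data.items() if known <= points)

def min_points_to_single_out(user, data):
    """Smallest number of the user's own points that match nobody else, or None."""
    points = sorted(data[user])
    for k in range(1, len(points) + 1):
        for subset in combinations(points, k):
            if matching_users(subset, data) == [user]:
                return k
    return None

# One shared point (a common morning location) matches everyone...
print(matching_users([("A", 8)], trajectories))
# ...but two points observed from outside the dataset isolate a single trajectory.
print(matching_users([("B", 9), ("C", 18)], trajectories))
print(min_points_to_single_out("u1", trajectories))
```

In this toy data no single point of "u1" is unique, yet a pair of points is. Real datasets are far larger, but the same matching logic applies once an outsider knows a few places and times from another source.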

The nuance between privacy and security

In such cases of invasion of privacy, anonymised location data is re-identified using publicly available data. In the case of the Australian student, no security breach was officially found; in other words, no illicitly obtained key or password was needed to access confidential information.

Security protection ensures that data does not reach unintended recipients. Privacy issues, on the other hand, arise when companies open certain information to their data users and that information is then exploited for malicious purposes. Well-meaning developers and researchers can rely on open data to design smarter solutions; yet someone with the wrong intent can cross-reference the same data with external information to reveal details that were never meant to be exposed.

Creating value while protecting privacy

Any company that provides third parties with consumer data is likely to inadvertently transmit information that identifies the people the data describes. This is not a problem for companies that keep their data in-house to improve their own services. However, companies that provide open data to foster innovation must take a thoughtful approach before disclosing information.

Businesses that rely on and share open data must operate more intelligently to defend users’ privacy. Research teams whose goal is to help develop better services and processes need the trust of the people providing the data on which they rely. Maintaining this trust is crucial. While companies must prioritise protecting users’ privacy, it is also essential for their business to retain enough value in the data to improve their services and innovate.

Understanding the destination of the data is part of the solution

There is no single, one-size-fits-all anonymisation solution that protects privacy and preserves data value at the same time. Companies working with open data must first identify how the data will be used. Specifying the use cases also makes clear how far the data can be anonymised while maintaining a high quality of service.

For example, a company that evaluates traffic at a given location or on a given route can determine how the data will be used and which data matters. If there is no traffic jam, redundant vehicle speed updates are not needed. Similarly, when traffic slows, not every vehicle stuck in it needs to report the same situation. In fact, there is no need to publish information about individual vehicles at all. The company can simply provide information once a traffic jam threshold is reached and indicate the number of vehicles above that threshold.
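The traffic example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `jam_summary`, the report format, and the `slow_kmh` and `min_vehicles` values are hypothetical, and here the threshold is interpreted as a minimum number of slow-moving vehicles on a road segment. The point is that only an aggregate count per congested segment is published, never an individual vehicle's record.

```python
from collections import Counter

def jam_summary(reports, slow_kmh=20.0, min_vehicles=5):
    """Map each congested segment to its count of slow vehicles.

    reports: iterable of (vehicle_id, segment_id, speed_kmh) tuples.
    slow_kmh and min_vehicles are illustrative thresholds, not industry values.
    Vehicle ids are never copied into the output.
    """
    slow_counts = Counter(
        segment for _vehicle, segment, speed in reports if speed < slow_kmh
    )
    # Segments below the threshold are dropped entirely, so free-flowing
    # roads and isolated slow vehicles produce no published data.
    return {segment: n for segment, n in slow_counts.items() if n >= min_vehicles}

# Six slow vehicles on segment S1; one slow and one fast vehicle on S2.
reports = [(f"v{i}", "S1", 10.0) for i in range(6)]
reports += [("w1", "S2", 12.0), ("w2", "S2", 90.0)]
print(jam_summary(reports))
```

With these inputs only segment S1 is reported, together with its vehicle count. Raising `min_vehicles` trades detail for privacy, which is the kind of use-case-driven choice the article describes.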

By tailoring the data to its intended use, companies can target and limit the information processed, all without revealing too much.

No company or institution can claim to have the perfect privacy solution for location data. What they can do, however, is act responsibly by assessing the risk of invasion of privacy on the one hand, and the value of data on the other. Only in this way will they be able to adapt anonymisation solutions to create a win-win situation that combines confidentiality and value for services.

TrafficInfraTech Magazine