Privacy Preserving Protocols to Trace Covid19 Exposure
Last Updated: 2020-04-29 12:40:49 UTC
by Johannes Ullrich (Version: 1)
In recent weeks, you have probably heard a lot about the "Covid19 Tracing Apps" that Google, Apple, and others are developing. News reports usually mention the privacy aspects of such an app but, of course, do not cover the protocols in enough depth to explain how the privacy challenges are being solved.
The essential function of such an application is to alert you if you recently came in contact with a Covid19 infected person. It is the goal of the application to alert users who are asymptomatic so they can get tested and self-isolate to prevent spreading the virus.
There are a few problems with this generic definition of the function of such an application:
- What does "recent" mean: Usually this refers to 14 days, which most reports give as the incubation period.
- "contact" is usually defined as being within about 2 meters (6 feet) of an infected individual. Some applications also require that the contact lasted longer than a few seconds.
- The infected person may not realize that they are infected until they are tested and learn the result. The data needs to be stored until that determination is made (again, for about 14 days).
But the application does not need to know where you are or where you have been. Geolocation is not required to fulfill this function.
A key privacy feature implemented by these applications is that the application broadcasts a random, rotating identifier. This ID rotation prevents the most obvious threat of tracking a user through a unique identifier (as has been done with MAC addresses). One of the critical parameters of these protocols is how often the identifier rotates: some protocols send a new identifier with each "ping", while others keep the same ID for minutes (or even a day, which is probably too long).
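As a rough illustration of time-based rotation, here is a minimal Python sketch. The 10-minute interval, the 16-byte ID length, and the class name RotatingBroadcaster are assumptions made for the example and are not taken from any of the protocols.

```python
import secrets

ROTATION_SECONDS = 10 * 60  # assumed rotation interval (10 minutes)


class RotatingBroadcaster:
    """Hypothetical helper: hands out a fresh random ID per time slot."""

    def __init__(self, rotation_seconds=ROTATION_SECONDS):
        self.rotation_seconds = rotation_seconds
        self._slot = None
        self._current_id = None
        self.sent_ids = []  # (slot, id) pairs kept locally on the device

    def current_id(self, now):
        """Return the ID to broadcast at time `now` (seconds)."""
        slot = int(now // self.rotation_seconds)
        if slot != self._slot:
            # A new time slot has started: pick a new, unlinkable random ID.
            self._slot = slot
            self._current_id = secrets.token_bytes(16)
            self.sent_ids.append((slot, self._current_id))
        return self._current_id
```

Within one slot every ping carries the same ID; once the slot changes, an observer has no way to link the new ID to the old one.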
PACT Tracing Protocol Schematic (https://arxiv.org/pdf/2004.03544.pdf)
At the same time, the application receives "pings" sent by others and records them. Initially, there is no need to store these pings centrally. Only the receiving device stores the IDs it received.
Once a user is identified as positive, things become a bit interesting. They upload the IDs they recently sent (and in some protocols also the pings they received) to a database. Some standards assume a single central database. Others suggest a more decentralized data store. Instead of uploading each ID sent, some protocols derive the IDs from a seed, so only the seed needs to be uploaded, significantly reducing the amount of data being sent and centrally stored.
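The seed idea can be sketched in a few lines of Python. HMAC-SHA256 truncated to 16 bytes is an assumption chosen for illustration; each protocol defines its own derivation function, and derive_ids is a hypothetical name.

```python
import hashlib
import hmac


def derive_ids(seed, count):
    """Derive `count` broadcast IDs from one secret seed (illustrative only)."""
    ids = []
    for counter in range(count):
        # Each ID depends on the seed and a counter, so anyone who later
        # learns the seed can recompute the full list of IDs.
        message = b"ID" + counter.to_bytes(4, "big")
        ids.append(hmac.new(seed, message, hashlib.sha256).digest()[:16])
    return ids
```

Without the seed the IDs look unrelated to each other. After the seed is uploaded, any device can regenerate them locally, so the upload is one short value rather than thousands of IDs.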
In addition, malicious uploads need to be prevented. They could be used to overwhelm the database or to cause false positives. The authentication schemes vary between the protocols, but typically the infected user has to provide some form of authentication code from a healthcare provider. The user should be able to exclude some data from the upload (e.g., based on the time the event happened).
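One possible shape for these two controls, sketched in Python under strong simplifying assumptions: the healthcare provider issues single-use codes, and the user removes a time window before uploading. UploadGate and exclude_window are hypothetical names, and none of the protocols is claimed to work exactly this way.

```python
import secrets


class UploadGate:
    """Hypothetical server-side check: one upload per issued code."""

    def __init__(self):
        self._unused_codes = set()

    def issue_code(self):
        """Called by the healthcare provider after a positive test."""
        code = secrets.token_hex(8)
        self._unused_codes.add(code)
        return code

    def accept_upload(self, code, records):
        """Accept the records only if the code is valid and unused."""
        if code not in self._unused_codes:
            return False
        self._unused_codes.remove(code)  # the code cannot be replayed
        return True


def exclude_window(records, start, end):
    """Drop (timestamp, value) records with start <= timestamp < end."""
    return [(t, v) for t, v in records if not (start <= t < end)]
```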
Your device downloads the entire database of reported IDs (or seed keys) to check if you have come into contact with an infected individual. This is a lot easier if only seeds are uploaded (one record for each infected person) instead of having to download all individual IDs sent by infected users (about 2000/user for two weeks if the ID changes every 10 minutes). The protocol proposed by Apple and Google suggests the use of "Temporary Exposure Keys" that rotate daily. These keys are used to derive a "Rolling Proximity Identifier" which is rotated every few minutes.
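To make the numbers concrete: with a 10-minute rotation there are 144 IDs per day, so 14 days come to 14 × 144 = 2016 IDs per user, compared with 14 daily keys. The Python sketch below shows the on-device check under the assumption that HMAC-SHA256 stands in for the key derivation defined in the Apple/Google specification; rolling_ids and exposed are hypothetical names.

```python
import hashlib
import hmac

INTERVALS_PER_DAY = 24 * 6  # assumed 10-minute rotation
DAYS = 14


def rolling_ids(daily_key):
    """Recompute every ID a daily key would have produced (illustrative)."""
    return {
        hmac.new(daily_key, interval.to_bytes(4, "big"), hashlib.sha256).digest()[:16]
        for interval in range(INTERVALS_PER_DAY)
    }


def exposed(downloaded_keys, observed_ids):
    """True if any ID recorded on this device derives from a reported key."""
    return any(rolling_ids(key) & observed_ids for key in downloaded_keys)
```

The comparison happens entirely on the receiving device, so the server never learns which IDs a user has observed.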
This protocol should solve most of the privacy issues that arise from such an application. It should not allow a third party to identify individuals, and users will not know which of their contacts was positive (unless they only had contact with one person).
To assist with the acceptance of the application, users will have control over when the application is active, and what data is uploaded to any data repository.
Of course, in the past, it has been shown that very large anonymized datasets can be used to track individuals. Probably the best protection, aside from robust cryptographic implementations, is the deletion of data as soon as it is no longer relevant for tracking SARS-CoV-2 infections. The user interface of the application needs to be carefully designed to allow the user to make sensible choices as to what data to record and upload to the central database.
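Deleting stale data can be as simple as a periodic pruning pass on the device. The following Python sketch assumes records are (timestamp, id) pairs and uses the 14-day window mentioned earlier; prune is a hypothetical name.

```python
RETENTION_SECONDS = 14 * 24 * 3600  # assumed 14-day retention window


def prune(records, now):
    """Keep only (timestamp, id) records no older than the retention window."""
    return [(t, i) for t, i in records if now - t <= RETENTION_SECONDS]
```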
Ideally, the application would only share the sent IDs (or seeds to derive them) with the central database. But some applications found it useful to report IDs received, Bluetooth signal strength, and the phone model. Proximity tracing with Bluetooth is tricky. Different phone models use Bluetooth chipsets and antenna configurations with different sensitivity and signal strength. Just measuring the absolute signal strength received is a poor indicator of distance. The Apple/Google standard suggests including some encrypted metadata with each ping. The keys used to encrypt the metadata are derived from the same temporary exposure key as the IDs broadcast by the phone. The metadata can only be decrypted after the user has uploaded these temporary exposure keys.
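Why the raw signal strength is a poor indicator can be seen with the common log-distance path loss model. The Python sketch below is a textbook approximation, not something prescribed by any of the protocols, and the dBm values in the usage note are made-up calibration figures.

```python
def estimate_distance_m(rssi_dbm, tx_power_at_1m_dbm, path_loss_exponent=2.0):
    """Estimate distance in meters from a received signal strength.

    tx_power_at_1m_dbm is the signal strength expected at 1 meter, which
    differs between phone models; path_loss_exponent is about 2 in free
    space and larger indoors.
    """
    return 10 ** ((tx_power_at_1m_dbm - rssi_dbm) / (10 * path_loss_exponent))
```

For the same reading of -65 dBm, assumed calibration values of -65, -59, and -53 dBm give estimates of roughly 1, 2, and 4 meters. That is why the receiver needs to know something about the sending device before the signal strength means anything.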
This is a classic example of how one has to weigh privacy against the value of the information received. What makes this more complicated is that a less privacy-sensitive application may collect more valuable data, but may also find fewer volunteer users. The application is only useful if there are many users (some suggest at least 60% of the population needs to use the application). And of course, the application needs to be released "now", leaving little time for an extensive review period.
A quick summary of the proposed protocols:
Apple/Google Contact Tracing
DP3T (Decentralized Privacy-Preserving Proximity Tracing)
PEPP-PT (Pan European Privacy Preserving Proximity Tracing)
PACT (Privacy-Sensitive Protocols And Mechanisms for Mobile Contact Tracing)
Johannes B. Ullrich, Ph.D. , Dean of Research, SANS Technology Institute
Considering that people /known/ to the person who was contagious over the last n days can often be traced /without/ an app, such an app is probably most effective for "reaching" people /unknown/ to the contagious person. If so, such an app is likely to be most effective in places where people, unknown to each other, meet for longer periods of time (trains, buses, pubs, restaurants, barber shops, ...) - locations that will become busier once lockdown measures are relaxed or lifted.
Taking into account that Bluetooth-based protocols may insufficiently relate to Corona transmission, impose security and privacy risks, and do not provide context information (/where/ did the infection take place), I proposed an alternative (in Dutch) in https://security.nl/posting/653782 .
In essence, at locations where many different people meet, unique QR-code stickers are attached to walls. One QR-code sticker may be attached at the entrance of a restaurant, while a unique QR-code sticker may be positioned at each row of seats in a train. In the latter case the QR-code could consist of a GUID representing the carriage number, a sub-location code (row of chairs), and an indicator of the type of location (train carriage).
The app has 2 purposes:
1) Diary functionality, used to remember when you were in specific high-risk places;
2) Contact a central server (somewhat anonymizing requests) to determine afterwards whether someone was contagious when you were near that person.
Like the Bluetooth-based apps, once a person is known to have been contagious, collected information is somewhat anonymized and uploaded to the server.
Of course getting used to the idea, generating and attaching QR-code stickers and maintaining their integrity may be problematic, while a sufficient number of people will have to be prepared to scan QR-codes in such places. On the other hand: an app that "read-only" acquires QR-codes instead of exchanging information with random smartphones, may be trusted by more people, in particular if such an app is open source.
If anyone would like me to, I'll translate my ideas to English in more detail.
Erik van Straten (evs20200403x (at) xs4all.nl)
Apr 29th 2020
According to Bluetooth inventor Jaap Haartsen, Bluetooth is not accurate enough for contact research into corona. He emphasizes that the range of the signal, which varies between one and twenty meters in the current generation of wireless technology, does not provide sufficient certainty about distance. According to Haartsen, this leads to unreliable results. Moreover, he is very critical of the way in which the government is tackling the development of the app.
Apr 30th 2020
So, Didier is the best candidate in this case. He is from the Dutch side of Belgium, me being from the French side...
Apr 30th 2020
Seems I can't even keep up with English....
May 1st 2020