Is Self-Sovereign Identity the Answer to GDPR Compliance?

April 20, 2020

By Arjun Govind

This piece is the third installment of a series on digital identity called “Please Allow Me to Introduce Myself: The Past, Present, and Future of Digital Identity.” This series aims to explore the evolution of digital identity, the state of self-sovereign identity today, and its use cases.

Check out previous installments:
Part 1: The Evolution of Digital Identity
Part 2: Self-Sovereign Identity: Under the Hood


Photo by Thomas Kelley on Unsplash

Hopefully, following my last piece, the theoretical advantages of a self-sovereign identity solution are clear. From easy verification to managing revoked credentials, advantages abound, at least on paper. However, that in and of itself is likely inadequate to drive wide-scale adoption. The quintessential example of theoretical benefits being insufficient for widespread adoption is PGP: the benefits were there, but the user experience was far from stellar. The user experience was in fact so problematic that academic papers were written to explain why the average person (or “Johnny”, as one paper named him) was simply unable to use it.

In this installment, I take a first look at what I believe is the most important catalyst of adoption: Big Brother. While industry-specific regulations like Anti-Money Laundering (AML) requirements and the Second Payment Services Directive (PSD2) are topics for subsequent posts, here I want to focus on personal data and consent management, which is the main goal of the EU’s General Data Protection Regulation, or GDPR.

GDPR demands nothing short of a sea change in how organizations store and manage personal data. However, in this article, I want to focus on arguably the most significant clause of GDPR as it relates to identity management: Article 17, or “the right to be forgotten”.

What Is The Right To Be Forgotten?

The right to be forgotten, also called the (distinctly less dramatic) “right to erasure”, allows users to request that the company “forgets” them, or removes every instance of their data from their system. This right to be forgotten is very much in line with the tenets and premise of self-sovereign identity. Thinking back to Christopher Allen’s 10 Principles of SSI I cited in the first installment, the right to be forgotten is a manifestation of the control one should have over the use of their identity.

If my personal data is truly mine, then I should be able to decide not only who can use it, but also when someone can use it.

The law states that “The data subject shall have the right to obtain from the controller the erasure of personal data concerning him or her without undue delay”, essentially asking the holder of the data to “forget” them. However, if left unqualified, such a policy could have disastrous implications. Consider, for instance, a facial recognition algorithm that was trained using Person A’s face in the training set. If Person A then put in a deletion request, would the entire algorithm have to be re-trained?

The short answer is no.

As important as the right to erasure is, it’s not an absolute right. Article 17 lays out a series of conditions, or cases when the right to be forgotten can be exercised.

The right to be forgotten can only be exercised if one of the grounds listed in Article 17(1) applies. These include:

  • The personal data is no longer necessary for the purpose it was originally collected or processed for
  • The data was processed on the basis of a legitimate interest, the data subject objects, and there are no overriding legitimate grounds to continue processing
  • The personal data had been processed unlawfully
  • The personal data needs to be deleted to comply with a legal obligation

As such, our example would fail the first test, as that data would be necessary for the purpose it was collected for.

What’s The Problem Here?

So why is the right to be forgotten proving to be one of the hardest problems to tackle when it comes to GDPR compliance? The challenge it presents is threefold: discerning a valid deletion request, identifying what constitutes “personal data”, and actually deleting the data.

The first problem certainly doesn’t lack for irony. In the SSI world, this would just be a matter of supplying a verifiable presentation, and all would be well. In the username-password world, however, something as simple as a credential stuffing attack (essentially, trying out usernames and passwords stolen from a data breach) could have disastrous implications. To demonstrate, consider a disgruntled employee acting against their employer by sending a deletion request to Google in the employer’s name. Imagine the devastation: Gmail, GCal, all vanishing without a trace! The risk is compounded by the fact that this employee may know, or obtain through social engineering, the answers to security questions.

Since these deletion requests are impossible to undo (that’s the point, right?!), it’s imperative that companies come up with robust approaches to ensuring that every request they receive is indeed valid.

However, even after moving past this problem, companies face the problem of figuring out what information should be removed. For instance, if a social media company gets a deletion request from a user who is tagged in photos posted by their friends, what information, if any, should be removed? I’m not a lawyer, so I can’t give a definitive answer; what constitutes personal data will likely be guided by precedent that develops over time. Companies will have to spend a lot of time and money on attorney fees to figure out that definition.

So say we know that the deletion request is valid and we know what data to delete. We’re all clear, right? Just go and hit delete, right? Well, not quite. In fact, this may be the hardest challenge of them all: finding that data. And it’s a little bit more involved than just Ctrl+F. The reality is that information can be stored across several databases, perhaps across several divisions, offices and maybe even countries.

On a technical level, this could happen due to mirroring, a process you may have encountered while downloading software. If I’m based in Philly, it’ll be far quicker for me to access data held in a server in New York or Virginia as compared to one in Shanghai or Mumbai. On the flip side, if I’m in Bangalore, it makes much more sense for me to just use a server somewhere else in India. Redundant as this may seem, it’s very common in practice, especially in applications where the microsecond differences in accessing data matter.
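To make the mirroring problem concrete, here is a minimal Python sketch of what "delete it everywhere" involves. Everything in it is an assumption for illustration: RegionalStore, erase_everywhere and remaining_copies are hypothetical names, and the in-memory dictionaries stand in for real regional databases.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class RegionalStore:
    """Hypothetical stand-in for one regional mirror of a user database."""
    region: str
    records: dict[str, dict] = field(default_factory=dict)

    def erase(self, user_id: str) -> bool:
        """Delete the user's record; return True if something was removed."""
        return self.records.pop(user_id, None) is not None


def erase_everywhere(user_id: str, stores: list[RegionalStore]) -> dict[str, bool]:
    """Fan the erasure out to every mirror and report which ones held data."""
    return {store.region: store.erase(user_id) for store in stores}


def remaining_copies(user_id: str, stores: list[RegionalStore]) -> list[str]:
    """List the regions that still hold data for the user."""
    return [store.region for store in stores if user_id in store.records]


# The same user is mirrored in two regions; a third region never saw them.
stores = [
    RegionalStore("us-east", {"alice": {"email": "alice@example.com"}}),
    RegionalStore("ap-south", {"alice": {"email": "alice@example.com"}}),
    RegionalStore("eu-west", {"bob": {"email": "bob@example.com"}}),
]
report = erase_everywhere("alice", stores)
```

The sketch only works because every mirror is listed in one place. In practice, the hard part is building and maintaining that list across divisions, offices and countries, and a per-region report like this one is the kind of audit trail a company would want to keep.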

So there you have it. We first need to figure out if a deletion request is valid. Once we do so, we need to reach out to our lawyers and figure out what data needs to be deleted. And once we do that, we need to trace down every last bit of that data across all of our databases and delete it.

And that’s just one fraction of GDPR compliance for you.

Clearly, companies will incur substantial costs in reorganizing data to make it easier to handle these requests, and that’s where I believe an opportunity lies. Instead of risking GDPR’s hefty fines and fees, companies could use this opportunity to invest in an alternative, GDPR-compliant, self-sovereign approach to identity.

But How Does SSI Fix This?

So how does self-sovereign identity solve this problem? One of the main ways it does so is through data minimization. Consider, for instance, an online liquor store that requires patrons to be 21+ to purchase alcohol on its site. Historically, the store would have needed to keep documentation such as a scan of each patron’s driver’s license or passport as a means of age verification. However, in the SSI world, a patron of this site can simply hand over a verifiable presentation that contains no personally identifiable information, not even their date of birth! How? As we discussed in the earlier piece, instead of using a verifiable presentation that gives the site the user’s date of birth, they can simply give a presentation that verifies whether they’re over 21. Beyond data minimization, credentials also come with expiration times, which gives personal data something of a “shelf life”, unlike conventional data, which can sit in some database for eternity.
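As a rough illustration of the liquor store example, the Python sketch below has an issuer turn a date of birth into a signed yes/no claim with an expiration time, which the store then checks. It rests on loud assumptions: the function names and the credential layout are hypothetical, and an HMAC with a shared demo key stands in for the public-key signatures that real verifiable credentials use.

```python
import hashlib
import hmac
import json
from datetime import date, datetime, timedelta, timezone

# Assumption: a shared secret stands in for the issuer's signing key. Real
# verifiable credentials use public-key signatures, so a verifier never
# holds a key that could forge credentials.
ISSUER_KEY = b"demo-issuer-key"


def _sign(claims: dict) -> str:
    """Sign a canonical JSON encoding of the claims."""
    payload = json.dumps(claims, sort_keys=True).encode()
    return hmac.new(ISSUER_KEY, payload, hashlib.sha256).hexdigest()


def issue_age_credential(date_of_birth: date, now: datetime, ttl: timedelta) -> dict:
    """Issuer side: derive a yes/no claim from the date of birth, then sign it."""
    today = now.date()
    had_birthday = (today.month, today.day) >= (date_of_birth.month, date_of_birth.day)
    age = today.year - date_of_birth.year - (0 if had_birthday else 1)
    claims = {"over_21": age >= 21, "expires": (now + ttl).isoformat()}
    return {"claims": claims, "signature": _sign(claims)}


def verify_age_credential(credential: dict, now: datetime) -> bool:
    """Verifier side: check signature and expiry; the birth date is never seen."""
    claims = credential["claims"]
    if not hmac.compare_digest(_sign(claims), credential["signature"]):
        return False  # claims were altered after signing
    if now >= datetime.fromisoformat(claims["expires"]):
        return False  # the credential's shelf life has run out
    return claims["over_21"] is True


now = datetime(2020, 4, 20, 12, 0, tzinfo=timezone.utc)
credential = issue_age_credential(date(1990, 6, 1), now, ttl=timedelta(hours=1))
```

The store only ever receives the over_21 flag, an expiry time and a signature, so there is no birth date or license scan for it to find and delete later.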

As we’ve discussed in this piece, implementing the right to be forgotten can be a nightmare under traditional identity systems. From identifying a valid deletion request to identifying what constitutes “personal data” to actually deleting the data, compliance can be a real challenge. With fines of up to 4% of annual global turnover at stake, though, companies really can’t afford to misstep. Self-sovereign identity can make GDPR compliance substantially simpler through its credential-based model, allowing minimal data to be shared and held.

In the next piece, we’ll continue our deep dive into how SSI relates to regulation, looking specifically at Know Your Customer (KYC) laws. Stay tuned!
