The thing which bothers me the most isn't the fine structure — steep fines for bad behaviour are fine (pun intended) — it's the so-called 'right to be forgotten.' It essentially makes simple immutable logs illegal. I'm still not certain how we're going to handle it. Encrypting each record with a per-identifier key works, but that makes doing anything with the data as a whole prohibitively expensive.
And there's the whole question of what counts as identifying information. My understanding is that IP addresses do — so an immutable plaintext server log is now illegal.
Privacy is really, really important. But expanding the definition of privacy too far is just too much.
The GDPR allows for reasonable use cases like this. The principles are pretty broad and only stifling once you get into the realm of sharing data between organisations [0].
Such uses are "appropriate", but you would be expected to maintain this information securely, and you would not be allowed to use it for some other surreptitious purpose.
This actually enables further innovation: for instance, you can log a car's behaviour without the user fearing you might sell this information on to their insurance company.
What upsets people about the GDPR is that it does away with the notion of data being the holder's property and imposes obligations towards the subjects of that data.
Of course, in this day and age where "data is the new oil", this is upsetting for some.
At the end of the day, however, this demonstrates how regulation, apart from propping up inefficient markets, also serves to protect the public from rampant commercialisation of every aspect of their lives!
> The GDPR allows for reasonable use cases like this. The principles are pretty broad and only stifling once you get into the realm of sharing data between organisations [0]
The GDPR is rather ambiguous about use cases like this. That's part of the problem. Different national regulators are going to have to decide what they consider reasonable and how much they want to enforce. Unless and until they issue authoritative guidance, everyone is basically in the dark on what this really means, and the penalties for misjudgement could be severe if the regulator strongly disagreed with your interpretation.
As for sharing data between organisations, one of the more unwise (IMHO) aspects of the GDPR is that when it comes to the right to erasure, the obligations to notify other parties to which the data has been supplied have guard conditions about reasonable efforts, but there are no such conditions at all on the immediate obligations to delete the data. If you're not covered by one of the acceptable cases (where the scope is, as noted above, not entirely clear yet) then you must be able to delete the data on demand, regardless of the practicality or cost of doing so.
Yes, there’s lots of ambiguity, and the details will be hammered out over the coming years. You will be subject to audit, and noncompliances will be notified. Only the most egregious “bad faith” violators will suffer fines in the near term. I fully expect a couple of high-profile cases will be made to get the ball rolling.
> The GDPR allows for reasonable use cases like this.
It allows for simple logs. It does not allow for simple immutable logs, because your customers must always be able to force you to forget about them. There's nothing about sharing data between organisations: the GDPR requires — with certain very minor exceptions — that someone holding personally-identifying information delete it on request.
As an aside, this:
> This actually enables further innovation: for instance, you can log a car's behaviour without the user fearing you might sell this information on to their insurance company.
… isn't really true: the logger can still sell it, it just may not.
That’s wrong. Data protection legislation expressly forbids a “Data Controller” from sharing information relating to a “Data Subject” with any third party, consent notwithstanding. There is limited scope for a third-party “Data Processor” to operate on the data, but this is subject to strict conditions, including maintaining the integrity of the data subject's privacy. The GDPR elaborates on this and provides tougher enforcement mechanisms driven at Commission level rather than purely enacted by individual governments. This elimination of wiggle room is what has people running scared.
> > the logger can still sell it, it just may not.
> That’s wrong. Data protection legislation expressly forbids
I'm sorry, I assume that English is not your first language. 'Can' means 'is able to'; 'may' means 'is allowed to.' The GDPR does not allow selling something, but it's still perfectly possible for someone to do so. People break the law all the time.
I raised the issue of can vs. may because it's germane to the issue of immutability. It is possible to build a system in which it is mathematically impossible (within arbitrarily-determined limits) to alter data; it is not possible to design a system in which it is possible to prevent someone from telling someone else something he knows. It's odd to me that the GDPR bans the former, desirable system.
Thanks for clearing that up. You were unclear that you were speaking in terms of ability rather than legality. Legally you "can't" and "may not". To do so would be in breach of the law.
The GDPR expects Data Controllers to take reasonable steps to protect a data subject. There's no expectation that you anticipate all mathematical possibilities...
Log files are easy (from a design perspective). Keep identifying information (IP, UID, DoB, etc) in one table, and activity data in another, with a join linking the two.
When a user asks to be forgotten, simply overwrite all the identifying information with generated data. Backups work the same way, but depending on where and how those are stored it could get expensive.
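The two-table layout described above might look roughly like the sketch below. The table names, column names and placeholder values are assumptions made for illustration, and SQLite is used only because it ships with Python.

```python
import secrets
import sqlite3

# Hypothetical schema: identifying data in one table, activity in another,
# linked only by subject_id.
SCHEMA = """
CREATE TABLE identity (
    subject_id INTEGER PRIMARY KEY,
    ip  TEXT,
    uid TEXT,
    dob TEXT
);
CREATE TABLE activity (
    id         INTEGER PRIMARY KEY,
    subject_id INTEGER NOT NULL REFERENCES identity(subject_id),
    ts         TEXT NOT NULL,
    action     TEXT NOT NULL
);
"""

def open_store():
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn

def forget(conn, subject_id):
    # Replace every identifying column with generated, meaningless values.
    # The row and its key survive, so activity rows still join cleanly.
    conn.execute(
        "UPDATE identity SET ip = ?, uid = ?, dob = ? WHERE subject_id = ?",
        ("0.0.0.0", "erased-" + secrets.token_hex(8), "1900-01-01", subject_id),
    )
    conn.commit()

conn = open_store()
conn.execute("INSERT INTO identity VALUES (1, '192.0.2.17', 'alice', '1980-02-03')")
conn.execute(
    "INSERT INTO activity (subject_id, ts, action) "
    "VALUES (1, '2018-01-05T10:00:00Z', 'GET /index')"
)
forget(conn, 1)
row = conn.execute(
    "SELECT i.ip, i.uid, a.action FROM activity a JOIN identity i USING (subject_id)"
).fetchone()
print(row)  # activity is intact, identity columns are placeholders
```

Whether the remaining activity rows are themselves identifying in combination is a separate question that this sketch does not answer.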
We're separating personally identifying information out of our normal backups (held warm -> cold over time), and keeping PII backups offsite as warm data, so we can delete/amend relatively pain-free.
Yes. GDPR compliance will cost money and I'm not trivialising that. The degree of panic it's causing baffles me though. Sure, it requires design up front. Oh woe. But regulatory compliance always does. Ref. previous rending of hair and gnashing of teeth in the run-ups to Glass–Steagall, HIPAA, Sarbanes–Oxley, and so on.
> The degree of panic it's causing baffles me though.
I think it's a combination of two things. On the one hand, people are justifiably worried whether or not their design and implementation of necessary changes will be fully correct. The cost of getting that wrong will be huge, and laws are generally not known for being clean and unambiguous. On the other hand, you also have people panicking because their user-hostile practices are no longer legal, and so they'll lose money.
You're right, I think. If anyone worries about whether their design is correct or not it's because they've never treated PII with the respect it deserves. This is doubly so for entities engaging in user-hostile practices.
And it's a sad state of affairs.
~12 years ago, in 2005, Kim Cameron first published his 7 laws of identity[1]. Follow those and you'll ace GDPR. Instead we've been directing the sharpest, brightest and smartest minds on the planet at squeezing revenue from advertising, consequences be damned.
> If anyone worries about whether their design is correct or not it's because they've never treated PII with the respect it deserves.
Maybe that should be true, but it isn't.
As a personal example, let's look at one of my own businesses that is not even remotely in the aggressive advertising or other data mining space. The only PII we keep is basic records of how our systems are used, who our customers are, what financial things happened, email correspondence, and so on. Nothing even remotely controversial. Nothing that more or less any online business wouldn't have to keep, often due to legal requirements.
And yet in at least one case we're still worried, because having read the actual GDPR, it is far from clear that things like our backup and logging systems aren't technically in violation. We don't know exactly how we would react if a customer did suddenly request deletion of everything we held on them, because we don't know exactly what our legal obligations would be in that situation.
Clearly we couldn't just delete everything, because we'd be required to keep a lot of that data for legal and accounting reasons. So if nothing else, we'd then have to spend hundreds or even thousands of pounds on taking expert legal advice, and then potentially hundreds or thousands more to update our systems to remove whatever we had to remove.
This is a side business run by a few people, part-time, like many other small or microbusinesses all over Europe. We're not dealing with anything like the scale of personal data of the Facebooks and Googles of the world. Apparently our carefully designed systems, which only collect a reasonable amount of data for legitimate purposes and take care to handle it securely, are already doing significantly better than some of the biggest data hoarding organisations in the world.
And yet, the effort and overheads required to address GDPR could pose an existential threat to the business. Tell me how that isn't disproportionate and how it's all clear and been thought through.
Hey, I have no idea what your business is, but every case I've come across where people worry about GDPR is clear-cut.
As you say, you must absolutely hold onto transaction history for tax purposes, at least! That said, unless you're selling arms or something (if compliance with the law is ambiguous, go see a lawyer), you must honour an individual's right to be forgotten. You can do this without deleting records of transactions with that individual. You would, however, note that you can no longer identify the buyer because of request x on date y. That should be as simple as deleting the PII and then amending the transaction record with a comment.
Your biggest challenge here is how to handle the customer record. Do you delete it outright, and point the buyer field in the transaction record at a single GDPR ghost customer, or do you overwrite the customer data with random, non-PII values (I've always gone with the former)?
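For the first option (delete the customer and repoint transactions at a ghost), a minimal sketch could look like this. The schema, the reserved `GHOST_ID` value and the `erasure_note` column are all hypothetical; it shows only the mechanics, not what any tax or retention rule requires you to keep.

```python
import sqlite3

GHOST_ID = 0  # hypothetical reserved id for the single "ghost" customer

def open_shop():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, email TEXT);
        CREATE TABLE txn (
            id           INTEGER PRIMARY KEY,
            buyer_id     INTEGER NOT NULL REFERENCES customer(id),
            amount_cents INTEGER NOT NULL,
            erasure_note TEXT
        );
    """)
    conn.execute("INSERT INTO customer VALUES (?, 'erased customer', NULL)", (GHOST_ID,))
    return conn

def forget_customer(conn, customer_id, request_ref, request_date):
    if customer_id == GHOST_ID:
        raise ValueError("the ghost customer cannot be erased")
    note = f"buyer erased per request {request_ref} on {request_date}"
    with conn:  # one transaction: repoint the transactions, then delete the record
        conn.execute(
            "UPDATE txn SET buyer_id = ?, erasure_note = ? WHERE buyer_id = ?",
            (GHOST_ID, note, customer_id),
        )
        conn.execute("DELETE FROM customer WHERE id = ?", (customer_id,))

conn = open_shop()
conn.execute("INSERT INTO customer VALUES (7, 'Bob', 'bob@example.com')")
conn.execute("INSERT INTO txn (buyer_id, amount_cents) VALUES (7, 1999)")
forget_customer(conn, 7, "x", "2018-05-25")
print(conn.execute("SELECT buyer_id, amount_cents, erasure_note FROM txn").fetchall())
```

The appeal of the ghost approach is that no customer row remains to be audited for leftover PII; the cost is that all erased buyers become indistinguishable from one another.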
If your architecture is solid then these change requests should form part of ongoing BAU, no? Submit change request, develop, test, deploy...
> You can do this without deleting records of transactions with that individual. You would, however, note that you can no longer identify the buyer because of request x on date y.
Instant EU VAT law violation if you do this without retaining the required proof of customer location for at least 7 years. Formal tax investigation(s) to follow if you're big enough to be interesting and doing this systematically, possibly by up to 28 different member states' tax authorities depending on how much tax revenue each suspects that you have avoided paying them. Good luck. :-)
> If your architecture is solid then these change requests should form part of ongoing BAU, no? Submit change request, develop, test, deploy...
And again I come back to my recurring theme that for many smaller organisations, almost every term you mentioned in that paragraph might as well be written in a foreign language. Small organisations don't have dedicated staff or established formal processes for running a lot of their day-to-day activities, never mind things like regulatory compliance.
> Keep identifying information (IP, UID, DoB, etc) in one table, and activity data in another, with a join linking the two.
You just made the log no longer simple. Now writing a log line isn't a matter of writing bytes to disk; it's a matter of looking up PII in one table, reading a key, then writing the key and the non-PII to disk. If the definition of PII gets more expansive … whoops.
> The degree of panic it's causing baffles me though. Sure, it requires design up front. Oh woe. But regulatory compliance always does.
You answered your own question. Compliance costs money, and it's for what is IMHO a non-right. I do not have, and no-one has, a right to demand that my IP activity be erased. I do not have, and no-one has, a right to change history.
Much of the rest of the GDPR is wonderful stuff, but the so-called 'right to be forgotten' is Orwellian.
If it contains full IPs or other identifiers, then yes it does. You should either not collect full IPs in the first place, or clean them after some time, e.g. during logrotate (you might need them for a while, e.g. to detect or count abusive requests, which would likely be a valid reason to keep them around temporarily).
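Cleaning addresses at rotation time could be done with something like the sketch below. It assumes each log line starts with the client address (as in the common access-log layout), and the /24 and /64 prefix lengths are illustrative choices rather than anything a regulator has blessed.

```python
import ipaddress

def truncate_ip(text, v4_prefix=24, v6_prefix=64):
    """Zero the host part of an address, keeping only the network prefix."""
    addr = ipaddress.ip_address(text)
    prefix = v4_prefix if addr.version == 4 else v6_prefix
    net = ipaddress.ip_network(f"{addr}/{prefix}", strict=False)
    return str(net.network_address)

def scrub_line(line):
    """Truncate the leading client address of a log line, if there is one."""
    first, sep, rest = line.partition(" ")
    try:
        return truncate_ip(first) + sep + rest
    except ValueError:
        return line  # no leading IP address: leave the line untouched

print(scrub_line('192.0.2.17 - - "GET / HTTP/1.1" 200'))
# → 192.0.2.0 - - "GET / HTTP/1.1" 200
```

A rotation hook would run `scrub_line` over every line of the file being rotated out, so recent logs keep full addresses for abuse detection while older ones do not.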
> For HTTP logs, allowed use would be e.g. stripping the last octet of an IPv4 address, or stripping the last 64 to 80 bits of an IPv6 address.
> That’s generally not identifying a single person anymore, and usually good enough for anything else.
'Usually'? Even if true (highly doubtful), that's not the same as 'always.' The whole purpose of logs is to be truthful accounts of pertinent data. A full IP address is a pertinent datum.
I'm going to step up on my soapbox and assert that any law which forbids me from indelibly recording that 192.0.2.17 requested /all-your-records-are-belong-to-us is a bad law.
I'm banned from recording it immutably, which is the only proper way to record a log (it should be impossible to alter a log after it's written).
If I want to record that a particular address accessed my system forever, that is my right.
Interestingly, the GDPR exempts records required for legal compliance. So it's okay to hold onto data for the law's purposes, but not my own? That's a bit one-sided.
I'm still trying to wrap my head around the GDPR myself, but I think (and would love evidence to the contrary if wrong) that IP addresses, certain device attributes, and the like are okay if they're required in order to ensure (or protect against threats to) "the availability, authenticity, integrity and confidentiality of stored or transmitted personal data". [0]
I don't really know about relational data, where a true "delete" would destroy someone's database. Somebody told me that, if hard deletions aren't technically feasible, then soft-deleting (or marking as archived or whatever) is acceptable. Not too sure I believe them though -- simply replacing the sensitive data with garbage and keeping the relations seems like a better idea.
The UK guidance you mentioned there appears to apply to the DPA (the current legislation in this area) and not the GDPR (the new legislation with the stronger rights for deletion etc.).
huehehue was interested in what the GDPR might mean by erasure and deletion. This is the ICO's opinion about it. It'd be a surprise if they suddenly interpret erasure and deletion differently: There would have to be something in the GDPR that suggests that this isn't good enough.
Encrypt all person-specific data with a key unique to that person, and if the person requests deletion, delete the key. This effectively deletes the data from all backups too.
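The delete-the-key idea can be sketched as follows. To keep the example free of third-party packages, the "cipher" is a SHAKE-256 keystream XORed with the data; that is a stand-in chosen for illustration only, and a real system would use a vetted authenticated cipher. `KeyVault` and its methods are hypothetical names.

```python
import hashlib
import secrets

def _keystream(key, nonce, length):
    # Stand-in keystream for illustration only. NOT a vetted cipher.
    return hashlib.shake_256(key + nonce).digest(length)

def encrypt(key, plaintext):
    nonce = secrets.token_bytes(16)
    stream = _keystream(key, nonce, len(plaintext))
    return nonce + bytes(a ^ b for a, b in zip(plaintext, stream))

def decrypt(key, blob):
    nonce, body = blob[:16], blob[16:]
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))

class KeyVault:
    """One random key per person; deleting the key is the erasure."""

    def __init__(self):
        self._keys = {}

    def key_for(self, person_id):
        # Create the key on first use.
        if person_id not in self._keys:
            self._keys[person_id] = secrets.token_bytes(32)
        return self._keys[person_id]

    def lookup(self, person_id):
        return self._keys[person_id]  # raises KeyError once shredded

    def shred(self, person_id):
        self._keys.pop(person_id, None)

vault = KeyVault()
blob = encrypt(vault.key_for("alice"), b"alice@example.com viewed /pricing")
print(decrypt(vault.lookup("alice"), blob))  # readable while the key exists
vault.shred("alice")                         # erasure request: drop the key only
try:
    vault.lookup("alice")
except KeyError:
    print("key gone; every stored copy of the blob is now unreadable")
```

The scheme only works if the vault itself is kept out of long-lived backups; otherwise restoring an old vault restores the ability to read the data.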
For many businesses that won't be straightforward, it will be rocket science.
Even if your business does have the understanding and technical ability to do it, it adds a whole extra layer of complexity and unreliability to what used to be simple, plain text logs -- exactly the kind of information you probably need to access quickly and reliably if you're in the middle of fixing a major fault, for example.
> Even if your business does have the understanding and technical ability to do it, it adds a whole extra layer of complexity and unreliability to what used to be simple, plain text logs -- exactly the kind of information you probably need to access quickly and reliably if you're in the middle of fixing a major fault, for example.
But do you need to keep them forever? If you delete logs after 30 days you are unlikely to be impacted anyways.
We have server logs going back for years. Moreover, those older server logs have provided valuable information on several occasions for detecting abuse of our systems, attempted fraud, etc, so they are demonstrably useful for legitimate business purposes.
We also have backups going back on a staged basis, more frequent backups from the recent past, less frequent going back further. This has been useful more than once for retrieving older information that someone had accidentally modified or deleted and not noticed immediately, so is also demonstrably useful for legitimate business purposes.
Both of these appear to be at risk of conflict with the right to erasure under GDPR, in that for example old backups of emails will inevitably contain customer correspondence that can't readily be isolated.
How can you possibly know that without knowing anything about our business, what our logs contain, or what kinds of threats we face where evidence from the logs supports us in detecting abuse, countering formal disputes, or even legal proceedings?
The trouble is that the definition of PII could be interpreted so broadly as to include almost anything useful ever logged on a server, because if a log record references any data that could be associated with a specific individual, including in combination with other data, then it counts. Given what we already know about de-anonymisation of supposedly anonymous data sets, even just based on quite simple patterns and correlations in the data, any approach based on pseudonymisation is likely to be simplistic and open to challenge.
In any case, what is reasonable for a business to want to do with that sort of data? Can we analyse which content has been most popular on a web site over the past week/month/year/decade? Can we analyse which content a particular paying customer has been accessing, in order to promote a new plan as they approach a limit or suggest a more cost-effective one if they aren't using what they're paying for right now? Can we analyse access patterns for an account over a period of time to detect shared use of an account contrary to our terms? There are plenty of legitimate business purposes for which you might want to know the entire history of access for a particular account and which do not violate the privacy of the user in any unjustified or unfairly exploitative way, but again, any method based on log pseudonymisation will immediately fail to satisfy those requirements.
Nothing about this is even remotely as clear and straightforward as some in this discussion are suggesting.
That isn’t easy from a technical perspective. The whole point of a backup is to restore to a state from a previous point in time... now they want future state (the “right to be forgotten” request) to modify how a backup should be restored?
No. The whole point of backups is to protect against hardware failures and unforeseen software bugs/defects.
If someone sends you something in the post telling you to delete their data, it should not be difficult to simply keep that record available when you need to restore an unrelated machine's backup, unless you have incompetent IT.
This is part of the reason many GDPR consultants suggest applying an IT standard like ISO, so that it’s clear you already know what backups are for and how to use them.
A single machine/database with all your business data on it mixed with personal data and the only record of requests from subjects is criminally irresponsible.
> The whole point of backups is to protect against hardware failures and unforeseen software bugs/defects.
And to then restore to a point in time before the software or hardware bugs occurred, no?
> A single machine/database with all your business data on it mixed with personal data and the only record of requests from subjects is criminally irresponsible.
Um, what? There are a ton of good reasons to stick to a single DB; it’s not unreasonable at all. The main one is that it massively simplifies ensuring data integrity compared to a distributed system. Also, it’s just easier for a small team to manage.
A single point of failure isn't a good way to protect people's personal data that you have, but even small teams are unlikely to have a single golden source of data. Data tends to flow from one system into another.
For example:
If someone sends you an email asking to be deleted, why would you store that email in the DB and nowhere else? Your team would read the email then update the database.
After a crash, if you then restored a backup, why wouldn't you be able to review your last day's email (since the last backup) for any updates? Of course you would. Even small teams could.
> If someone sends you something in the post telling you to delete their data, it should not be difficult to simply keep that record available when you need to restore an unrelated machine's backup, unless you have incompetent IT.
"Who's IT?" -- Small and micro-businesses across Europe
> This is part of the reason many GDPR consultants suggest applying an IT standard like ISO, so that it’s clear you already know what backups are for and how to use them.
"We can't even afford a GDPR consultant. And who's IT?" -- Small and micro-businesses across Europe
> A single machine/database with all your business data on it mixed with personal data and the only record of requests from subjects is criminally irresponsible.
Again, this is simply detached from the reality of many, many small and microbusinesses, community organisations, non-profits, etc.
It's absurd to think that every organisation that has reasonable need to handle personal data has a dedicated IT team, the people and budget to drop a few thousand on professional advice, the budget to buy extra equipment that isn't otherwise needed, etc.
It's also absurd to think that without those things, an organisation can't take reasonable precautions to protect that data or prevent its exploitation for illegitimate purposes.
Unfortunately the GDPR (as often happens with EU legislation) makes little allowance for any of this and thus imposes absurdly disproportionate overheads on smaller organisations.
A flower shop owner who runs his own website off a laptop in his shop, unattended but plugged into his DSL, using Microsoft Personal Web Server and Windows 98, and who records customer data in that form without explaining to his customers that he's a fucking idiot, deserves what he gets. It's a shame perhaps that the European courts probably won't bother with him.
But the photographer that hosts her website and help forms on Wix[1] isn't going to have any problems. The local restaurant using ViewTouch PoS and a Facebook page isn't going to have any problems[2].
Very few real people will have any problems, because the GDPR is fundamentally about tradeoffs: If you have a lot of personal data, you need to protect it and inform the subject what you do with it. If you can't afford to protect it and think that the subject wouldn't want you doing what you're doing with their data, then you probably shouldn't have that personal data.
Sure. And if that same human error can also delete all the email and postal mail that Mr. Smith sent to your company, then your IT team is criminally irresponsible.
Do IPv6 addresses count as identifiable? Because they really shouldn't if you understand how SLAAC works with privacy extensions (the default in almost every client stack).
Right now I always get the same IPv4 address, which uniquely identifies my traffic. After a switch to IPv6, I'll likely always get the same /64 or /56 subnet, that uniquely identifies my traffic. Randomizing some of the address doesn't help if another part of it is already enough to identify.
Most people don't always get the same IPv4, they get a random one from the ISP, though it is still identifying.
IIRC zeroing the last one or two octets of an IPv4 address, and everything beyond the /56 or /48 prefix of an IPv6 address, should be fine.
Alternatively you can use a short hash function (xxhash32 cut to 16 bits for IPv4, or plain xxhash64) to effectively destroy most of the data but still get a good pseudo-identifier.
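A rough sketch of the short-hash idea follows. xxhash is a third-party package, so this version swaps in BLAKE2b from the standard library with the digest cut to the same widths (16 bits for IPv4, 64 bits for IPv6); the function name `pseudo_id` is made up for the example.

```python
import hashlib
import ipaddress

def pseudo_id(text):
    """Map an address to a short, deterministic pseudo-identifier."""
    addr = ipaddress.ip_address(text)
    size = 2 if addr.version == 4 else 8  # digest bytes: 16 bits or 64 bits
    # Hash the packed bytes so different spellings of one address agree.
    return hashlib.blake2b(addr.packed, digest_size=size).hexdigest()

print(pseudo_id("192.0.2.17"))   # 4 hex characters
print(pseudo_id("2001:db8::1"))  # 16 hex characters
```

With 2^32 IPv4 addresses squeezed into 2^16 values, tens of thousands of addresses share each identifier on average, which is what destroys most of the information. The 64-bit IPv6 digest is far less lossy, so it behaves more like a stable pseudonym than an anonymised value.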
It identifies your network segment, but not you as an individual. I guess if you lived alone that could only be you, but it would only identify your household in a family situation. One could make the argument either way.
It ultimately depends on what you’re doing (and can do) with those IPv6 addresses.
If you can identify someone then it probably is. That’s for example if you have a web form with their details filled in— just make sure they know you’re keeping their IPv6 address for the avoidance of doubt.
If you can give it to someone like their ISP, and they can identify someone for you, then it might be: especially if you have a relationship with that ISP where you do that.
Honestly I think the GDPR is a great idea; unfortunately, after going through 3 different firms who are supposed experts on the GDPR, we are still struggling. When you ask them about a possible situation, the response you get ranges from "that is kind of a gray area" to "that is open to interpretation".
For example, nobody has been able to give me a clear definition of what counts as identifying information. I have heard IP addresses count, so what about our CloudFront logs? They have the client IP address. If I pipe them straight to a logging SaaS, is it on the logging service to handle it? Or is it on me?
Furthermore, for the right to be forgotten, how do I handle that with backups that are not under our control? E.g. AWS RDS backups? Amazon says there is no way to modify those backups, and our TAM has suggested managing our backups ourselves.
What about data that our customers pipe to our service? How is that handled by the GDPR? It is all encrypted. Apparently we still need to handle such cases, but we don't have the decryption keys. Is it on the customer who uses our SaaS, or is it on us? How do these scenarios work?
While I feel more and more that we are ready for the GDPR, at the same time I am terrified. I feel many of the laws can be interpreted in far too many different ways, which makes me uncomfortable.
Allowing the court to decide makes it possible to interpret degrees of wrongness and malice of intent.
IP addresses may be personal data if you have a database containing personal data (like a web form) and the IP address. Don’t have that database, or only make it accessible to your firm and processors to verify data integrity/protect against fraud: no problem. Dropping the IP column from that table after a month is probably easy enough and good enough.
The GDPR also doesn’t mandate encryption. It says you’re responsible if you get hacked and could’ve prevented it. Applying an IT security standard (eg ISO) is just one (probably easily) defensible way to do it. Not getting hacked is another.
Your consultants should fully understand your business and data flows. Individual things taken in isolation are a “grey area”, but the European courts, if they decide to give your company attention, will look at them through the lens of your whole company, its actions and intent, weighed against the value to the subjects themselves (not just your customers); they will not nitpick.
The GDPR strikes me as an ur-example of regulation gone wrong. If you read some of the later posts in this series, one thing they say is that "the regulation can't be that onerous because this is all best practices anyways." But the reason the regulations are onerous is that what falls under their purview is indefinite, and the regulators don't want to clarify, so that they can use the ambiguities to punch the punching bags when the opportunity arises. So the risk-averse have to assume that the requirements stretch to the truly insane (e.g., requiring an email for anti-spam blog commenting purposes triggering these provisions), with the only guidance really being "we're not really targeting you; we just care about the punching bags."
> If I pipe them straight to a logging SaaS, is it on the logging service to handle it? Or is it on me?
That’s not gray at all. That’s very clear-cut. The SaaS acts as a data processor if it gets IP addresses or other PII.
> Furthermore, for the right to be forgotten, how do I handle that with backups that are not under our control?
You need to ensure PII in your database is already encrypted with a per-customer key, and have these keys backed up separately with a short retention period.
The GDPR does mention encryption and pseudonymisation. It does not mention IP addresses because the law is very clear that anything that can be used to identify individuals is in scope.
We had very long discussions with our lawyers to make our email analytics solution [1] compliant with the very strict German privacy laws and the GDPR.
IP addresses are personally identifiable information because someone, namely the ISP, is able to find the real person an IP address belongs to. That you have no way to get them to hand this information over to you, and that it wouldn't be legal anyway, makes no difference. It's a big joke.
The solution is to anonymize the last octet of the IP address. You can (within some boundaries) still work with the full IP address.
In your case you would technically be violating these rules, because you are saving personally identifiable information without your customer acknowledging this beforehand. And you even transmit this information to other services. The joke is, this is how the internet works, and this rule is practically not enforceable for everyone (we solved it for our solution). The least you could do to be a little bit compliant is to state in your policies that you are logging the customer's IP address, and to tell them every service you transmit this information to, for transparency reasons.
For the backup case with AWS RDS, it's again not practical. The law was not made by technicians. The simple answer is that everyone is violating the law. In my opinion it will take 2-3 years, and then all these very openly stated rules will be clarified and in the end will be more logical.
Until that day, you could store a record of which information should be deleted in a second database, and if AWS needs to restore your main database, you use the second database's records to delete that information again.
You can then state that it's technically not feasible to delete some information from backups, and that you did your best. But then you should not keep backups for 10 years; they should have much shorter retention periods.
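The second-database approach can be sketched as a small ledger of erasure requests that is replayed after every restore. Everything here (table names, the `customer` schema, the function names) is hypothetical, and both databases are in-memory SQLite purely so the example is self-contained; in practice the ledger would live somewhere that is not restored together with the main database.

```python
import sqlite3

def open_ledger():
    ledger = sqlite3.connect(":memory:")
    ledger.execute(
        "CREATE TABLE erasure (customer_id INTEGER PRIMARY KEY, requested_on TEXT NOT NULL)"
    )
    return ledger

def record_erasure(ledger, customer_id, requested_on):
    with ledger:
        ledger.execute(
            "INSERT OR IGNORE INTO erasure VALUES (?, ?)", (customer_id, requested_on)
        )

def reapply_erasures(ledger, main):
    """Run after restoring `main` from a backup; returns how many requests were replayed."""
    ids = [row[0] for row in ledger.execute("SELECT customer_id FROM erasure")]
    with main:
        main.executemany("DELETE FROM customer WHERE id = ?", [(i,) for i in ids])
    return len(ids)

# Simulate a restore: the old backup still contains customer 1, who asked to be erased.
ledger = open_ledger()
record_erasure(ledger, 1, "2018-05-25")
restored = sqlite3.connect(":memory:")
restored.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, email TEXT)")
restored.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(1, "alice@example.com"), (2, "bob@example.com")],
)
reapply_erasures(ledger, restored)
print(restored.execute("SELECT id FROM customer").fetchall())
```

The ledger deliberately stores only an internal id and a date, so that the record of who asked to be forgotten does not itself hold the data they asked you to forget.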
> Honestly I think the GDPR is a great idea; unfortunately, after going through 3 different firms who are supposed experts on the GDPR, we are still struggling. When you ask them about a possible situation, the response you get ranges from "that is kind of a gray area" to "that is open to interpretation".
In the USA, this would be an opportunity for entryism by opportunistic outside consultants who would propose all manner of insane policies, and hang the threat of noncompliance over management's head like a Sword of Damocles if they are not followed. See Sarbanes-Oxley compliance for a relevant comparison and note that SOX is probably one reason why we've seen a precipitous drop in IPO rate in recent decades.
But the US regulatory infrastructure is, compared to civilized countries, from fucking Mars, so I have no idea if it will hold for GDPR.
Unfortunately, the standard MO for the EU is to hang the Sword of Damocles over the head of small organisations itself by passing regulations obviously aimed at problems that normally come with much larger organisations. It does it with VAT. It does it with the consumer protection rules. Now it's doing it with the privacy and data protection rules. The moderating effect is that organisations too small to have dedicated in-house staff to deal with these kinds of issues are probably too small for resource-constrained regulators to bother with.
If you're responsible for one of those small organisations, your choices are typically to try to stay on the right side of the law but in doing so accept disproportionate overheads for something that probably won't ever matter, or not to try, not to incur the overheads, and to hope that you don't get caught and penalised, which realistically you probably won't unless something happens that is so catastrophic that your organisation has already ceased to exist anyway.
As far as smaller businesses are concerned, these kinds of rules hand a big competitive advantage to those who don't make a good faith attempt to follow them, at the expense of those who try to do the right thing legally speaking. As someone who is generally a strong believer in doing the right thing in business, and in fair consumer protections and strong privacy rights from a personal point of view, I find this outcome infuriating.
I haven't really dug into it being in the USA, but from what I've heard the big takeaway for my clients might be: "You can never again accept a European as a patient, and for your own protection you might need to dismiss any European patients you have." I'm not really worried about legitimate cases where someone wants their data removed, as that seems unlikely. I'm less sure about the risks of some enterprising young lawyer over there finding a way (or a possible way that would end up thrown out) to leverage it to extort money.
Read at its bluntest, as I understand it this law seems to basically ban backups, at least those retained for any significant amount of time. Have a customer/patient database on a cycle of daily/weekly/monthly backups that get pruned as they get older? You need to be able to purge any individual who requests it from those backups. Have regulatory or contractual requirements on keeping records? Well, that's what the courts are for, hope you like the attorneys you're going to be paying. Heck, for medical practices I've heard of insurance companies doing clawbacks of payments years after the date of service - what does a practice do if it serves a European with temporary insurance, gets a request to purge that person's data, then gets a request for clarification from an insurance company?
Perhaps this is a complete non-issue for US companies, but I'm not so certain of that that I'd be willing to just dismiss it out of hand and if it's not dismissed then parts of it are pretty scary.
I’m currently doing GDPR consulting for an American company.
The GDPR doesn’t ban backups. You must inform the subject of how long you keep those backups though. If payment processing stays open for years, then years is fine. I have a customer who keeps data for ten years.
The GDPR doesn’t say that a European can claim “right to be forgotten” and force you to destroy invoices proving you did work for hire.
If for some reason they want you to stop processing (right to object) then they end up owing the money themselves: since they can’t force you to forget the invoice, having you unable to send their invoice to insurance sounds like a bad move on their part.
Reading the GDPR “at its bluntest” is not the correct way to do things: European courts are not amused by a narrow view of the law when it is used to argue something is technically legal, nor do they use their maximum force for minor infractions.
That’s why you don’t need to worry about an “enterprising” European taking you to the cleaners because the courts won’t think it’s funny either.
It also means that if you know what personal data you have, what you do with it, disclose to the subject those things including who specifically you share it with, and protect that personal data like your business depends on it, then you’ll be fine because the point of the GDPR is that this is what you should be doing.
> The GDPR doesn’t ban backups. You must inform the subject of how long you keep those backups though.
And if you use a deduplicating backup system -- and you may have few viable alternatives at small scale -- the answer to that latter part will be "indefinitely". So yes, the GDPR does effectively ban backups, at least in this sort of scenario.
> Reading the GDPR “at its bluntest” is not the correct way to do things
And yet there is nothing else you can reasonably do in the absence of more specific, enduring and legally binding guidance from your national regulator, and even that guidance is at risk of being overturned in court at European level.
> It also means that if you know what personal data you have, what you do with it, disclose to the subject those things including who specifically you share it with, and protect that personal data like your business depends on it, then you’ll be fine because the point of the GDPR is that this is what you should be doing.
Nope, because the right to erasure is not negated by that sort of common sense, only by the specific criteria listed explicitly in the GDPR. And as things stand right now, that right to erasure immediately runs into all the issues of uncertainty raised by many in this discussion and other forums.
> And if you use a deduplicating backup system -- and you may have few viable alternatives at small scale -- the answer to that latter part will be "indefinitely". So yes, the GDPR does effectively ban backups, at least in this sort of scenario.
If the blocks are deduplicated because the reference count of the bit map remains greater than one then I don't see the problem: Just because John Smith can ask you to delete him from your database doesn't mean he can ask you to delete all John Smiths.
Also: European courts aren't amused by this kind of nonsense.
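To make the reference-count point concrete, here is a minimal Python sketch of a toy deduplicating store. Everything in it (the `DedupStore` class, its method names, the record ids) is hypothetical and is not how any particular backup product works; it only shows that erasing one person's record drops a reference, while the shared block survives as long as another record still points at it.

```python
import hashlib
from collections import Counter


class DedupStore:
    """Toy content-addressed store; hypothetical, for illustration only."""

    def __init__(self):
        self.blocks = {}            # digest -> block contents
        self.refcount = Counter()   # digest -> number of records using the block
        self.records = {}           # record id -> digest

    def put(self, record_id, data):
        if record_id in self.records:
            self.erase(record_id)   # overwriting releases the old reference first
        digest = hashlib.sha256(data).hexdigest()
        self.blocks.setdefault(digest, data)
        self.refcount[digest] += 1
        self.records[record_id] = digest

    def erase(self, record_id):
        """Forget one record; free the block only when nobody else uses it."""
        digest = self.records.pop(record_id)
        self.refcount[digest] -= 1
        if self.refcount[digest] == 0:
            del self.refcount[digest]
            del self.blocks[digest]


store = DedupStore()
store.put("patient-1", b"John Smith")
store.put("patient-2", b"John Smith")   # identical content, stored once
store.erase("patient-1")                # patient-1 is gone, block stays for patient-2
```

Once the last record pointing at a block is erased, the block itself goes too, so in this toy model nothing has to be kept "indefinitely" just because of deduplication.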
> And yet there is nothing else you can reasonably do in the absence of more specific, enduring and legally binding guidance from your national regulator, and even that guidance is at risk of being overturned in court at European level.
Sure you can. You can look to the historical application of various other similar laws and get an idea of what the regulators are likely to do.
For what it's worth: European courts cannot be legally bound to do anything. If they think you're being an asshole but are "technically within the law" they'll still fine you.
That's just how it works here: Hurting people is simply not ok.
> Nope, because the right to erasure is not negated because of those sorts of common sense
It's not "common sense" at all: Being afraid of the government and the police isn't something civilised countries have to deal with.
Why would anyone think that the government is trying to hurt people if everyone agrees hurting people is not okay?
The at least theoretical problem I face is that there are potentially conflicting laws. Sure, that's why courts exist, but courts are expensive. That's why patent trolls try for "settlements" of less than $50k - it's cheaper than defending a case.
Theoretical case: 5-doctor pediatric practice running an on-site EMR. Backups include fairly short-duration full images for ease and speed of disaster recovery, plus longer-term file and database backups such that basically no scanned document or received fax will ever be lost from the backups and end-of-month database backups will be retained indefinitely. Beyond a year or two it will be possible for incremental changes to be lost if multiple changes happen within a single month.
The practice is required by US law to retain some of that data for years (7, 10, until age 18, 7 years after 18, whatever) or indefinitely (e.g. vaccination records) even if/when the patient leaves the practice. With paper charts that meant that inactive charts went into long-term storage, then eventually were shredded after the expiration of that time period. I'm not sure how vaccination records were kept, but I suspect there were either copies both in the chart and in a separate vaccination-only list or that they were pulled from the charts either before storage or before shredding. There were no backups.
With EMRs, there's no good way to pull a single "chart" out of those backups. There are also not separate systems for tracking things depending on how long they have to be retained. In the case of a pediatrician's office then the answer to "how long do you keep the records" is "until the practice closes, and at that point they must be passed along to another practice."
As for the case of destroying invoices, it's not that simple. The "invoice" would be the bill sent to the insurance company or patient including diagnostic and procedure codes, but the insurance companies may come back even years later for "chart review" on charges they've paid and in that case they're not looking for the codes they're looking for the supporting documentation in the patient chart.
Finally, regarding "enterprising" attorneys wasn't there a whole situation some years back where third party attorneys in Germany were going after alleged copyright infringers "on behalf of" companies that had never heard of them? My Google-fu failed me on this, but I didn't spend a huge amount of time looking. Is there a comparable risk here? I'm not worried about someone actually getting hit with a 10 million Euro fine, I'm more worried about that annoyance-factor "Remove my client's data from the data that you are required by law to keep, come to Europe to defend your failure to do so, or pay my client to abandon this action."
> That's why patent trolls try for "settlements" of less than $50k - it's cheaper than defending a case
FYI: In Europe, if someone sues you and loses, you can usually recover your costs.
> Theoretical case: 5-doctor pediatric practice running an on-site EMR. Backups include fairly short-duration full images for ease and speed of disaster recovery, plus longer-term file and database backups such that basically no scanned document or received fax will ever be lost from the backups and end-of-month database backups will be retained indefinitely. Beyond a year or two it will be possible for incremental changes to be lost if multiple changes happen within a single month.
There's no need to worry about this. Simply record a list of who wants to be forgotten and consult it when restoring from backup.
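A rough sketch of what that could look like in Python, assuming each restored row carries some stable subject identifier. The `subject_id` field, the `restore_rows` function and the sample data are all made up for illustration; a real EMR restore would look different.

```python
def restore_rows(backup_rows, erasure_list):
    """Yield backup rows, skipping subjects who asked to be forgotten."""
    forgotten = set(erasure_list)
    for row in backup_rows:
        if row["subject_id"] not in forgotten:
            yield row


backup = [
    {"subject_id": 101, "note": "vaccination record"},
    {"subject_id": 102, "note": "scanned fax"},
    {"subject_id": 101, "note": "visit summary"},
]
erasure_list = [101]   # kept outside the backups, as opaque ids only

restored = list(restore_rows(backup, erasure_list))
```

The list itself has to survive outside the backups, so it makes sense to keep nothing in it beyond the identifier.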
> The practice is required by US law to retain some of that data for years (7, 10, until age 18, 7 years after 18, whatever) or indefinitely (e.g. vaccination records) even if/when the patient leaves the practice.
Explain to the Europeans that visit your on-site EMR that you are required by law to keep data in the way you just explained it.
> With EMRs, there's no good way to pull a single "chart" out of those backups. There are also not separate systems for tracking things depending on how long they have to be retained. In the case of a pediatrician's office then the answer to "how long do you keep the records" is "until the practice closes, and at that point they must be passed along to another practice."
That's perfectly fine.
Make it a process to contact the Europeans every ten years to make sure they haven't moved to another practice. If you can't reach them, then discard what you're legally permitted to discard. Publish this policy.
> As for the case of destroying invoices, it's not that simple. The "invoice" would be the bill sent to the insurance company or patient including diagnostic and procedure codes, but the insurance companies may come back even years later for "chart review" on charges they've paid and in that case they're not looking for the codes they're looking for the supporting documentation in the patient chart.
You're not required to destroy invoices under the GDPR.
> Finally, regarding "enterprising" attorneys wasn't there a whole situation some years back where third party attorneys in Germany were going after alleged copyright infringers "on behalf of" companies that had never heard of them? My Google-fu failed me on this, but I didn't spend a huge amount of time looking. Is there a comparable risk here? I'm not worried about someone actually getting hit with a 10 million Euro fine, I'm more worried about that annoyance-factor "Remove my client's data from the data that you are required by law to keep, come to Europe to defend your failure to do so, or pay my client to abandon this action."
You would ignore such a claim.
In a legitimate claim, the person would complain to the regulators and the regulator would decide what action to take.
An invoice is either a contractual or legal obligation, which takes precedence over the data subject rights. Even if you are asked to destroy the subject's personal information, the invoice should never be included in that; destroying it would itself be against the law.
Eh... that's the voice of one regulator and you never know who may decide to be more draconian with the regulations. My suggestion is to read it yourself and think carefully about the various data protection and portability requirements. IMO, all it takes is a small band of trolls who want to harm your company to start to cause problems. Here's a clear way to read through it: https://www.privacy-regulation.eu/en/index.htm
> IMO, all it takes is a small band of trolls who want to harm your company to start to cause problems.
I suspect strongly the courts will use their energy to go after the biggest most flagrant violators of European law first: the equifaxes, facebooks, googles, and so on.
The fines are there for those companies who continue to have a flagrant disregard for IT security practices, or who try to make personal data their business by trying to trick the subject instead of offering real value in trade.
If you’re not trying to play legal chicken with privacy law, you know what personal data you have and protect it like your business depends on it, then you’re probably going to be fine.
The facebooks and googles can afford to follow GDPR. If you know some people that work there, you'll realize there have been multi-year efforts to become GDPR compliant in those companies.
It's the small companies who literally can't afford the staff to properly implement the GDPR. It looks like it's turning into a regulatory barrier that protects the big guys and scares the small guys away.
There is nothing in the GDPR that says: "if you're a small < 5 person corp not in Europe that isn't a subsidiary of another corporation, then the GDPR does not apply (ie: Joe's blog, Sue's landscaping)". Or "If your revenues & employee counts are within this band, then the maximum fine can only be applied on a percentage basis and not millions.". If these regulators really didn't want to scare away the small guys, they would have put that into the law.
"Don't worry, the european courts are nice people" is not a strong enough guarantee for most people and corporations.
“Following the GDPR” is easy for most companies because the amount of personal data they keep and have at risk is extremely low. The ICO has some great guidance for them:
> "Don't worry, the european courts are nice people" is not a strong enough guarantee for most people and corporations.
The GDPR is already law, and people and corporations are still here.
That’s how courts work in Europe. They can fine you for hurting consumers or putting them at risk even if the specific actions you’re taking aren’t necessarily coded and specified.
As an American living here I understand how strange that can seem, but the advantage is that companies getting prepared for the GDPR are treating the personal data they keep as a liability instead of treating it as an asset.
This kind of subjective enforcement annoys me. In the meantime, in my risk assessment I have to weigh the largest possible risk. It is quite strange how often many EU laws are defended because of enforcement and not the letter of the law.
If they are only going after the most flagrant, codify it. If they are not going to put SMBs out of business with a maximum fine, codify it. This article is a "trust us", which I abhor in lieu of more specific punishment rules. The rest of us running SMBs and not needing more customers in the EU are just not going to gamble with what side of the bed the punishers wake up on.
It is very difficult in my experience to make lawmakers and lawyers experts in digital marketing, telesales, medical processing, machine learning, banking and anti-money laundering and so on, so a lot of our regulation is just “do your best” and “don’t be a dick” anyway. This works remarkably well: CEOs of billion dollar companies are asking if their IT/IS systems are actually secure, which is a massive improvement over what I used to see: “are we compliant?”
Well Equifax was probably compliant.
I much prefer companies seek more than sheer profit, and to have restraint when it comes to putting people's lives and livelihoods at risk.
> (we) are just not going to gamble with what side of the bed the punishers wake up on.
With all due respect, unless you go to war (which has different risks), or you ask everyone if they’re European when they reach your website (ie before you’ve given them a cookie or logged their data in your access logs; or telephone and so on) you’re going to have to. The GDPR is already law.
The good news is that European law is rather fair, and if you’re not dealing with personal data as a product you probably aren’t going to be affected at all.
One thing I don't know is: Is this a law? Is there judicial oversight? Is there judicial approval first?
If this is something that a motivated executive can apply unilaterally, it's bad.
If this is something that can only be applied after a judicial system rules on compliance and gives you the opportunity to fix the issue, then it's probably okay.
GDPR has been in "preview mode" for two years but it is currently the law in many countries. It will start being enforced in May so you already have had two years to fix the issue if I read you correctly.
Extraterritoriality is new. Most websites are probably not in the business of complying with the law in random EU countries where they have no legal or logistical presence. HN, for example, doesn’t.
I'm as Remain as the next person but I think extraterritoriality is a really bad idea whatever the circumstances. But it's not as if this is the first instance of extraterritorial law, and it's not like the US doesn't tend to assume its law applies everywhere too (e.g. gambling).
To me it seems to have parallels with tax law and AML. If it's not extraterritorial, a big business of tax havens and money laundering springs up. The extraterritorial GDPR rules are there to prevent "data laundering" similarly.
Well, let's not pretend that it will stop EU countries from trying to engage in this glorified rent seeking exercise cum morality play against companies everywhere.
Take the case where CNIL told my company to remove information we had collected on our site. We are not based in the EU nor have any assets they can exert any influence over in the EU.
We promptly ignored the request. But we did exchange some private words amongst ourselves about the futility of such a request.
It doesn’t matter if you are based in the EU or not. You are already required to register for VAT as a non EU company and that’s one of the many ways in. Privacy Shield is another.
> You are already required to register for VAT as a non EU company and that’s one of the many ways in.
By that logic, any EU company is also subject to the tax laws of every other jurisdiction in the world, whether or not it has any presence there. No doubt the governments of all the other countries seeking to impose sales taxes and the like would be in favour of that. Anyone hoping to start a small business selling online would probably take a different view, though, so if you want people to be able to start new businesses, your national governments in the EU had better be on their side on this one.
You are still legally required. Pretending that it’s okay to ignore your legal obligations because you think they have no leverage does not seem like a sensible thing to do. Neither morally nor legally unless you are willing to write off the EU market.
As a non-EU person or company with no legal assets in the EU, even if you or any EU institution says this, there are no repercussions from us continuing to engage in the activities (collecting data people willingly put into the open) because there is no ability for the EU to enforce these laws upon us, short of arresting us outside of the EU's borders.
In order to stop us technically, the EU must pass laws that forbid their citizens from giving information to platforms of various forms that could be used in various ways to amass data on them, and that forbid their citizens from accessing such information on sites like ours, which mine crypto based on the time visitors spend on the site engaging with such information.
>are willing to write off the EU market.
We don't discriminate against our site visitors; we want to collect, correlate, cross-reference, and process it all and put it into the public to be used however people everywhere see fit, just like governments do today!
There are years' worth of back taxes for failure to pay VAT. If you think you can skirt it forever that might work for you. Very few companies are like that though and even small companies I know did not fare well with this approach. It’s your responsibility as a business not to accept customers from countries you cannot do business with.
//edit:
> We don't discriminate against our site visitors; we want to collect, correlate, cross-reference, and process it all and put it into the public to be used however people everywhere see fit, just like governments do today!
>Someone will eventually drag the topic to your door.
This has already happened, and I expect it will continue to happen as long as people engage in behaviors contrary to their signaled beliefs.
>European end users care and so do businesses doing business with them.
If Europeans visiting our site is now considered doing business with them, then this is an example of European user behaviors acting contrary to their signaled beliefs.
"So right away, let us cast aside the technological protocols, that are usually referred to as “the internet”, that of which was built upon that make accessing or publishing information public between two or more machines…
Because talking about such things would require most internet users to cast aside social constructs they willingly suspend on a daily basis upon engaging with such technology/services (without any care to understand for oneself, one might add) and then demand collectively in retrospect to have their cries pacified while continuing to use such services (of which, most for free).
Yup, let us look past all that and believe (because that’s all we can do for ourselves) that institutions/organizations/companies/governments, that all consist of our fellow human beings in all of our qualities and flaws, can provide for the individual that which he chooses not to do for himself, to a satisfactory level in which his desires are forever coddled and placated." "The Banality of Privacy As A Service" [0]
Processing data about political views is not permitted under GDPR without prior approval that HN certainly doesn’t have. Your position commits you to supporting evisceration of YCombinator by the EU for accepting your comment into its database.
The GDPR was specifically sold as limiting the things that well-known US tech companies (Facebook, Google, Twitter, etc.) can do with respect to EU citizens. The sad irony is that only well-resourced tech companies with a small army of lawyers and a large army of programmers can afford to be GDPR compliant.
The sort of unintuitive machinations it takes to maintain honest compliance while providing useful services is kind of mind-blowing. Every bit of it that I've dealt with has left me depressed about what this will do to small companies and innovation. Facebook will have no trouble at all being GDPR compliant, but your average 50-person startup or small-town business hasn't got a chance.
I’m currently doing some GDPR consulting for an American company.
I don’t think the GDPR is as difficult as you suggest. The biggest problem companies seem to struggle with is this idea that the personal data they keep isn’t theirs and they need to protect it like anything else in their possession that isn’t theirs.
Then, there’s also the issue that Americans aren’t used to the idea that a European court might think they’re in their jurisdiction, and they don’t know how to interact with a European court. Treating them as adversaries (as is often done in the USA) doesn’t go well. The courts basically decide if you fucked up and did harm that you could’ve prevented, and not if you were technically against the law.
Are you treating someone’s personal data the way they would want you to?
Really?
If so then you’re probably better than 90% of the way there.
> I don’t think the GDPR is as difficult as you suggest. The biggest problem companies seem to struggle with is this idea that the personal data they keep isn’t theirs and they need to protect it like anything else in their possession that isn’t theirs.
You keep writing things like this, and I'm not going to just post the same reply every time, so let's try another one here.
Let us assume for the sake of debate that Privacy Shield will at some point be struck down by the courts, like Safe Harbor before it, since the fundamental objections involving US government access have not changed.
At that point, please explain the conditions under which an EU business can share PII with a US business without violating the GDPR.
Sure thing, that's described in articles 44-50.
In short:
1) if the EU has declared a country "adequate", you can transfer data (there is a list of adequate countries. Canada is on it, the US with Privacy Shield too)
2) in absence of an adequacy decision, there are other possibilities: binding corporate rules (internal rules for data transfers within multinational companies[1]), contractual arrangements (for example, the EU approved clauses), adherence to a code of conduct with a binding commitment (look at this like some kind of "privacy certification")
3) Finally, if the above are not possible, a transfer is still possible if the subject gives consent after being informed of all risks.
So, for the sake of debate: I would go with either binding corporate rules (in case of a multinational) or contractual arrangements.
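The order of checks in that summary can be written down as a small Python function. This is only a sketch of the three steps as listed in this comment; the function name, parameter names and returned labels are invented here, and it makes no attempt to capture the actual conditions in articles 44-50.

```python
from typing import Optional


def transfer_basis(adequacy_decision: bool,
                   binding_corporate_rules: bool = False,
                   contractual_clauses: bool = False,
                   code_of_conduct: bool = False,
                   informed_consent: bool = False) -> Optional[str]:
    """Return the first applicable basis, checked in the order listed above."""
    if adequacy_decision:                      # step 1
        return "adequacy decision"
    safeguards = [                             # step 2
        (binding_corporate_rules, "binding corporate rules"),
        (contractual_clauses, "contractual arrangements"),
        (code_of_conduct, "code of conduct"),
    ]
    for available, label in safeguards:
        if available:
            return label
    if informed_consent:                       # step 3
        return "informed consent"
    return None                                # no basis found in this sketch
```

In the hypothetical where Privacy Shield is struck down, the call would be something like `transfer_basis(False, contractual_clauses=True)`, which falls through step 1 and lands on the contractual route.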
This is exactly the sort of scaremongering this article was trying to address.
Being compliant is about not abusing personal data. I think small companies will find that much easier to do than bigcorps whose business model is based on privacy invasion, like facebook or google.
I was not joking or sarcastic. They are a regulatory institution. They get paid to look after companies, so that those keep to the law.
Presumably, the workers at this institution do not get paid for how high the fines are that they can push through. Nor does their office get paved with gold floors. So, they have no incentive to swing the biggest stick they have. There's no concrete reason to distrust them here.
Really, part of their job description, for which they do get paid, is probably that they should not fine companies disproportionately. If companies leave the country or even just do less business there, that's probably more damaging to the country than receiving that one-time 4% of turnover.
I think this ignores the reality of who populates these bureaucracies. I worked for the DoD alongside the FBI, NSA and others and the sheer delight that case agents and officers get for "hammering" people/entities under scrutiny cannot be overstated - it's part of the ethos.
It has little to do with how much they get paid. It has everything to do with wielding power backed by a state enforcement entity.
Wonder how much of a secondary consultancy market this is going to create. I can just see "GDPR consultant charging $450/hour with a minimum 2 weeks' worth of work" job listings.
It would be interesting to see what companies do about it. Will any of them pull out of the EU market? Or maybe make users sign away their rights on their next login: "If you use this service, you give us permission to use your data in this and that way which may or may not be GDPR compliant, if you refuse we'll delete your profile and you'll have to find another service to solve your problem".
As a consumer, I like what GDPR is doing. As a developer having to implement compliance for it, it is a huge pain.
> Or maybe make users sign away their rights on their next login
The GDPR explicitly disallows this. You're not allowed to make access to your service conditional on consent because it's considered to be "not true consent".
Expanding on this, official guideline on consent [1] gives 2 relevant examples. TLDR: you cannot offer no, degraded or higher-priced service if consent is not given. I would say this is effectively banning targeted ads as revenue stream. IMO this goes too far.
[Example 1]
A mobile app for photo editing asks its users to have their GPS localisation activated for the use of its services.
The app also tells its users it will use the collected data for behavioural advertising purposes. Neither geo-localisation or online behavioural advertising are necessary for the provision of the photo editing service and go beyond the delivery of the core service provided. Since users cannot use the app without consenting to these purposes, the consent cannot be considered as being freely given.
[Example 6]
A bank asks customers for consent to use their payment details for marketing purposes. This processing activity is not necessary for the performance of the contract with the customer and the delivery of ordinary bank account services. If the customer’s refusal to consent to this processing purpose would lead to the denial of banking services, closure of the bank account, or an increase of the fee, consent cannot be freely given or revoked.
And IMO this is exactly right, and I'm actually surprised people tolerated the current state of things for so long. The examples you quoted are clear-cut cases of abusing the user.
Marketing is cancer on society, and it infected everything deeply. I welcome anything that changes the incentives so that it's less rather than more profitable to try and suck in every piece of data you can get your hands on.
I agree with the spirit of this but can’t help but believe that the Googles and Facebooks of the world will just ignore the intent and get away with it. Their profit margins basically require them to do so.
And to calm the people needing to implement the GDPR in companies: This does not apply, if you actually need the data to provide the service. For example, in an online shop, you obviously need to process the address of your customer in order to ship them whatever they bought. You also need to store their name and such, because law requires that when doing financial transactions. So, if the user does not want to provide you these, you do not need to make the impossible possible.
More interestingly, I wonder how many sites, even with the best of intentions, will just stop selling to EU customers instead of guaranteeing compliance. I know if I was in a startup with very limited funding and customer reach was not my problem, I'd just as soon not sell on the continent. Once I can absorb the risk or I need more customers that'd change of course, but while you have your head down working and maybe you have a limited beta or something, it's clear which citizens carry more baggage with their usership.
Many, probably, and it's completely understandable. But I don't think it's going to be that big of a problem for European consumers. For each site like that, I expect EU-based competitors to pop up and fill the space with services that follow GDPR.
Our company receives sensitive data from external members by email, which is considered insecure. So we need to buy or create our own solution to receive these files encrypted.
Does anyone here have a GitHub repo for such a solution?
Whilst the point about this stuff being well known is sound, I shall just point out that FTPS alone is quite likely (according to circumstances of course) to be not enough. I know of at least one system that already used FTPS for a decade or so that now has to be modified to use GPG encryption as well, for GDPR compliance. Security of data once they have arrived is as important as security of data that are in transit.
Scott Helme has a self-hosted contact form which uses his PGP key to encrypt messages from website visitors, with a link to his 'how to' blog post just below the form: https://scotthelme.co.uk/contact/
As opposed to "Trust us, we're Google/Facebook/Doubleclick/etc"? Not arguing for government necessarily. Just wonder who or what else would be the honest broker?
And there's the whole question of what counts as identifying information. My understanding is that IP addresses do — so an immutable plaintext server log is now illegal.
Privacy is really, really important. But expanding the definition of privacy too far is just too much.