“Do not sell my personal information”, “Reject all cookies”, “Do not show me personalised ads”, “Do not use my Maps activity for improvements”, “Do not store my voice recordings”, “Permanently delete my account”
Do any of these sentences sound familiar to you?
If so, you’ve probably heard about Amazon no longer allowing customers to keep their Amazon Echo voice recordings private [archived version]. This caused quite a stir, and for a good reason. Some people wish to use products and services with at least some privacy. In today’s day and age, retaining your privacy online has become particularly complicated. Every single service out there wants access to your personal information. And what do you get in exchange? Sometimes personalised ads that will try to entice you to buy more shit you don’t need by making sure to show you precisely what you want - and sometimes, products you didn’t even know you wanted. Most of the time, you get nothing at all. Companies run away with your data, sell it to various brokers and providers, benefit from your personal information, and share no profits with you.
The Amazon Echo case is just one of many, the most recent one.
I’m writing this post because I’ve noticed a disturbing trend: Software toggles.
Data that’s never produced, transmitted, or copied remains private
This is a fundamental principle that not many people understand. Even some privacy advocates publish long instruction manuals on changing your Google account settings for better privacy.
This dangerous trend of software toggles probably began with the DNT (Do Not Track) header around 2009. This header was intended to discourage companies (especially AdTech companies) from tracking you across the web. By setting this header, you politely requested that companies respect your privacy. The problems with this implementation were as apparent to me back then as they are to everyone now:
- It directly affected the bottom line of AdTech companies.
- It had zero enforcement.
- Mozilla started it without consulting any corporation.
- AdTech companies that adopted and respected this header gained zero benefits and stood to lose revenue, while there was no downside at all to ignoring it.
DNT works on an honour system: companies face no lawsuits or other business risks if they fail to adopt or comply with it. The DNT header is now deprecated.
Possibly the worst downside is that sending this DNT header indicates to AdTech companies that you are a privacy-conscious user. This can be exploited in two particular ways.
- By telling companies you care about privacy, they now know to target you with privacy offerings like VPNs, Monero wallets, or open source products and services.
- This also means your personal information is significantly more valuable than the average user’s because you take measures to protect your privacy. Therefore, basic profiling might not work on you, so AdTech companies would have to scrutinise your behaviour and activity even harder to build a personal profile from your information. This is the opposite of what you want if you set this header!
Fast-forward to today: we’re inundated with privacy choices, privacy settings, subsections, pages, and dialogues. It would seem as if Google or Facebook cared about your privacy. “Do you want us to use cookies? Or would you rather not have us profit from selling your personal information?”. And sure enough, regulations like the GDPR have codified certain privacy practices and controls into law, but the problems of 2009 remain.
Think about it: when you tell Google or Amazon, via a checkbox on their website, not to build a personalised profile from your personal information, that preference is just a value stored in a database. But what happens if they were simply to… ignore it? How can you prove they ignored your privacy request and targeted you with personalised ads anyway?
This thought experiment can be repeated for every single sentence at the top of the article. When you request a company to “permanently delete” your account, can they prove your data was deleted permanently? Even assuming they had the best intentions and deleted the data from their database, did they also go through their database backups to scrub your PII (Personally Identifiable Information)? Did they go through their cold storage backups on magnetic tape? Did they remove your details from every single web server log, every single web server trace, and every reference to your details from any purchase and interaction you’ve ever had with them?
The answer is very obviously no. You have to trust them. And in doing so, you lose agency and accountability.
Soft deleting information
There’s a concept in software engineering called “soft-deletes”.
For example, GORM (an Object-Relational Mapping layer for Go) implements a feature by which deleting an item does not issue a `DELETE` command on the target table. Instead, GORM adds a `DeletedAt` field, which gets a timestamp assigned when the item is deleted. In this way, the object no longer shows in the web application that is using GORM, but the data persists in the SQL table. At this point, even if you were to try to query a deleted user or ask the application for something you requested to be deleted, you’d be forgiven for thinking it was indeed permanently deleted. However, the SQL database backing this application would still have all this information intact.
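GORM’s behaviour can be approximated in a few lines of plain Go. This is a hand-rolled sketch of the same pattern (not GORM’s actual implementation, and the record fields are invented):

```go
package main

import (
	"fmt"
	"time"
)

// Record mimics GORM's convention: a nil DeletedAt means "live",
// a timestamp means "soft-deleted" but still physically present.
type Record struct {
	ID        int
	Email     string
	DeletedAt *time.Time
}

type Store struct{ rows []*Record }

// Delete never removes the row; it only stamps DeletedAt.
func (s *Store) Delete(id int) {
	for _, r := range s.rows {
		if r.ID == id {
			now := time.Now()
			r.DeletedAt = &now
		}
	}
}

// List is what the application sees: soft-deleted rows are filtered out.
func (s *Store) List() []*Record {
	var live []*Record
	for _, r := range s.rows {
		if r.DeletedAt == nil {
			live = append(live, r)
		}
	}
	return live
}

func main() {
	s := &Store{rows: []*Record{{ID: 1, Email: "alice@example.com"}}}
	s.Delete(1)
	fmt.Println("visible records:", len(s.List())) // the user sees 0...
	fmt.Println("stored records:", len(s.rows))    // ...but the row is still there
}
```

From the outside, the application behaves exactly as if the data were gone; only someone with database access can tell the difference.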
It’s not just GORM; it is standard practice to implement soft deletes similarly, for many reasons. In many countries, companies and organisations are obligated to keep customer records for a given length of time because of fraud and sometimes national security reasons. For example:
- US: Financial regulations require American companies to retain customer records for at least 6 years.
- Germany: Financial and tax regulations require retaining customer data for at least 8 years.
- France: Companies must keep staff records for at least 50 years.
- India: Requires telecommunication companies to maintain complaints and reports for at least 3 years.
You can simultaneously satisfy privacy and anti-money laundering regulations by implementing a soft-delete mechanism.
Information deletion is impossible to prove
Yes, you can go to your file browser right now, right-click any file, and choose “delete.” And yes, I know it will disappear from view instantly. But was it actually deleted? Not so fast!
Computers generally operate on a few basic Information Theory principles:
- Information is measurable and has a quantifiable size (generally speaking, information is measured in bits - more commonly, bytes or megabytes)
- There’s a limit to lossless compression (perfectly random data cannot be compressed, because lossless compression relies on finding patterns and re-encoding the data in a way that can be rebuilt into a perfect copy of the original information)
- Perfect communication can be achieved over imperfect channels (in other words, you can send perfect 1:1 replicas of any piece of information by making use of redundancy, checksums and other control mechanisms)
- Digital information can be replicated in a way that the copy is indistinguishable from the original (this principle is heavily used in computer networks to transfer information between two machines)
- Redundancy is the key to reliability (since digital information is perfect, but storage mediums are vulnerable to hardware failure, redundancy can extend the lifespan of digital information)
- Data deletion is impossible to prove, because “deletion” does not exist at the physical level. There are only two operations on physical media: reading and writing.
This last point is crucial, and not many people realise it. Even many highly technical people operate under the assumption that when a computer is told to delete some information, the information is irrecoverable, but that’s not true at all. Even assuming there are no backups or copies of this data, software operates on a different layer than hardware.
This is why standards exist for data deletion practices. Since it is impossible to prove that any computer has actually deleted data, several alternatives exist:
- Physical drive destruction: This method is typically employed by companies handling highly sensitive records like trade secrets. By physically destroying the disk, the company loses the cost of the hardware but ensures the data is irrecoverable. This destruction can be done either by shredding the hardware into pieces or, in the case of magnetic media, by applying a strong magnetic field that permanently alters the information stored.
- Encryption header overwriting: Most encryption schemes store an encryption header alongside the encrypted content. The password encrypts only the few bytes of key material in this header, which is what allows you to change the password without re-encrypting the entire data region. By repeatedly overwriting this critical header region with random data, it is possible to make the encrypted area permanently unrecoverable. However, if a sufficiently weak encryption algorithm was used, the data may still be recoverable by attacking the cipher directly.
- Firmware-level secure erase: Modern disks and flash storage with controllers can be wiped by destroying the controller’s internal mapping storage. This does not destroy the disk contents, but it permanently destroys the ordering of the data stored inside, and it is much faster than overwriting several terabytes of disk data. Since files of several megabytes are chopped into small, unsorted blocks, rearranging them in a logical order is almost impossible without knowing what they contain. At that point, data recovery isn’t feasible.
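The encryption-header approach can be demonstrated in miniature: encrypt data, then destroy only the key, and the ciphertext becomes noise. A sketch using Go’s standard AES-GCM (the function name and plaintext are invented for illustration):

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
)

// recoverableAfterKeyLoss encrypts plaintext with a random key, zeroes
// the key (a stand-in for the header's key material), and reports
// whether the ciphertext can still be decrypted. The ciphertext itself
// is never touched: destroying 32 bytes of key is enough.
func recoverableAfterKeyLoss(plaintext []byte) bool {
	key := make([]byte, 32)
	rand.Read(key)

	block, _ := aes.NewCipher(key)
	gcm, _ := cipher.NewGCM(block)
	nonce := make([]byte, gcm.NonceSize())
	rand.Read(nonce)
	ciphertext := gcm.Seal(nil, nonce, plaintext, nil)

	// "Delete" the data by overwriting the key material with zeroes.
	for i := range key {
		key[i] = 0
	}

	block2, _ := aes.NewCipher(key)
	gcm2, _ := cipher.NewGCM(block2)
	_, err := gcm2.Open(nil, nonce, ciphertext, nil)
	return err == nil
}

func main() {
	fmt.Println("recoverable after key destruction:",
		recoverableAfterKeyLoss([]byte("customer record")))
}
```

This is why crypto-erasure is attractive at scale: the terabytes of ciphertext can live on in backups, yet overwriting a few bytes of key material renders all of it unreadable at once.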
Software toggles are lying to you
There are thousands of guides on tuning your Google, Facebook, or Amazon account settings to ensure they retain the least information about you. There are scripts for web browsers that automatically click “reject cookies” on all cookie consent prompts.
I find this all a waste of time.
The only cookie that is rejected is the cookie that is never saved to the web browser. In some cases, cookies are required for website functionality, such as signing in or making payments.
Firefox has integrated privacy protections [archived version] that prevent third-party cookies from being stored in your browser. This is effective, because your browser is the ultimate cookie consent mechanism: it can simply refuse to store third-party cookies, or expire them after some time, as Safari does [archived version].
This is how privacy protections should be: enforced at the user software level, not at the company’s web server level.
Use extensions such as:
- uBlock Origin
- Ghostery
- AdNauseam - Note: Fork of uBlock Origin, do not use it simultaneously with other ad blockers.
Then there’s Pi-hole, which is more than just an extension. You can install it on a Raspberry Pi, and it blocks ads anywhere on your home network, even on your Smart TV or phone, automatically.
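Pi-hole’s core mechanism is simple: it answers DNS queries for blocklisted domains with a dead address, so no device needs per-app settings. A toy version of just the matching logic, assuming a blocklist that should also cover subdomains (the example entries are made up):

```go
package main

import (
	"fmt"
	"strings"
)

// blocked reports whether a queried name matches a blocklist entry,
// including subdomains (ads.tracker.example matches tracker.example).
func blocked(blocklist map[string]bool, name string) bool {
	name = strings.TrimSuffix(strings.ToLower(name), ".")
	for name != "" {
		if blocklist[name] {
			return true
		}
		// Strip the leftmost label and try the parent domain.
		i := strings.IndexByte(name, '.')
		if i < 0 {
			break
		}
		name = name[i+1:]
	}
	return false
}

func main() {
	blocklist := map[string]bool{"tracker.example": true} // example entry
	fmt.Println(blocked(blocklist, "ads.tracker.example")) // true
	fmt.Println(blocked(blocklist, "example.org"))         // false
}
```

Because the check happens at the network’s DNS resolver, it applies to every device equally, which is the same enforced-on-your-side philosophy as refusing cookies in the browser.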
Or get a private browser such as:
Ultimately, you are solely responsible for your own privacy and security. Many organisations are happy to help you educate yourself and embrace a more private life. See:
Conclusion
Do not trust software to do what you want. Enforce it. Minimize information leakage.
Interested in privacy? Read Operational Security: Staying safe online.