Web3 101: Part II - Your Data on Web2
Jul 28, 2021
Last updated on Jul 28, 2021
In our last Web3 101 article, The Centralized Web, we discussed how the web became largely owned and controlled by just a few entities very early on. What was supposed to be a free and open platform has become a centralized breeding ground for data collection.
Facebook is collecting our personal info, Google knows us better than we know ourselves, ads are following us around, yada yada. Today, these stories are so common that they are becoming background noise - a constant hum behind the digital screeching of the internet.
As familiar as we are becoming with the idea of big tech owning our data, the jury is still out on how worried we should be about it. We know something is off, but it’s hard to pinpoint the source of the data problem. And when we talk about our data, what are we even referring to anyway?
To answer all of this, let’s dive into the data-fueled world of Web2.
Web2 and the Data Problem
The current version of the web, Web2, is commonly referred to as the “read-write web,” stemming from users being able to add, or write, their own content. This is in contrast to the old Web1 days when a very small percentage of people were creating the web’s content, and the vast majority of users were simply reading it.
The ability to add your own content to the web ushered in a massive shift. Starting with sites like GeoCities in the mid 90s, the web quickly transformed from a digital library to a virtual town center. Users formed a two-way relationship with the web, where interacting with web pages and other users became possible.
The web was no longer just about finding information, it was about sharing ideas, expressing yourself, and connecting with like-minded individuals. Naturally, this desire to connect and communicate gave way to social media networks, which would come to rule the web of today. These are sites like Six Degrees, Friendster, MySpace, and eventually, Facebook.
Facebook began its life in 2004 as a social networking site made for universities. You know the rest of the story. Facebook usage skyrocketed, literally everyone and their mother joined, and by 2012 the site had hit one billion monthly users. Today, in 2021, that number is closer to three billion.
Facebook is free to use, but in 2020, the social network generated about $86 billion in revenue. How? By leveraging the data of its users to sell targeted advertising. And just to be clear, Facebook is absolutely not alone in this practice. There is an entire data broker industry, made up of companies you’ve never even heard of. They all deal in the business of collecting and organizing our data to sell to advertisers.
In fact, this buying and selling of personal data is the default business model of today’s web.
The Value of Data
So, what is this data and why is it so valuable?
When we talk about your data, we’re talking about... everything. Name, date of birth, location, contact info — these we give to social media sites and apps willingly. Then our preferences, likes, dislikes, and habits are compiled from our online activity using cookies — of course, we just click “Accept” on those popups. Even our facial features and voices are collected — ever used a Snapchat or Instagram filter?
Past these basics, data around the types of content we create, how that content is perceived, and how we interact with others is harvested and added to the mix. All that hard work you’ve put into building a following and creating a brand online? While that could be seen as valuable intellectual property, it too is vacuumed up and resold.
Nothing is sacred here.
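The cookie tracking mentioned above is mechanically simple. Here’s a minimal sketch of the idea: the first response a site sends sets a unique ID cookie, and every later request carrying that cookie is logged against the same profile. All names here (`tracker_id`, `handle_request`, the pages) are illustrative, not any real site’s API.

```python
import uuid

# Toy model of cookie-based tracking: one profile per cookie ID.
profiles = {}  # cookie ID -> list of pages visited

def handle_request(cookies, page):
    """Record the visit and return the cookie jar to send back."""
    visitor_id = cookies.get("tracker_id")
    if visitor_id is None:
        visitor_id = str(uuid.uuid4())  # first visit: mint a new ID
    profiles.setdefault(visitor_id, []).append(page)
    return {"tracker_id": visitor_id}

# A browser that keeps cookies between visits is recognized every time,
# and its browsing history quietly accumulates server-side.
jar = {}
for page in ["/home", "/shoes", "/checkout"]:
    jar = handle_request(jar, page)

(visitor,) = profiles
print(profiles[visitor])  # → ['/home', '/shoes', '/checkout']
```

Real trackers are embedded on thousands of unrelated sites, so the same ID stitches your activity together across the whole web, not just one domain.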
Throw all of this data into an artificial intelligence algorithm, which continues tracking and compiling your every move, and there you have it. AI connects all of the dots and spits out neatly packaged models of internet users. These models are then sold to advertisers and used to target the ideal audience members for countless products and services.
For the record, we’re not sounding the alarm on some conspiracy. To quote Mr. Corleone, it’s strictly business. Early websites struggled to make money, and subscription services and paywalls would have hindered the growth of social networks. Fewer users, less value. The answer became offering a free-to-use product, making user adoption as frictionless as possible, and then monetizing the attention of all of those users by selling their data.
Today, this data economy is worth trillions of dollars and fuels the majority of the web. That’s trillions of dollars worth of user data, straight from users like you and me.
If this is our data, that raises the question - why don’t we own it?
Who Owns My Data?
To be honest, this is where things start getting messy. Current laws protecting our personal data on the web are in their infancy. And the ones that do exist are still rough around the edges. Because of this lack of regulation, personal data harvesting is pervasive on the web today.
So, do you own your data? Not exactly. In truth, nobody does. These platforms and advertisers may be buying and selling access to our data, but that doesn’t mean they own the sole rights to it.
And that’s exactly the problem - we have yet to reach consensus on how our personal data should be governed on the web. It’s a new issue that we haven’t confronted in any meaningful way.
However, some steps are being taken. From state level legislation in the US to the European Union’s GDPR, let’s look at the personal data laws that are paving the way.
The GDPR
The GDPR, or General Data Protection Regulation, is a set of rules that aims to give citizens control of their personal data, and went into effect on May 25, 2018. Although technically a European regulation, the GDPR affects any company or organization that stores or processes the data of European residents. Because of this, the legislation affects many companies in the US and elsewhere just the same as it does EU companies. Failing to comply with the GDPR can result in some very costly fines — up to €20 million (roughly $24 million) or 4% of a business’s global annual revenue from the previous fiscal year, whichever is higher.
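That “whichever is higher” rule means the fine ceiling scales with company size. A quick sketch of the upper-tier cap (the cap figures come from the regulation; the revenue numbers below are made up for illustration):

```python
def gdpr_max_fine(global_annual_revenue_eur: float) -> float:
    """Upper-tier GDPR cap: EUR 20 million or 4% of global annual
    revenue from the previous fiscal year, whichever is higher."""
    return max(20_000_000, 0.04 * global_annual_revenue_eur)

# For a small firm, the flat EUR 20M cap dominates...
print(gdpr_max_fine(50_000_000))      # → 20000000
# ...while for a tech giant, the 4% term takes over.
print(gdpr_max_fine(80_000_000_000))  # → 3200000000.0
```

So for the very largest platforms, the exposure runs into the billions of euros, which is exactly why the regulation has their attention.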
Taken straight from the GDPR website, its objectives are outlined as follows:
- This Regulation lays down rules relating to the protection of natural persons with regard to the processing of personal data and rules relating to the free movement of personal data.
- This Regulation protects fundamental rights and freedoms of natural persons and in particular their right to the protection of personal data.
- The free movement of personal data within the Union shall be neither restricted nor prohibited for reasons connected with the protection of natural persons with regard to the processing of personal data.
Yeah, it’s a little wordy - here’s the TL;DR. Under the GDPR, businesses must provide sufficient reasoning for why they need to collect a user's data, and explain what they are going to do with it. This goes as far as requiring them to list all of the other businesses to which they will be providing or selling user data. The GDPR also requires that businesses provide or delete a user’s data upon request, and that they have the staff to properly handle all of this data-related work.
So, what effect has the GDPR had?
The most visible effect has been the wave of cookie consent popups that now greet you on nearly every site. Technically, you are supposed to be able to reject or opt out of non-essential cookies. Some sites make this easy by simply putting a “Reject Cookies” button next to the “Accept Cookies” button. Others, though, bury the opt-out on another page and make the process of finding it painful enough that most of us won’t even go looking.
Other businesses take a different path to subtly sidestep the GDPR guidelines. Within the GDPR there is a basis called legitimate interests. A business may consider the collection of certain data necessary, meaning users cannot opt out, if it falls under that business’s legitimate interests. According to the GDPR, legitimate interests are “particularly flexible” and “may be applicable in a wide range of different situations.”
By labeling the collection of certain data as a legitimate interest, a business can effectively bypass the GDPR’s other bases.
Of course, this has to be done within reason, and there are consequences for taking this too far. But, just how flexible is the basis of legitimate interests? As an example of how stretchy this can be, Facebook cites “providing an innovative, personalized, safe, and profitable service” as one of their legitimate interests. So, it’s hazy.
State Level Laws
Zooming into the USA, there are currently no comprehensive data protection laws at the federal level. However, there are a few state level laws that mirror much of the language found in the GDPR.
The most comprehensive state level data law is California’s CCPA, or California Consumer Privacy Act. The CCPA, which went into effect on January 1, 2020, seeks to give California consumers more control over their personal information collected by for-profit businesses. Similar to the GDPR, the CCPA requires that these businesses disclose what data they will be collecting and why. The law also gives users the right to request a copy of their data, as well as request that their data be corrected or deleted altogether. Unlike the GDPR, though, the CCPA does not require user consent. Instead, it allows businesses to collect data by default, and gives users the option to opt out.
Businesses that fail to comply with the CCPA can also expect some hefty fines — up to $2,500 per unintentional violation, or $7,500 per intentional violation. At first glance, those numbers may not seem like much. But, when they say violation, that means per person. So, if 10,000 people visited your site within a time frame where you were found to be violating the CCPA, that’s 10,000 violations. Slap on a $2,500 fine for each of those users, and you’re looking at a $25 million fine. As they say in DeFi, rekt.
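The per-person math compounds quickly, which is the whole point. A sketch of the calculation, using the hypothetical 10,000-visitor scenario above (the penalty amounts are the statutory caps; everything else is illustrative):

```python
def ccpa_fine(num_affected_users: int, intentional: bool = False) -> int:
    """CCPA penalties are assessed per violation - that is, per affected
    consumer: up to $2,500 unintentional, $7,500 intentional."""
    per_violation = 7_500 if intentional else 2_500
    return num_affected_users * per_violation

print(ccpa_fine(10_000))                    # → 25000000   ($25M)
print(ccpa_fine(10_000, intentional=True))  # → 75000000   ($75M)
```

Note how an intentional finding triples the exposure: scale plus intent is what turns a compliance slip into an existential fine.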
Traveling east, we have Colorado’s CPA, or Colorado Privacy Act, which goes into effect on July 1, 2023. There is also Virginia’s CDPA, or Consumer Data Protection Act, which goes live on January 1, 2023. Both of these are largely similar to the CCPA, but differ in a few key areas, mostly the size of the fines and how each law defines personally identifiable information. Still, the goal of giving the user rights and ownership over their data is front and center.
There’s no doubt that the GDPR and these assorted state level laws are a step in the right direction, but we still have a lot of work to do. There is far too much ambiguity here to consider these laws solutions to the data problem.
Searching for Solutions
Much like the centralization of the web that we discussed in our last Web3 101 article, the collection and selling of our data is infecting the internet. This combination of centralization and data collection has created a system that extracts from users, and incentivizes stealing attention over providing value. The reliance on data harvesting has thrown privacy out the window, along with the user’s ability to do anything about it. Even as the web has become a vital part of everyone’s lives, the antiquated ad revenue structure underpinning the entire system has gone largely unchanged.
This is not just unsustainable, it is dangerous.
But as these legacy systems work to maintain their hold on the web, a paradigm shift is gaining momentum in the background. We are, of course, talking about Web3.
Web3 may very well hold the keys to decentralizing the web, making it the free and open platform it was always meant to be. It holds the promise of spreading the immense wealth captured by the data economy back to those who actually create it. It may be able to give us back not only our data, but our identities.
In our next Web3 101 article, we’ll find out how.