Data Creators Should Share in the Profits From Big Data

This post is part of CoinDesk’s 2019 Year in Review, a collection of 100 op-eds, interviews and takes on the state of blockchain and the world. Alex McDougall is the co-founder of Bicameral Ventures, a venture capital firm focused on blockchain, interoperability, data and identity self-sovereignty, personalized AI, and Web 3.0.

Striking it rich. The phrase derives from natural resources exploitation. Prospectors would strike an easy patch of oil, extract it from the ground, sell it and get rich. Sometimes it was hyper-focused geological engineering that made the strike, sometimes the speculator stumbled upon it in a TV hillbilly swamp. But the story remains the same: someone finds a resource, accesses it, and sells it.

We’ve had a checkered history with resources compensation dating back to the colonial era, but in the modern capitalist era we’ve been relatively good about acknowledging who owns the land oil is found on and assigning some level of compensation to them for the use of “their” resource. Not so with personal data. “The new oil” is worth billions of dollars, but we haven’t worked out how to extract, refine, sell and establish its value. 

We’ve been pretty good about compensating “owners” of oil – the generic, fungible, naturally-made, public good resource that the “owner” did nothing to actually make. But somehow we’re atrocious at compensating the owners of the highly personalized data that they exclusively created. We don’t tell data creators that we’re “mining” them, and then we manipulate their subconscious biases via engagement algorithms to get them to take actions that make their data even more valuable. It’s as if an oil company came to your oil-filled swamp and, instead of paying you for the rights to extract said oil, convinced you that if you extracted, refined, packaged and left the oil on your front doorstep, five million strangers might like you.

It’s no secret that the most valuable companies of today are driven by data and artificial intelligence (Google, Facebook, Amazon, and so on). It’s also no surprise that large corporations have been the first to realize the value of the resource and have been extremely efficient in harnessing and monetizing it. Nor is it surprising that they have generally tried to keep as much of the value chain as possible for themselves.

We exclude data creators so completely that we don’t even have a model for how we could do better. Oil companies throughout history would happily have excluded “owners” from the value chain if they could, but they haven’t been able to, because land ownership is ingrained in us and because five oil rigs showing up on your property is something people tend to notice.

Who owns the searches you make on Google Chrome? Who owns the words you type into Gmail? There are no rigs, no trucks, no smoke, just opaque terms, and a business model average users find confusing (“everything is free you say?”), and an incredible user experience that happens to deliver creepily targeted ads to you on random webpages. Worse, even if your physical oil is extracted without compensation, a drop of it doesn’t contain your medical records, credit card information, or where your kid goes to school. 2019 saw yet another increase in data breaches (data spills?) partially because of the disparity between how different parties value data and the antiquated methodologies used to protect it.

With ongoing privacy breaches and growing dissatisfaction with how privacy and data are handled, we’re finally starting to demand better of our platforms. Altruism and “doing what’s right” is one way to solve this extraction problem, regulation and establishing penalties for egregious data policies is another, but in our current system the lasting solutions are revenue and profit driven.

Luckily, our data is far more valuable when we share it willingly. Big data’s value is a byproduct of bad data. Extracted data is often bad data and you need sufficient bad data to scrub out the signal from the noise and determine what is actually a valuable insight underneath all the byproduct. 

Shared data is timelier, more accurate, more relevant and more ethical. While shared data comes with its own incentive issues, the difference in quality between a platform and insights built on shared data and those built on extracted data is night and day. What is the best way to get us to share it willingly? Help us understand why it’s valuable, who it’s valuable to and how we can make it more valuable, and then let us share in that value in a simple, tangible way. Ideally all of this happens via platforms we already use and behaviours we already engage in on a day-to-day basis, and value is shared back to us in tangible, creative ways.

In 2020, let’s see commuters earning mobility credits towards free metro rides because they’ve agreed to work with a platform to organize mobility preferences en masse to help municipalities, wellness groups, favorite chain restaurants, scooter companies, real estate investment firms, car manufacturers and other mobility stakeholders understand how commuters are moving around a city. This type of shared data-centric optimization is an exciting potential evolution of what Velocia, a Bicameral portfolio project, has recently launched: a mobility rewards platform in Miami.

Next year, let’s see closed loop platforms start experimenting with open platform ideas like “data-portability” whereby you can use reputation and experience generated on one platform to gain status or better user experience on another platform. For example, we should be able to port driver ratings between Uber and Lyft, or order histories between food delivery apps. This doesn’t even need to be driven by the platform but can begin to be built by third party tool developers. The Open Application Network, another Bicameral portfolio project, has been working to create and popularize these types of tools.

At Bicameral, we’re exploring brand new models whereby we partner with emerging market ISPs and households to leverage valuable shared data profiles to improve the economics of laying expensive fiber broadband directly to the home. By giving households next generation routers and IoT smart-hubs, we can help data creators organize and optimize their 360-degree home data profiles simply by continuing to browse, research, watch, buy and take the digital actions they already do. These profiles can be shared with a transparent insights and analytics platform both to earn discounts and dollars and to increase the quality of data-driven interactions with consumer brands, educational tools, market researchers and government agencies, and even to protect you from manipulative third-party algorithms and bots.

Let’s be frank, it’s not going to be a data paradise overnight. 2020 isn’t going to be the year we see a world of complete transparency into our data supply chain and we’re not going to be able to pinpoint the exact unit of value we’re getting for the exact piece of meta-data we’re sharing with an exact brand. But we can take meaningful steps forward to change the mindsets of transparency and value sharing. We can create models that leverage our existing hardware, software and consumer behavior to start bringing data creators into the value chain and we can start proving out the model that shared data is better than extracted data.

We can start proving the hypothesis that if you compensate us and help us understand how our data is valuable, we’ll opt in willingly and you don’t have to trick us. Maybe 2020 will even be the year some of us start turning off data feeds or deleting extractive apps that offer us nothing in return. We may not end up millionaires like the hillbillies, but I’m confident that 2020 will be the year the tide changes and we as data creators start taking our rightful place in the flow.

Disclosure

The leader in blockchain news, CoinDesk is a media outlet that strives for the highest journalistic standards and abides by a strict set of editorial policies. CoinDesk is an independent operating subsidiary of Digital Currency Group, which invests in cryptocurrencies and blockchain startups.