At the recent Consumer Insights and Analytics in Banking conference (which was very good, BTW), no fewer than 5 speakers made it a point to tell the audience:
“<Fill-in-a-number> of <fill-in-a-data-measurement-metric> are created every <fill-in-a-time-period>.”
Needless to say, there was no consistency in the metric. Two petabytes a day, five terabytes a week: the numbers were all over the map.
Sadly, this is what it has come down to: A made-up statistic is used to justify a made-up management concept.
I think I’m safe in assuming that at business conferences everywhere, across industries, speakers are making that same claim about the amount of data being created every day.
It’s sad when you realize what uncritical, unthinking sloths we’ve become to allow this to happen. The claim falls apart for two reasons:
1) The claim has no emotional impact on people. I’m willing to bet that the conference speakers who throw out this new-data-every-day stat do so in an attempt to impress upon the audience the vast quantity of new data being created every day. Just one problem: 99.9% of us have no clue what a terabyte, petabyte, zettabyte or yottabyte is. (And why wouldn’t they go in alphabetical order when naming this stuff? What kind of idiots are in charge of this?)
If you tell us that something weighs a ton, we get it. A ton is 2,000 pounds. We typically weigh anywhere from 100 to 300 pounds. There’s a sense of context and relativity.
What the hell is a terabyte or petabyte? Is that a lot? I bet most of us have no idea how many gigabytes of information we have on our hard disks. There’s no sense of context or relativity when it comes to measures of data for the vast majority of us.
So when a conference speaker proclaims “We create 2 petabytes every day!” it means nothing — absolutely fricking nothing — to pretty much everybody out there.
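For anyone who does want a sense of scale, here’s a quick Python sketch of the arithmetic. It assumes decimal (SI) units, where each prefix is 1,000 times the one before it, and it uses a hypothetical 500-gigabyte laptop disk as the yardstick. The function names are mine, made up for illustration.

```python
# SI (decimal) prefixes: each step up is a factor of 1,000.
UNITS = ["bytes", "kilobytes", "megabytes", "gigabytes", "terabytes",
         "petabytes", "exabytes", "zettabytes", "yottabytes"]


def to_bytes(amount, unit):
    """Convert an amount in the given unit to a raw byte count."""
    return amount * 1000 ** UNITS.index(unit)


def humanize(num_bytes):
    """Express a byte count in the largest unit that keeps the value >= 1."""
    value = float(num_bytes)
    for unit in UNITS:
        # Stop once the value fits in this unit, or we run out of units.
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:g} {unit}"
        value /= 1000


# Assumption: a 500 GB laptop disk, chosen only to give a sense of scale.
LAPTOP_DISK_BYTES = to_bytes(500, "gigabytes")

two_petabytes = to_bytes(2, "petabytes")
print(two_petabytes // LAPTOP_DISK_BYTES)  # how many such disks 2 PB would fill
print(humanize(2.5e18))                    # 2.5 quintillion bytes, in friendlier units
```

Under those assumptions, 2 petabytes works out to 4,000 of those laptop disks, and 2.5 quintillion bytes is 2.5 exabytes. Whether that makes the number feel any more real is another matter.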
2) There’s no way anybody can estimate how much data is created. In the movie The Usual Suspects (one of my favorites), Verbal Kint says “The greatest trick the Devil ever pulled was convincing the world he didn’t exist. And like that, poof. He’s gone.” The greatest trick the consulting devils ever pulled was convincing the world that 2.5 quintillion bytes of data are created every day. And like that, poof. It’s true.
There are three unknowable things in life: 1) How we got here; 2) How much data is created on a daily basis; and 3) What my wife will spend our money on today.
I shouldn’t have to explain this to you. There is simply NO WAY IN HELL anybody can even begin to estimate the amount of data that is created. The reasons start with the explanation that “data” in and of itself is not a commonly defined concept.
But debunking the claims of the amount of data that is created every day doesn’t get us around the problem here: Conference speakers need a statistic they can throw out there to impress the audience and make Big Data appear to be something that it’s not (i.e., real).
So I would like to propose a standard here: Can we all agree that from here on out we claim that “42 bigadatabytes of data are created every day”?
Why 42? As my friend @GeoffIDC tweeted “42, right? It’s always 42.” Yep. 42 it is.
Big thanks and apologies to @gilesnelson and @GeoffIDC who kind of came up with the concept for this post. I have no problem stealing ideas from other people. But I do want to give them credit for it.