So long and thanks for all the fish

29 November 2012

by Joe Lewis, Research Manager, BARB

"Space is big. You just won't believe how vastly, hugely, mind-bogglingly big it is. I mean, you may think it's a long way down the road to the chemist's, but that's just peanuts to space."

Douglas Adams, The Hitchhiker's Guide to the Galaxy

As Douglas Adams quite nicely put it, space is big. Incomprehensibly so in fact. Distances and concepts previously thought of as big are mere specks of dust in comparison to space.

Now some people would have you believe that traditional consumer surveys are mere specks of dust in contrast to Big Data. But what is Big Data? Everyone’s got it, it seems. But where does it come from, what does it say, and what exactly should we be doing with it?

Now, working at BARB, I’m familiar with the term Big Data and with its obvious benefits and potential. It’s not a new concept; it has been around for some years. Circulation figures, sales data and general election polling returns are all forms of it. In times gone by, I worked for the Office for National Statistics and used Big Data from national insurance returns to calculate household income statistics for the Treasury. Believe me, that was a big dataset.

Big Data is not something that has suddenly appeared, although the road to get to it has now been nicely paved. I refer, of course, to digital consumption and access.

The continued proliferation of IP-delivered services now offers a wealth of real-time census information on purchase and consumption points. In a media measurement world, this means that it is possible to see the actual number of devices accessing content and what was delivered. So there we have it, no more 95% confidence intervals - goodbye sampling error, we don’t need you anymore.

Given all this great census-based information, surely we don’t need surveys then? Well, of course, if your census information explains all avenues of consumption, and you know and understand who it is that is consuming them, then no, you wouldn’t. But it doesn’t, and here’s where we at BARB must put a bit of a reality check in place.

The majority of television viewing comes from over-the-air transmissions with no census information or return path datastream available. This consumption is completely anonymous to broadcaster and advertiser alike. If you go home, switch the TV on and watch something on Freeview, no one will automatically know. And no one will know what it was you actually watched.

Some will argue that existing forms of television measurement are in danger of becoming out of touch with the real world. But it’s not that BARB hasn’t joined the real world. We very much work in the real world, balancing the hard realities that need to be faced on both sides of the argument.

Firstly, let’s look at the Olympics. The BBC put on a very impressive offering, both via the television set and via its online services. The Olympics drove a bumper month for BBC Online with traffic to its services at levels never seen before. I’m sure most of us reading this enjoyed the great services being delivered. I know I did.

The BBC reported that some 106 million requests for online Olympics coverage were made during the Games. That is a very impressive amount of viewing in only two weeks and is based on actual data received back to the BBC.

In terms of measurement, this is clearly an example of Big Data possibilities, and a very exciting one at that. But what about television? How does that compare with TV viewing as a whole? Well, at BARB we don’t traditionally use terms such as requests, although an equivalent would be to aggregate the 1-minute reach of each and every BBC Olympics programme event.

If we did this over the same period, what would we get as an equivalent to requests? The answer is that, during the Olympics, BARB recorded 2.3 billion views of BBC Olympics programmes on the television set. So in this case it seems that census-based measurement captured less than 5% of all viewings.
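For anyone who wants to check the sum, here is a minimal sketch of the arithmetic using only the two figures quoted above. It assumes, as the comparison itself does, that an online request and a television view can be treated as roughly equivalent units; the variable names are purely illustrative.

```python
# Figures as quoted in the article; both are rounded, reported values.
online_requests = 106_000_000   # BBC online Olympics requests (census data)
tv_views = 2_300_000_000        # BARB views to the television set (survey data)

# Online requests expressed as a proportion of television views.
ratio_to_tv = online_requests / tv_views

# Online requests as a share of all viewing, assuming the two figures
# can simply be added together (an assumption, not a BARB method).
share_of_total = online_requests / (online_requests + tv_views)

print(f"Online relative to TV views:   {ratio_to_tv:.1%}")
print(f"Online share of total viewing: {share_of_total:.1%}")
```

Whichever way the comparison is framed, the online figure comes out below 5%.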

This example shows the clear disconnect between total universe consumption and that explained by census IP-based measurement (aka Big Data). Until content is delivered nearly universally via this method, there will always be limitations on what is being measured, no matter how granular and impressive that measurement is. As an industry we need to be aware of this and clear on the substance. Otherwise we are in danger of both overstating the significance and, more disturbingly, misinterpreting the insight.

Secondly, we need to know who was watching. This seems like an obvious question, but is it being asked? Returning to the BBC Olympics, who made these 106 million requests? What was their age, their social grouping, do they have children, what ethnic background are they from? Do we indeed care? Of course we care and the BBC Audience Measurement team go to great lengths with a variety of additional bespoke surveys to find out just that.

Simply put, we will never get this level of information from the raw census data itself. Goodbye reach and frequency.

So, am I some sort of Luddite, standing in the way of progress? Perhaps I am like Arthur Dent, Douglas Adams’ reluctant hero who was ignorant of mankind’s impending doom? Maybe Big Data advocates are like the dolphins who heeded the warnings, moved with the times and left the rest of us behind with the immortal words “so long, and thanks for all the fish”?

Well, pragmatism is a better description of our approach. I am as keen as the next person on the benefits of Big Data; it interests me greatly and I relish working with it. Yet I can’t help but wince as proponents claim it is the answer to all and everything for media measurement. It’s simply not the case and, given the scale of industry investment which is made off the back of BARB numbers, it’s critical we don’t get this wrong.

But what of sample surveys, I hear you ask? Knowing the demographics of a sample is all well and good, but where does that get you when you’re trying to measure consumption in a niche and fragmented world? After all, isn’t it the case that BARB measurement is already littered with zero rating minutes for channels big and small, and that this will only be exacerbated as more services and delivery mechanisms come on offer?

These issues are significant for us at BARB and ones that we are addressing. But it’s not simply a case of replacing one form of measurement with another. There is a third way and it’s one BARB has been working towards for some time.

It is clear that samples would need to increase significantly in order to accurately capture and report all forms of media consumption in the digital world. It is also clear that current big datasets explain only a small part of today’s reality, perhaps existing in multiple different forms and places, with little regard for who is doing the consuming.

The alternative is that BARB is actively pursuing the ability to integrate additional datasets into its gold-standard survey data with the objective of delivering unified measurement. This will allow accurate, granular reporting of viewing across the fragmented digital ecosystem (census-based) but with behavioural insight and context gained from knowing who is doing the viewing (survey-based).

Unified measurement can only come from a measurement design that dovetails the strengths of both sources of information, making them even stronger when they are put together. A critical component of this solution can be seen in BARB’s plans to roll out Kantar Media’s Virtual Meter software to a further 500 homes on its panel. This software will allow the future integration of survey consumption of television content via PCs, laptops and, in time, tablets, with site-centric data. More news on our future plans can be found in the What’s Next? section of this website.

So, is that the answer then, a happy marriage of survey data living in harmony with Big Data? Well, I’ll leave you with the words of ‘Deep Thought’, the second most powerful computer ever built. After spending seven and a half million years computing the answer to the ultimate question of life, the universe and everything, it produced the rather cryptic answer of “42” before stating that a second computer, one even more powerful than itself, should be built in order to work out what the ultimate question is.

“Once you know what the question actually is, you'll know what the answer means”, he mused.