What is big data for?

For a number of years I co-edited the Journal of Cultural Economics, and the core duty of the editor of a research journal is to guide through revision and publication papers that have findings new and interesting enough to advance our knowledge, and to turn away those papers that are either mistaken in their applications of theory and empirical work, or simply not all that relevant. The journal had to be selective – fewer than 20 percent of submissions were eventually published – and so a lot had to be turned away. What I found was that few of the rejections were incompetent; authors knew their way around economic theory, and the econometric application of data to the hypotheses generated by the model. Most of the rejections were of papers that, while competent, were not very interesting, addressing questions that were not well-motivated. The authors had data, and the tools to process the data, but could not find an interesting question to ask of the data.

And so I worry a little when arts organizations come to over-estimate what “big data” can do for them. The idea behind big data, or even “medium data”, is that the wealth of information generated through various transactions, and the computing power available to apply various algorithms to the data, can help us answer important questions in ways that we could not before. And in some fields, it has. In the commercial world, big data helps companies target customers with information likely to lead to a purchase: Amazon.com can use the data of the millions of page views and purchases of its customers, together with my own buying and browsing history, to suggest to me, when I look at the page for this book, that I might also want to take a look at that book. My grocery store targets coupons to me, based on its data, that it hopes will keep me coming back, and not going to a rival store. In public policy, big data is having significant impacts on our understanding of business conditions, crime, and epidemiology.

What all of the above examples, from business or policy, have in common is that there are clear questions that analysts want answered: how can our company make more sales? How can we predict when a neighborhood ought to have a more significant police presence?

If big data is going to be of importance to, say, museums, as this report from the American Alliance of Museums, or this post from the Center for the Future of Museums, suggests, then the key is not what data might be available. The key is which questions we hope the data can answer. And on that I’m afraid I still don’t see what museums are hoping for. If I find I am getting more visitors from zip code 77005 than 77025, what of it? (And after all, that could have easily been obtained through simple, “small data”, visitor surveys.) Finding out who buys what in the museum shop, or in which rooms visitors tend to linger, is maybe interesting for limited purposes, but it didn’t take a data revolution to get answers, if we really wanted them. The CFM writes:

What if we added cultural engagement to the linked data sets of health and education? If we track how people—children or adults—Interact with museums, historic sites, libraries, performing arts, and put that in the big data mix, we could finally document the effects of, say, family museum visits on kids’ educational attainment, or the impact of engagement with the arts on health and well-being.

But actually big data is not much different than small data in trying to answer those questions – in other words, it is not very helpful. It has been impossible to separate correlation from causation in studies linking arts education or participation with educational and health outcomes, and big data does not actually help get around that problem – you just get more correlations, albeit with more observations.
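A minimal simulated sketch of this point, under entirely invented assumptions: suppose a hypothetical "family income" variable raises both museum visits and test scores, while visits themselves have no effect on scores at all. The variable names and numbers below are made up for illustration and are not drawn from any real study. The naive estimate of the "effect" of visits on scores does not shrink toward zero as the sample grows; it only becomes a more precise estimate of the wrong thing.

```python
import random


def naive_slope(n, seed=0):
    """OLS slope of test score on museum visits, ignoring the confounder."""
    rng = random.Random(seed)
    visits, scores = [], []
    for _ in range(n):
        income = rng.gauss(0, 1)                 # hypothetical confounder
        visits.append(income + rng.gauss(0, 1))  # income raises visits
        scores.append(income + rng.gauss(0, 1))  # income raises scores; visits play no role
    mean_v = sum(visits) / n
    mean_s = sum(scores) / n
    cov = sum((v - mean_v) * (s - mean_s) for v, s in zip(visits, scores))
    var = sum((v - mean_v) ** 2 for v in visits)
    return cov / var


# Same data-generating process, "small data" versus "big data".
small = naive_slope(300)
big = naive_slope(300_000)
print(f"n=300:     naive slope = {small:.2f}")
print(f"n=300000:  naive slope = {big:.2f}")
```

In this setup the true causal effect of visits is zero by construction, yet the naive slope settles near 0.5 in large samples, because that is the ratio of the covariance induced by income (1) to the variance of visits (2). A thousand times more observations tightens the estimate around that figure, not around zero.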

Applications of data are very useful when the researcher knows what she wants to ask. But the data won’t come up with the questions; humans still have to do that. Many (almost all?) of the important questions in the arts are not ones where the answers have eluded us because we lack data or computing power. A well-designed survey with an unbiased sample can give us something very helpful, even with only a few hundred observations, if we know what to ask.

My final caveat is this: investing in data analysis comes with an opportunity cost. Given the human and financial resource constraints faced by arts organizations, more attention paid to this (correlations revealed by big data) means less attention paid to that (all the other questions arts leadership ought to be studying). If arts organizations really want to invest in big data analysis, they should think first, and carefully, about what problem they are trying to solve.
