I was recently drawn in by a link on the New Zealand Herald website that boldly proclaimed "NZ's favourite rom-com has been found" (spoiler alert, it's Love, Actually).
As a not-so-secret fan of rom-coms and frivolous online listicles, I clicked through to find out more. After a short blurb, the article heads straight into three lists: New Zealand's all-time favourite rom-coms; New Zealand's favourite romantic hero; and New Zealand's favourite romantic heroine. Being fairly sure the data to back these bold assertions didn't come from the last Census, I had a closer read to find out where this insight came from.
Turns out the data was from a survey of 200 Elite Singles members, the majority of whom are "educated, relatively affluent, and between the ages of 30 and 55". While using a sample group in order to gauge the make-up of a wider population is a common statistical method, in order to get an representative result, the sample group should be picked at random from the wider population. While I'm no statistician, I'm pretty sure a small group of single, middle class people aged 30-55 with a university degree isn't representative of all New Zealanders.
The use of such an unrepresentative sample means that Love, Actually may well not be New Zealand's favourite rom-com. While it's of little-to-no consequence if Hugh Grant isn't actually New Zealand's favourite romantic hero, there are times where the collection of, and reporting on, data does have real life consequences — like a general election. In saying that, there are guidelines in place for conducting general election polling in New Zealand, as well as guidance for media reporting on the polls. Hopefully this means while we will never know for sure our favourite collective rom-com, the New Zealand Herald won't be predicting our next Prime Minister based on the views of a couple of hundred people looking for love.
Using data to back up arguments and inform decisions is a really good thing. It's just important to remember that there are usually multiple steps in between the collection of data and the end result, whether it's a policy decision or a showbiz listicle. This process is likely to get more complex and less easy to interrogate as more and more decisions are made from huge quantities of data. It's important that data journalists and other people using data to present arguments know how to interpret and present it truthfully.
On that note, I suggest this article is re-titled "Single, middle class people in New Zealand quite like Love, Actually".