Holiday Special: Wisdom of the crowd versus the Chief Statistician
Ashley Ward (Ashley.WARD@oecd.org), Statistics and Data Directorate (OECD)
What do statisticians and other assorted analysts do at their holiday parties? Drink too much and embarrass themselves? Of course not, they guesstimate the number of chocolates in a glass jar (and embarrass themselves). All fun, games and merriment, until someone steals the data on their way out the door…
Livestock fairs and statistics
As I watched our sample of keen-eyed statistical professionals put pen to paper, it occurred to me that this would be the perfect opportunity to test out that old chestnut: “the wisdom of crowds”.
The potentially apocryphal, or at least embellished, story goes that in 1906 an English polymath named Francis Galton visited a livestock fair and observed a curious contest. On prominent display was an ox and the villagers were invited, for a small entry fee, to guess the animal’s weight. Not one of the 800 eager participants landed on the actual weight of 1198 pounds (or 543 kg in the modern Système International d’unités), but the middlemost of those guesses came astonishingly close, only 0.8% off, at 1207 pounds.

While the anecdote feels too alluring to be entirely factual, it does illustrate the concept quite well. In The Wisdom of Crowds, James Surowiecki contends that “Under the right circumstances, groups are remarkably intelligent, and are often smarter than the smartest people in them.” In short, with a large enough group at your disposal, the errors tend to cancel one another out, and once those errors are removed, what you’re left with is information. One colleague has since mentioned that this is very similar to the reasoning for using the average of the best models when nowcasting, as opposed to taking the best model alone – but that’s a story for another day and another article. For now, I’ll just note that Surowiecki outlines three key criteria for this to play out as hoped – diversity, decentralisation and independence – that are worth remembering for later.
There’s more than one way to cook an egg
Back to last Thursday’s party, where everyone had their own theory. Some were trying to remember their secondary school geometry: “Is it pi multiplied by the diameter? Or the radius squared? What about the volume? Did anyone actually bring a ruler?” Another group crowded around a table with an equivalent set of chocolates and calculated the weight per bonbon, ruminating over their propensity to tessellate – evidence-based chocolate analysis at its finest! Others theorised on the motivations of a profit maximising Swiss chocolate corporation. I have no comment on my approach, which will remain a closely held secret.
Deck the halls with boughs of data
I know what you’re thinking: too much storytelling and not enough data. Very well, I’m more than happy to oblige.
Spoiler alert: the actual number of chocolates in the jar was 89 and no one hit the nail on the head. But how close did our crowd of 35 statistically minded colleagues get? We can see that our guesses were spread quite widely, spanning the range 57 to 250 (at least one colleague is glad I anonymised Figure 1). The mean of our speculation was 112. Not quite the statistical triumph of the famous livestock fair, even if we allow ourselves to omit the most problematic outlier, bringing the number down to 107. A more generous view, and the one taken by Galton, would take the median, freeing us of some additional outliers and coming to 100 – a markedly better estimate. On a more sociological note, the median for the second half of guesses was further off (111.5) than the first half (97), the cause of which I’ll leave up to speculation.
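For anyone who wants the arithmetic spelled out, here is a minimal sketch that turns the summary figures quoted above into relative errors. It uses only the numbers stated in the text (the individual guesses are not reproduced), and the helper name `relative_error` is my own, purely for illustration.

```python
# Summary figures quoted in the article; the raw guesses are not used here.
ACTUAL_CHOCOLATES = 89

crowd_estimates = {
    "mean": 112,
    "mean without the largest outlier": 107,
    "median": 100,
}


def relative_error(estimate, actual):
    """Signed error expressed as a fraction of the true value."""
    return (estimate - actual) / actual


for label, value in crowd_estimates.items():
    print(f"{label}: {value} ({relative_error(value, ACTUAL_CHOCOLATES):+.1%})")

# Galton's ox, for comparison: median guess of 1207 lb against 1198 lb.
print(f"Galton's ox: 1207 ({relative_error(1207, 1198):+.1%})")
```

Run as written, this puts our mean roughly 26% too high, the trimmed mean about 20% and the median about 12%, against less than 1% for the crowd at the livestock fair.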
Flawed experiment, flawed results
Let’s briefly touch on some of the flaws in this natural experiment and why our collective wisdom might have landed further away than the theory promised. I’ll start with the palpable variance in the depth of consideration amongst our self-selected crowd, who were not attending a livestock fair and were not charged an entry fee. Some merely hazarded a guess, while others staked their expertise on their answer.
Now thinking back to our three criteria, Surowiecki would likely criticise our group for lacking some forms of diversity, sharing too many of the same assumptions and demonstrating some of the pitfalls of “groupthink”. A wider selection of professions from a wider selection of holiday parties might have better approached his criteria. In my view, we’re in serious need of a master chocolatier and a marketing executive if we’re to do better next year.
To further taint the experiment, there was a distinct lack of decentralisation and independence, with all answers in full view of prospective participants. This certainly looks to have created fertile ground for anchoring bias, with notable clusters of especially high guesses, as well as providing a tactical advantage to discerning latecomers.
Finally, and this is something anyone with a background in statistics has probably been screaming throughout this read, there’s the sample size. A larger crowd should in theory be a wiser crowd, providing more information and more easily cancelling out those errors. Maybe we’ll have to invite some additional colleagues next time around.
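To give a feel for the sample size argument, here is a small simulation sketch. It is not our party data: it assumes, hypothetically, that every guess is independent, centred on the true count of 89, and normally distributed with a standard deviation of 40 (a value picked only for illustration). The function names are likewise my own.

```python
import random
import statistics

TRUE_COUNT = 89   # actual number of chocolates, from the article
GUESS_SD = 40     # hypothetical spread of individual guesses (assumption)
TRIALS = 500      # number of simulated parties per crowd size


def crowd_median(size, rng):
    """Median of `size` independent, unbiased simulated guesses."""
    guesses = [rng.gauss(TRUE_COUNT, GUESS_SD) for _ in range(size)]
    return statistics.median(guesses)


def typical_error(size, rng, trials=TRIALS):
    """Mean absolute error of the crowd median across simulated parties."""
    errors = [abs(crowd_median(size, rng) - TRUE_COUNT) for _ in range(trials)]
    return statistics.mean(errors)


rng = random.Random(0)  # fixed seed so reruns give the same figures
for size in (35, 800):
    print(f"crowd of {size}: typical error {typical_error(size, rng):.1f}")
```

Under these assumptions the crowd of 800 should land much closer to the truth than the crowd of 35. The caveat is that a bigger crowd only cancels independent errors: if everyone shares the same bias, as the anchoring described above suggests we did, more guesses will not remove it.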
To the victor go the spoils
At the end of the night it was Vincent Siegerink, recent blog contributor from the OECD’s Centre on Well-being, Inclusion, Sustainability and Equal Opportunity (WISE), who held a bounty of chocolates in his hands and pride in his heart. For the record he claims that his “original guess would have been 89”, which he then adjusted to 86 because he “noticed that in that case [he] would win with any number between 86 and 89”. Maybe there really is something to that WISE acronym after all.
And our Chief Statistician? Well, Paul Schreyer, transparent as always, was kind enough to let me reveal that his estimate was a “somewhat hasty and very much optimistic” 144. This time the crowd really did outperform even the very brightest of stars…even if not one member thought to turn the jar over and read the count on the underside.