I was reading Robert Coe’s 2013 talk today, got a little way in, and was reminded of my love/hate relationship with statistics, and especially their use in Public Office / Government / Education / etc. It has always seemed to me that stats are abused left, right, and center, and despite some exceptional scrutiny (e.g. the UK’s Office for National Statistics), the Media is/acts too stupid to hold anyone to account … ever.
On the PGCE, we’ve been repeatedly told that most of the “evidence” for Educational theories and practices isn’t actually evidence, it’s baseless research that withers and dies under the slightest scrutiny. There’s no malice, but a lot of (compound) incompetence. I wonder: how much are these issues related?
Let’s look at one aspect: officials lie, cheat, arguably even defraud the public, with statistics that they (surely?) know are false, fake, invalid. Why? … HOW?
I think it comes down to this:
Most people find statistics boring; confronted by a 4-line sentence of stats-on-stats-on-stats, eyes glaze over and people start wondering if boredom can be fatal; given a precise sentence on tightly defined statistics, most people are already asleep, or running for the exit.
My art-subject friends happily state that they “hate” statistics or simply “couldn’t care less”. Boring, pointless, interesting only to autistic maths-geeks. Even many scientists I’ve known find statistics dry and off-putting – it’s far more interesting to talk about theories. And Mathematicians are probably the worst of all: they don’t even try to hide their scathing disrespect of stats “oh, you mean that fake-Maths for idiots, the stuff that Economists like to use?”.
And all of us who think any of that are wrong. Well, we’re correct – it is boring, it is meaningless, it has no relevance to us. But that’s because of our culture and – coincidentally – because of the way our education systems have set us up over the past 50 years to think about stats.
How cheaters use stats
If you lie, and are caught, you are censured. If you do it while in public office, or lie about money – especially other people’s money – the full weight of public outrage comes down upon you (supposedly).
Here’s a sneaky idea: If the stats you reported are a lie … and this is found out … the rage is deflected off you and onto the stats themselves. Since we find it hard to attach emotions to abstract ideas, the rage dissipates rapidly, with no outlet.
In effect, the liar has said “it was not I that lied; it was the words themselves! The words are lies! BURN THE WORDS! (let me go free, thankyouverymuch, so I can pull the same trick again next week)”
That’s certainly true, but it’s sophistry. We would (I hope) hold someone to account for checking the meaning of their words; for some reason, we don’t hold them to the same standards for the statistics they misappropriate.
Why you should care about stats
If you “care” about making any change for the better – in the world, in an organization, or even in a single individual – then stats are crucial. The one thing we fail to explain to people is this: the stats are there to help you, they are an enabler, they make you succeed better.
You cannot improve the world without using stats; if you try to, your work is as useful as a shout into the wind: it’s possible it has some effect, but very unlikely, and you’ll never know if it did or did not.
Statistics, used correctly, do three things:
- Tell us where to focus our time for BIGGEST improvement with SMALLEST effort
- Tell us when we’ve succeeded – how do you know when you’ve won? Changing the world is not a race, it’s far more nuanced and subtle than that
- When we’re failing, tell us where, which normally implies the “Why”, and enables us to fix it quickly and easily
The right statistic, presented well, is usually a fascinating, interesting, exciting thing.
Statistics also, sadly, give a score for how well we’ve done. I say “sadly” because this is – in my experience – the absolute worst-possible use of a statistic. I say this as a game-developer with more than a decade spent designing, building, and playing computer games. Computer games are often held up by schools, teachers, and consultants as a prime example of “statistics as score”. They are never used that way by actual game-designers, who know it to be false. It was briefly true 30 years ago; the world has moved on – the world of Computing has moved on a lot – since then.
I suspect (nothing more than gut feel) that this mis-interpretation of statistics, this misguided attempt to explain them to children in terms of something relatable to their own lives (i.e. games) is in large part the source of the public’s weird and twisted relationship with them. If you’re justifying stats as a concept, please – never use computer games as an example!
But whatever we do, we have to know if the stats are “true” or “false” – and here be careful, because professional liars know very well that every stat is true, in some sense. They can argue it forever.
The skill I want everyone to have: querying a statistic
We have a popular story for young children, that shows (among other things) how much our culture values catching-out deceivers, and bucking the trend of group-think. It is reprinted and retold in a vast array of forms every year: The Emperor’s New Clothes.
At conferences, and in meetings, I have often been the metaphorical child when it comes to statistics. A good-seeming quote is provided with actual numbers – statistics – and everyone nods sagely or lifts their eyebrows – “Really? Well, the speaker used a percentage, so it MUST be true!”.
This has to stop.
One time may even have got me fired; a company CEO claimed that his was the most successful company and made the most revenue in a particular area … measured across the whole of Europe. Both statements were bald-faced lies, and I happened to have personally worked with one of the competition in the UK (leaving aside Europe). Unlike most people in the room I knew the actual figures. I called him on it. I asked how he was accounting for his figures, such that our (known, small) revenue came out larger than the competitor’s (approximated from public data) – how, exactly, was 1 “greater than” 10?
He couldn’t back it up; it wasn’t true, and I’m sure he knew it. Many people afterwards told me they thought it sounded too good to be true, but daren’t query it … who would be foolish enough to lie about a statistic?
So long as we (generally) hold such faith in printed numbers, we’ll continue to be misled – both maliciously and accidentally. I’ve certainly seen stats used incorrectly with no malice, but simple lack of diligence by the quoter: they’ve failed to examine the context of the stat, and ended up abusing it.
What to do when you encounter statistics
Going back to Coe’s talk, I spotted a short sentence that summarised neatly one factor here:
“It is a commonplace of research methods training that correlation is not causation; causal claims need careful causal arguments and evidence to support them (Shadish, Cook and Campbell, 2002)” – Coe, 2013 [emphasis mine]
Coe’s piece is short and worth reading. It has some nice, simple examples of how stats are bent and twisted, with some obvious explanations of how that happens, intentionally or otherwise, although most readers will probably want to skip the intense detail near the start with all the graphs.
Meanwhile, here are some small simple tips of my own, off the cuff. I’ll come back and rewrite these in a few weeks, after more thought. When you hear a statistic, and it sounds great:
- Immediately use your iPhone/Android and check the source.
- If it’s uncited – interrupt the speaker and demand the source.
  - There is never an excuse for unreferenced statistics
  - An unreferenced statistic is declaring to the world “I’m lying, and I know it!”
  - …I’ve used unreferenced stats myself in the past; I’m ashamed, and I work very hard not to do it any more, ever. No matter how tempting.
- Easy win: If there’s no date, the speaker is probably lying. You can prove almost anything if you’re allowed to arbitrarily choose the year of the data…
  - e.g. “France is the richest country in the world!” (If you use stats from 1685…)
- Consider implications; if that stat is true, what else does that imply?
  - …which may tell you it cannot be true
  - …else the speaker would be focussing on the far more world-changing results it implies
- Work out how it can be true
  - The beauty of surprising stats is that they start us on a path of discovery; they tell us something in the world is not as it seems, and beg us to find out what/why/how
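To make the “no date” tip concrete, here is a minimal sketch in Python of how much a headline number can swing depending on which starting year the speaker quietly picks. The figures in `score_by_year` are invented purely for illustration (they are not real data), and the function names are my own, not from any library.

```python
# Hypothetical yearly figures, invented for illustration only (not real data).
score_by_year = {
    2008: 120,
    2009: 80,
    2010: 90,
    2011: 100,
    2012: 110,
}


def percent_change(series, start_year, end_year):
    """Percentage change in the series between two years."""
    start = series[start_year]
    end = series[end_year]
    return (end - start) / start * 100


def best_and_worst_baseline(series, end_year):
    """Return the start years that make the change to end_year look best and worst."""
    changes = {
        year: percent_change(series, year, end_year)
        for year in series
        if year < end_year
    }
    best = max(changes, key=changes.get)
    worst = min(changes, key=changes.get)
    return (best, changes[best]), (worst, changes[worst])


if __name__ == "__main__":
    (best_year, best), (worst_year, worst) = best_and_worst_baseline(score_by_year, 2012)
    # Same data, same end year; only the undeclared start year differs.
    print(f"Since {best_year}: {best:+.1f}%")
    print(f"Since {worst_year}: {worst:+.1f}%")
```

With these made-up numbers, one speaker can honestly claim a big rise and another can honestly claim a fall, from exactly the same data. That is why a stat with no date attached deserves suspicion.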
Above all, remember: just as a Thought Experiment fundamentally betrays the concept of an “Experiment”, and undermines the entirety of Science … Statistics used blindly without question go from being “fact” to “might as well be pure fiction”.