Why Covid-19 reporting needs more independence, humanity and ownership
October 4, 2020 • Reading time 2 minutes
Published 5th October
Yesterday (4th October) saw the largest single-day increase in confirmed Covid-19 cases in England since records began – 22,961.* It was the result of a recording error.
The Excel file ran out of columns, so some results were left out of the daily counts. After a week of apparently flat Covid-19 cases, yesterday's figure jumped to around 22,900 (three times the number reported on Friday) – see chart from the PHE dashboard below.
Data on ~15,900 people who tested positive were not passed on for contact tracing, so those people may have carried on spreading the disease. Potentially more worrying is that their contacts were not traced either – that could mean an additional ~44,000 people unwittingly spreading Covid-19.**
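As a rough check on that estimate, the arithmetic can be sketched in a few lines of Python. It uses only the figures quoted in this post (about 15,900 missed cases, 60-70% of cases normally reached, around four contacts identified per case reached). The function name is purely illustrative. On these assumptions, the ~44,000 figure sits at the upper end of the range.

```python
# Figures quoted in the post; the function name is illustrative only.
MISSED_CASES = 15_900      # positive results not passed on for contact tracing
CONTACTS_PER_CASE = 4      # "around x4" contacts identified per case reached


def untraced_contacts(missed_cases, reach_pct, contacts_per_case=CONTACTS_PER_CASE):
    """Estimate contacts who would normally have been advised to self-isolate.

    reach_pct is the share of positive cases normally reached, as a whole
    percentage (60 means 60%). Integer arithmetic avoids rounding surprises.
    """
    return missed_cases * reach_pct * contacts_per_case // 100


low = untraced_contacts(MISSED_CASES, 60)
high = untraced_contacts(MISSED_CASES, 70)
print(f"Estimated untraced contacts: {low:,} to {high:,}")
```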
The fury, bafflement and mild amusement at the use of Excel may lead to some new way of storing and sharing the results. There are plenty of software and database solutions available.
But the real problem is the lack of humanity and ownership of the data. There are only seven major labs in the UK – why did no one spot the difference between local reports and the national picture?
“We’ve seen a 20% increase in positive cases, but nationally it is flat, so I guess we are an outlier. Any ideas why?”
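The question in that imagined email can be turned into a routine check. The sketch below, in Python, compares each lab's week-on-week change with the national change and flags any lab that diverges by more than a chosen tolerance. The lab names, counts, tolerance and function names are all made up for illustration; they are not real data or anyone's actual process.

```python
def pct_change(previous, current):
    """Percentage change from previous to current."""
    return 100.0 * (current - previous) / previous


def flag_divergent_labs(lab_counts, national_counts, tolerance_pct=10.0):
    """Return labs whose week-on-week change differs from the national
    change by more than tolerance_pct percentage points."""
    national = pct_change(*national_counts)
    flagged = {}
    for lab, (previous, current) in lab_counts.items():
        gap = pct_change(previous, current) - national
        if abs(gap) > tolerance_pct:
            flagged[lab] = round(gap, 1)
    return flagged


# Hypothetical counts: (positives last week, positives this week)
lab_counts = {"Lab A": (1000, 1200), "Lab B": (2000, 2020)}
national_counts = (7000, 7000)   # the national picture looks flat

flagged = flag_divergent_labs(lab_counts, national_counts)
print(flagged)   # labs worth a phone call
```

A flag here is not proof of an error. It is a prompt for someone who owns the data to ask why the local and national stories disagree.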
Good analytics isn’t about having the best software, algorithm or abacus. It’s about an intimate familiarity with the data and the story it is telling. Being familiar is hard when there is lots of data. Dashboards do not solve this problem – they tend to obscure the truth. Instead, it is essential to be like a good detective, forensically hunting down any leads that look odd – ideally from the person-level data.
Sadly it’s a similar issue to the A-level exam fiasco – “it was a mutant algorithm”. No. Someone did not check things properly.
Part of a simple solution that is doable now is to double-blind core calculations, that is, to run them twice and independently. Keep the current process, but have someone independent who works in parallel in another program (e.g. R/Stata) with the raw, unadulterated data, and compare the two sets of results. Edge Health would happily do this, but I still maintain that Test and Trace should share this type of data openly.
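A minimal sketch of what that parallel check could look like follows. It is written in Python rather than R or Stata, and the record layout, field names and numbers are assumptions made for illustration only. The idea is to recount daily positives directly from person-level records and list every date where the recount disagrees with the published total.

```python
from collections import Counter


def independent_daily_totals(raw_records):
    """Count positive results per date straight from person-level records."""
    return Counter(r["date"] for r in raw_records if r["result"] == "positive")


def reconcile(published, recomputed):
    """Return {date: (published, recomputed)} for every date where they differ."""
    dates = set(published) | set(recomputed)
    return {
        d: (published.get(d, 0), recomputed.get(d, 0))
        for d in sorted(dates)
        if published.get(d, 0) != recomputed.get(d, 0)
    }


# Hypothetical person-level records and published totals
raw_records = [
    {"date": "2020-10-01", "result": "positive"},
    {"date": "2020-10-01", "result": "positive"},
    {"date": "2020-10-01", "result": "negative"},
    {"date": "2020-10-02", "result": "positive"},
    {"date": "2020-10-02", "result": "positive"},
]
published = {"2020-10-01": 2, "2020-10-02": 1}   # one case dropped on the 2nd

mismatches = reconcile(published, independent_daily_totals(raw_records))
print(mismatches)
```

The value comes less from the code than from the independence: a second person, a second tool and the raw data, so that a silent truncation in one pipeline shows up as a disagreement in the other.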
With thanks to my colleagues Jon Bruce and Christian Moroy for their input
* Interestingly 22,961 is the first prime number of reported cases in around a week
** Normally Test and Trace (TAT) reaches 60-70% of people testing positive and then identifies around four times as many contacts, who are advised to self-isolate.