Book Review by Frank Cerwin
The Data Detective: Ten Easy Rules to Make Sense of Statistics
By Tim Harford. Riverhead Books, 2021; 279 pages.
I have to admit that I’m a data geek if I’m reading books on statistics. However, if you’re a Data Scientist or interested in becoming one, this book is a must-read. Even if you’re not, you will find it an informative guide to interpreting the statistics you hear in the news every day. The Data Detective was published in January 2021 and written by Tim Harford, an award-winning author, columnist, and economist who is an honorary fellow of the Royal Statistical Society in the UK.
The book starts off with an interesting introduction entitled “How to Lie with Statistics”, which was the title of a book published in 1954 that created widespread skepticism about published statistics. The objective of the chapters that follow is to provide statistical ideas and approaches for fact-checking claims that separate truthful statistics from deceptive ones. Statistical analysis is based on collecting, understanding, comparing, processing, and presenting data, things each of us in our field does on a daily basis. All of these activities are covered in this book. We depend on reliable numbers to shape our decisions. However, as the author states, “Doubt is a powerful weapon”, meaning that introducing any doubt in people’s minds about the statistics they see can undermine truthful statistical data. He goes on to say, “It may be easy to lie with statistics – but it’s even easier to lie without them”. As this is a very recently published book, Tim Harford presents many examples based on the Covid-19 pandemic and other current events, examples that all of us can relate to.
Within the chapters, the author covers topics related to data sourcing, unconscious bias, causation, sample sizes, and analytical algorithms. I found a couple of these topics particularly interesting. In the discussion about analytical algorithms, the author refers to a book titled “Weapons of Math Destruction” (I love the title) by Cathy O’Neil, which states that the real problem with data algorithms is the lack of scrutiny, transparency, and debate. In other words, if the algorithms are just a “black box” and you don’t understand how they work, then how can you trust their results? A quote related to Big Data that caught my attention was from David Spiegelhalter, a Cambridge University professor, who stated that “There are a lot of small data problems that occur in big data. They don’t disappear because you’ve got lots of the stuff. They get worse”. Another quote about big data is “Big data is found data”, which implies that it is missing the “unfound” data. The unfound data is what statistician David Hand refers to as “dark data”, a definition that differs from “dark data” as data that is found but unused. Collectively, this is much food for thought. Finally, one of the most fascinating stories that Tim Harford tells is about the “mother of infographics”, Florence Nightingale. That’s right, Florence Nightingale, the famous nurse, who used statistics and her famous infographic called “the Rose Diagram” to prove her theories that changed hospital healthcare.
Obviously, The Data Detective is full of interesting information that is relevant to our data management discipline. Tim Harford uses his expertise and storytelling talent to drive home his ten rules for making sense of statistics. It’s well worth the time to read.