Bernoulli's Fallacy: Statistical Illogic and the Crisis of Modern Science
by Aubrey Clayton
This title was previously available on NetGalley and is now archived.
Pub Date 03 Aug 2021 | Archive Date 10 Nov 2021
There is a logical flaw in the statistical methods used across experimental science. This fault is not just a minor academic quibble: it underlies a reproducibility crisis now threatening entire disciplines. In an increasingly data-reliant culture, this same deeply rooted error shapes decisions in medicine, law, and public policy with profound consequences. The foundation of the problem is a misunderstanding of probability and our ability to make inferences from data.
Aubrey Clayton traces the history of how statistics went astray, beginning with the groundbreaking work of the seventeenth-century mathematician Jacob Bernoulli and winding through gambling, astronomy, and genetics. He recounts the feuds among rival schools of statistics, exploring the surprisingly human problems that gave rise to the discipline and the all-too-human shortcomings that derailed it. Clayton highlights how influential nineteenth- and twentieth-century figures developed a statistical methodology they claimed was purely objective in order to silence critics of their political agendas, including eugenics.
Clayton provides a clear account of the mathematics and logic of probability, conveying complex concepts accessibly for readers interested in the statistical methods that frame our understanding of the world. He contends that we need to take a Bayesian approach—incorporating prior knowledge when reasoning with incomplete information—in order to resolve the crisis. Ranging across math, philosophy, and culture, Bernoulli’s Fallacy explains why something has gone wrong with how we use data—and how to fix it.
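The Bayesian move the description refers to, folding prior knowledge into an inference, can be sketched in a few lines of Python. The numbers here (a 1% prior and a 95%-sensitive test with a 5% false-positive rate) are made up for illustration and are not drawn from the book:

```python
# Illustrative sketch: Bayes' theorem as the description uses it --
# combining a prior belief with the likelihood of the observed evidence.

def posterior(prior, likelihood, false_positive_rate):
    """P(H | E) = P(E | H) P(H) / P(E), with P(E) expanded over H and not-H."""
    evidence = likelihood * prior + false_positive_rate * (1 - prior)
    return likelihood * prior / evidence

# A rare condition (1% prior) and a positive result from a 95%-sensitive test
# with a 5% false-positive rate: the evidence alone is far from conclusive.
p = posterior(0.01, 0.95, 0.05)
print(round(p, 3))  # prints 0.161
```

Ignoring the 1% prior and reading the test's 95% accuracy as a 95% chance of having the condition is exactly the kind of inversion error the book's title refers to.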
ABOUT THE AUTHOR
Aubrey Clayton is a mathematician who teaches the philosophy of probability and statistics at the Harvard Extension School. He holds a PhD from the University of California, Berkeley, and his writing has appeared in Pacific Standard, Nautilus, and the Boston Globe.
"This story of the 'statistics wars' is gripping, and Clayton is an excellent writer. He argues that scientists have been doing statistics all wrong, a case that should have profound ramifications for medicine, biology, psychology, the social sciences, and other empirical disciplines. Few books accessible to a broad audience lay out the Bayesian case so clearly."
--Eric-Jan Wagenmakers, coauthor of Bayesian Cognitive Modeling: A Practical Course
Average rating from 7 members
You better be wearing your big boy pants if you attempt to read this book. I took many college courses in statistics while obtaining both my Bachelor's and Master's degrees, and this book was too tough even for me. (It did not help at all that the Kindle galley I was sent did not correctly depict any of the formulas or graphs. Perhaps the final copy will correct this important shortcoming.) Mr. Clayton starts out great in Chapter One, which I followed quite easily, BUT then things got really tough. Unless you have a Ph.D. in statistics (I do not), this book is going to be way over your head. I suspect that for this very small group of readers this is an excellent book, but I cannot be certain. Five stars for the experts who read this book and one star for the rest of us averages out to three stars overall.
Lies, Damn Lies, and Statistics. On the one hand, if this text is true, the words often attributed to Mark Twain have likely never been more apt. If this text is true, you can effectively toss out any and all probabilistic claims you've ever heard. Which means virtually everything about any social science (psychology, sociology, etc.). The vast bulk of climate science. Indeed, most anything that cannot be repeatedly and accurately measured in verifiable ways is pretty much *gone*. On the other hand, the claims herein could be seen as constituting yet another battle in yet another Ivory Tower world with little real-world implication at all. Indeed, one section in particular - where the author imagines a supercomputer trained in the ways of the opposing camp and an unknowing statistics student - could be argued to be little more than a straight-up straw man attack. And it is these very points - the possibility of this being little more than an Ivory Tower battle, and the seeming straw man - that form part of the reasoning for the star deduction. The other two points are these: 1) Lack of bibliography. As the text repeatedly and painfully makes the point that astounding claims require astounding proof, the fact that the bibliography is only about 10% of this (advance reader copy, so potentially fixable before publication) copy is quite remarkable. Particularly when considering that other science books this reader has read within the last few weeks have made far less astounding claims and yet had much lengthier bibliographies. 2) There isn't a way around this one: this is one *dense* book. I fully cop to not being able to follow *all* of the math, but the explanations seem reasonable in themselves.
This is simply an extremely dense book that someone who hasn't had at least Statistics 1 in college likely won't be able to follow at all, even as it not only proposes new systems of statistics but also follows the historical development of statistics and statistical thinking. And it is based, largely, on a paper that came out roughly when this reader was indeed *in* said Statistics 1 class in college - 2003. As to the actual mathematical arguments presented here and their validity, this reader will simply note that he has but a Bachelor of Science in Computer Science - and thus at least *some* knowledge of the field - but isn't anywhere near being able to confirm or refute someone possessing a PhD in some statistics-adjacent field. But as someone who reads many books across many genres and disciplines, the overall points made in this one... well, go back to the beginning of the review. If true, they are indeed earth-quaking if not shattering. But one could just as easily see them as merely another academic war. In the end, this is a book that is indeed recommended, though one may wish to assess one's own mathematical and statistical knowledge before attempting to read this polemic.
I thought this book was really good. I love that Clayton didn't shy away from including equations and calculations in the book (even though I couldn't see them because of the atrocious pre-publication formatting). Clayton writes from a very specific point of view, but it's one I found persuasive. I thought this was a good explanation of a lot of the philosophical and scientific issues surrounding statistics and probability.
I'm not sure how to take this book or who the real audience is. The book assumes a lot more familiarity with probability theory than I have, so I can't speak to the underlying theme (Bernoulli is wrong and Bayes is right). The issue is that it reads like a rant, with Clayton offering E.T. Jaynes' interpretation of Bayes as the one truth while making blanket statements like "all challenges to the fact of systemic racism in the US justice system are wrong". It's hard to accept the message when the messenger comes across so hardnosed and doesn't follow his own advice when giving examples. I was expecting a little more historical info on the development of the field, and a balanced treatment of what's right and what's not (and why). That isn't what we have here. Recommended for students of probability theory who want exposure to other viewpoints.
Bernoulli's Fallacy by Aubrey Clayton is a well-argued case against what has passed for probability over the past century and more. While his explanations are straightforward and the math is presented in a clear manner, it is still a read that will, and should, take more effort than many other books. The reward, however, is well worth the effort. While some may mistakenly think this is just some feud within academia and so doesn't really matter beyond those walls, that is wrong, and Clayton makes that clear with many of the science as well as social science examples he cites. When people's lives can be harmed if not ended at least partly because of the improper use of data expressed as probability, this is anything but an ivory tower debate. It takes place largely within those walls because that is where these theories are taught and because the "experts" who pronounce the so-called probabilities on policy issues are still pulled from academia's ranks. While my first undergraduate degree was mathematics-heavy (EE), that was very long ago, and my subsequent degrees were all in the humanities and social sciences. So I am not going to try to explain what Clayton goes through. To put it as basically as possible, what passes for probability is often just frequency, with little or no predictive or explanatory value. Yet it is used to predict and to explain, which then becomes part of future policy, which more often than not fails. A good example of a Bayesian approach is an article Clayton wrote for the Boston Globe in June of last year about the statistical paradox of police killings. Without taking prior information into account when assessing limited or skewed information, a faulty and quite deadly conclusion can be drawn which seems, on the surface, to be based on sound scientific information. That article, quite short, is well worth looking up for a real-world glimpse of what Clayton is arguing against.
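The base-rate point described above can be made concrete with a toy calculation. These numbers are entirely hypothetical, chosen only to show the mechanism; they do not come from Clayton's article or the book:

```python
# Hypothetical numbers illustrating the base-rate mechanism: a frequency
# computed within a skewed sample is not the same as the overall risk once
# prior information (how each group enters the sample) is included.

encounter = {"A": 0.02, "B": 0.10}  # chance a person in each group is stopped
outcome = {"A": 0.05, "B": 0.05}    # chance of a bad outcome, given a stop

# Conditioned on being stopped, the two groups look identical...
print(outcome["A"] == outcome["B"])  # prints True

# ...but the per-person risk differs fivefold once the differing
# encounter rates are taken into account.
risk = {g: encounter[g] * outcome[g] for g in encounter}
print(round(risk["B"] / risk["A"], 6))  # prints 5.0
```

Reading the equal within-sample frequencies as equal overall risk is the kind of inference error, made without the prior information, that the reviewer is pointing at.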
While the book is dense, it is accessible to most readers who either have some math background (especially if you still use it frequently) or are willing to read a bit slower and wrestle with the concepts. Clayton's explanations and examples, as well as the history lesson, can be read largely without too much concern for understanding the nuance of every formula he shows. If you understand that a figure in a particular place in a formula can have an outsize effect on the result, then understanding the nuance is less important, since Clayton explains what we need to understand for the big-picture argument. In other words, if you're interested in or concerned about the reproducibility crisis in the sciences as well as the social sciences, this book will be well worth any effort you might have to put into it. But it is, bottom line, accessible to most who want to understand. Using my experience as an example, I had to progress rather slowly and make an effort to understand each bit of information, each aspect of the history as well as of the mathematics. I feel like I managed to do so at a reasonable level for a first read. What I haven't yet done, but anticipate doing with subsequent readings, is connecting these still (in my mind) separate pieces into a better understanding of the whole. Clayton's explanations allowed me to understand the big picture without every detail being in perfect focus. Now I can connect the dots (my small pieces of understanding) to make the big picture come into sharper focus. Okay, maybe I didn't help with this paragraph, but maybe someone will understand what I am trying to say. Quick aside: ignore the "sky is falling" people who imply that all statistics and all that we do with them is pointless; that is throwing the baby out with the bath water, and probably just makes the screamer feel smart.
This is a wide-ranging problem that touches almost every aspect of policy making as well as research, but it is not a case of "everything that has been done before is now meaningless." Keep the data and use it better; don't panic, throw everything out, and hyperventilate. Also, to clear up some misunderstandings: the review copy I had, in both the Kindle version and the one I read in Adobe Digital Editions, had substantial notes (many of which were bibliographic in nature) as well as several pages of bibliography. So anyone interested in checking Clayton's sources can do so. Not sure why the mix-up, but rest assured, this is both well-researched and well-documented. Reviewed from a copy made available by the publisher via NetGalley.