WEAPONS OF MATH DESTRUCTION BY CATHY O’NEIL
BOOK REVIEWS BY BINOD
BINOD’S RATING: 7/10
In my 19th book review of 2020, I welcome you to peek into the dark side of Big Data.
Cathy O’Neil earned a Ph.D. in math from Harvard, was a postdoc in the MIT math department, and was a professor at Barnard College, where she published a number of research papers in arithmetic algebraic geometry. She then worked as a quant for the hedge fund D.E. Shaw in the middle of the credit crisis, and left finance in 2011 to work as a data scientist. Evolving from academic mathematician to quant to blogger, O’Neil has accumulated an unusual set of experiences and expertise; she has all the creds.
This is the Big Data economy, and it promises spectacular gains.
Software can speed through thousands of résumés or loan applications in seconds and sort them with the most promising candidates on top. This not only saves time but is also marketed as fair and objective.
But are we putting too much faith in the models?
O’Neil describes how mathematical models can absorb human bias and come to affect every aspect of our lives. “Weapons of math destruction” (WMDs) is the term she uses for algorithms that are important, secret, and destructive. The kinds of algorithms she worries about have three ominous features:
They operate at scale. The scoring is widespread, and it pertains to important decisions, like getting a job or going to jail. These models are high-impact and affect a lot of people.
They are opaque. The people being scored either don’t understand how the scores are computed or, in some cases, don’t even know they are being evaluated at all.
They are destructive and can really screw up somebody’s life. Most of the time these algorithms are created with good intentions, but the damage they do undermines those intentions and creates a destructive feedback loop. PS: This makes the term “weapons” a bit misleading, since the damage caused by ordinary weapons is typically intentional.
Many of these models and algorithms encode human prejudice, misunderstanding, and bias into the software that increasingly manages our lives. Worse, their verdicts, even when wrong or harmful, are beyond dispute or appeal.
The book has many examples where big data has not lived up to its promise:
Models are often used to measure the likelihood that an individual will relapse into criminal behavior. When someone is classed as “high risk”, they’re more likely to get a longer sentence and to find it harder to land a job when they eventually get out. That person is then more likely to commit another crime, and so the model looks like it got it right (a toy simulation after these examples shows how such a loop feeds on itself).
Models generate credit scores, which are used to deny people jobs. That, in turn, creates worse credit scores: someone who doesn’t get a job because of a bad credit score goes on to have an even worse credit score.
Models generate the U.S. News college rankings used heavily by students and their parents. But the rankings have had an impact beyond students. Colleges compete for students, and administrators put a great deal of effort into figuring out how to improve their schools’ rankings. U.S. News figures its rankings by selecting a slew of criteria, each measured and weighted differently (a minimal weighted-score sketch also follows these examples). Pressure exerted by the model distorts the values and priorities of colleges, resulting in drastically increased costs for students (e.g. expensive gyms and hostels) without improvements in the actual quality of education.
Models are used to target users. Facebook can send individual users the particular news stories they are likely to click, and political campaigns can contact the particular voters they might most easily influence. This customization sounds great, but highly curated news feeds may expose people only to the ideas and information that appeal to their micro-targeted demographic, especially in light of reports that “fake news” stories have appeared in some Facebook users’ news feeds but not in others.
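To make the feedback-loop mechanism concrete, here is a toy simulation in Python. Every number and rule in it (the threshold, the increments, the starting values) is invented for illustration; it is a sketch of the general dynamic O’Neil describes, not the actual recidivism model from the book.

```python
# Toy simulation of a scoring feedback loop. All thresholds, increments,
# and starting values are invented for illustration only.
import random

random.seed(42)  # reproducible runs

def simulate(years=10, n_people=1000):
    # Everyone starts identical: same score, same true propensity to reoffend.
    people = [{"risk": 0.5, "propensity": 0.2} for _ in range(n_people)]
    for year in range(1, years + 1):
        for p in people:
            flagged = p["risk"] > 0.6  # the model's "high risk" label
            if flagged:
                # Assumption: the label makes jobs harder to get, which
                # raises the person's true propensity to reoffend.
                p["propensity"] = min(1.0, p["propensity"] + 0.05)
            if random.random() < p["propensity"]:
                # A reoffense pushes the score up sharply, so the label
                # that helped cause it now looks like a correct prediction.
                p["risk"] = min(1.0, p["risk"] + 0.15)
            else:
                p["risk"] = max(0.0, p["risk"] - 0.02)
        n_flagged = sum(p["risk"] > 0.6 for p in people)
        print(f"year {year}: {n_flagged}/{n_people} flagged high risk")

simulate()
```

Run it and the flagged group keeps growing even though everyone started out identical: the label itself manufactures the evidence that later justifies it.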
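And a minimal sketch of the weighted-criteria scoring behind a ranking like the U.S. News one. The criteria, weights, and scores here are invented, not the magazine’s actual methodology; the point is what the formula includes and what it omits.

```python
# Minimal weighted-criteria ranking. Criteria, weights, and scores are
# all invented; this is not U.S. News's actual formula.
schools = {
    "College A": {"reputation": 80, "selectivity": 90, "facilities": 60},
    "College B": {"reputation": 70, "selectivity": 60, "facilities": 95},
}

# Note what is missing: cost to students has no weight at all, so the
# model exerts no pressure to keep tuition down.
weights = {"reputation": 0.5, "selectivity": 0.3, "facilities": 0.2}

def composite(scores):
    # The entire ranking reduces to this weighted sum.
    return sum(weights[c] * scores[c] for c in weights)

for name, scores in sorted(schools.items(), key=lambda s: -composite(s[1])):
    print(f"{name}: {composite(scores):.1f}")
```

Because cost appears nowhere in the weights, a college that pours money into facilities climbs the ranking while its tuition rises, which is exactly the distortion O’Neil describes.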
Often, we don’t even know where to look for these important algorithms, because by definition the most dangerous ones are also the most secretive. That’s why the catalogue of case studies in O’Neil’s book is so important; she’s telling us where to look.
O’Neil insists that mathematical models cannot fully capture concepts, attitudes, and ideas. She warns that every model omits some components of a situation in order to highlight others, and she reminds us that it can be too easy to move a model developed for one situation to another where it does not apply.
O’Neil also notes that the success (some) models have had in predicting (some kinds of) events has paradoxically made modeling as a whole both more ubiquitous and more dangerous: those who design and use models are easily lulled into believing that their models are more neutral, robust, and scalable than they really are.
She is not hostile to mathematics, modeling, or big data; she considers mathematical models “the engines of the digital economy”. Mathematics, as an objective science, cannot itself be blamed for contributing to social inequality; it is organizations intent on maximizing profits that choose to misuse mathematical models. The tool is not to blame; the user of the tool is.
This is thankfully not a math book, and it is not investigative journalism. It is short (you can read it in an afternoon), and it doesn’t have any detailed data analysis, formulae, or graphs.
My favorite quote? “Models are opinions embedded in mathematics.”
And the book is not all grim. In the last chapter, she shares some ideas about how we can disarm WMDs and use big data for good. She proposes a Hippocratic Oath for data scientists and writes about how to regulate mathematical models.
An informative and entertaining read.