How Big Data Increases Inequality and Threatens Democracy
Ratings: 81
Average rating: 3.7
Math! And social justice! Two of my favorite things! What's not to like?
Unfortunately, kind of a lot. Look: people who read math books for fun are math nerds. Dumbing down math concepts with cutesy terms is not needed. It will not make people who would not otherwise read math for fun read your book, and it will piss off the rest of us. Also, it's lazy. And it's bad math: O'Neil uses the term “weapon of math destruction” (over and over) very vaguely, so that she doesn't have to define exactly what she's talking about. Oh, she claims that she has a clear definition, but then she calls things like Racial Profiling a WMD (cringe). Racial Profiling isn't an algorithm; it's a cognitive heuristic, and it doesn't rely on Big Data.
More problematically, I think she uses this term to obscure that a lot of her points are actually about cognitive biases, racial inequality and socioeconomic inequality, rather than the data science used to enforce these. She herself acknowledges that some things (e.g., racial profiling) were happening to exactly the current degree long before data science was available.
Overall, I found her approach really shallow. She's a former tenured Ivy League math professor! I wanted her to write a book that only she could write: full of nuance and equations I needed a scratchpad to struggle through.
Nonetheless, I think some of her points were good: that machine-learning algorithms are dense and require supervision and critical thinking as to their results rather than blind trust. It's an important book for the math-phobic.
If you're looking for hard data or a deep exploration into mathematical algorithms, this book will disappoint. It is, however, an eye-opening, bird's-eye view of a field that is quietly taking over quite a few parts of our lives. I applaud the author for expressing such a high level of empathy for people whose plights she does not share, and for providing such a well-written overview that even the layperson can understand.
For those that are being introduced to this topic, I highly recommend this book (my only criticism is the term Weapons of Math Destruction - or WMD - itself, and how often it is overused within the book). If you are interested in learning more about the specific ways in which machine learning and mathematical algorithms are wreaking havoc in different parts of society, other books are better poised to teach on the details of those topics, such as The New Jim Crow and Automating Inequality.
Changed my mind with respect to various number-crunching instances, especially in cases where bias is baked into the institution developing the algorithms.
This book is a powerful critique and warning about the dangers of big data: how the use of algorithms at a broad scale throughout our society to inform hiring decisions, financial offerings, policing, etc. can increase inequality and ruin the lives of vulnerable individuals. As big data becomes more ubiquitous, this book provides a compelling argument for creating accountability and applying analyses in a thoughtful way to harness their potential for good and challenge their threat to do harm.
3.5 Stars. Nothing revolutionary, and a lot of the basic ideas are covered better by books like Automating Inequality, but I'm being a little generous, because this was one of the first books to start the conversation on this topic.
Informative but at times very dense with information. I really liked the many examples that were given.
A short book about how “WMDs” pose a great threat to society. The book actually makes some good arguments, and its subject is relevant to a thesis I'm writing on the use of machine learning in public policy, and I'm actually on board with the author's critique. However, I don't think the critique goes far enough. The problem is not the encroachment of mathematics in our lives, but the existing social and economic inequalities that are amplified by the use of sophisticated mathematical models. The author also offers no alternatives, and we can hardly step back from our data-intensive society. I may be overly harsh, however, as the alternatives posed by authors usually range from very useless to less useless.
EDIT: I must admit that I wrote the above before reading the concluding arguments, where the author (mostly adequately) addresses the above concerns. As such, I've revised my rating up to 4 stars. Recommended to all, including the non-technical reader.
I really enjoyed the book and I would definitely recommend it to others. I actually have been actively trying to get more people to read it.
The book demystifies big data and statistics and raises awareness about the topic. Through the chapters, Cathy shows how deeply intertwined it is with public policies and day-to-day opportunities like buying a car or an apartment, getting an affordable and good education for you or your children, even being stopped by the police on the sole premise of ethnicity.
I think this is a good first book about the subject. Most of the data is from the US and a bit from Europe, but the consequences are global, so it would be great to have more data from other countries as well.
It's interesting to read this book and think about the media trying to scare us about China's “oppressive social credit score” system.
Meanwhile we have a patchwork of far less transparent black box systems that control...
• if you get into college
• if you get offered a job
• if you get a mortgage
• if you get targeted by scam universities or scam credit systems
• if you get approved to rent a home
• if you get fired or promoted
• if you get stopped by the police
• if you get bail
• if you get a longer criminal sentence
• if you get probation
And more. Existing systemic bias is coded into these algorithms, resulting in a veneer of “science” and “objectivity” used to justify further systemic oppression.
Racist cops find more crime in poor non-white neighborhoods → algorithms designed to find “where crime might happen” take this garbage data and output garbage results → Cops further oppress these neighborhoods, locking up more poor people → An algorithm looks at the material conditions of a defendant and determines that since he's poor, his friends and family are poor too and have had run-ins with the law, and he has few professional prospects, he is likely to reoffend, so he gets a more stringent sentence.
This feedback loop reinforces our racist, classist criminal justice system while claiming to use “scientific, non-biased” tools. This is just one of the many examples of “big data” run amuck outlined in this book.
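To make that loop concrete, here is a toy sketch in Python. It is my own illustration, not something from the book: the function name simulate_feedback, the two neighborhoods, and every number in it are made-up assumptions. Both neighborhoods have the same true crime rate, but patrols are assigned according to past records, and crime only gets recorded where patrols are sent.

```python
def simulate_feedback(recorded, true_rate, patrols, rounds):
    """Return the recorded-crime counts per neighborhood after each round.

    Hypothetical toy model: every neighborhood shares the same true_rate,
    but patrols are allocated in proportion to past *recorded* crime.
    """
    history = [list(recorded)]
    for _ in range(rounds):
        total = sum(recorded)
        # Patrols follow past records, not the (unobserved) true rate.
        allocation = [patrols * r / total for r in recorded]
        # Crime is only recorded where officers are present to observe it.
        found = [a * true_rate for a in allocation]
        recorded = [r + f for r, f in zip(recorded, found)]
        history.append(list(recorded))
    return history


# Neighborhood 0 starts with more recorded crime purely because of
# biased historical policing; the true rates are identical.
history = simulate_feedback(recorded=[60.0, 40.0], true_rate=0.5,
                            patrols=100, rounds=5)
for step, counts in enumerate(history):
    share = counts[0] / sum(counts)
    print(step, counts, round(share, 2))
```

In this sketch the over-policed neighborhood keeps the same inflated share of recorded crime in every round, and the absolute gap between the two keeps widening, even though nothing about the underlying crime rates differs. The data never gets a chance to correct the initial bias.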
Many more involve companies leveraging big data to suck as much money out of poor people as they can possibly get away with. Because when we have a global economic system primarily driven by profit instead of helping people, the newest revolutionary technological tools will be used not to push humanity forward, but to suck up all our personal information to serve us targeted ads, many of which are designed to scam us.
Great book. Highly recommended.
Excellent review of a lot of cases where big data is failing us right now. O'Neil terms them Weapons of Math Destruction: they are the algorithms and filters and data crunching methods that help people make decisions on who to hire, who to fire, who to give a loan to and how much to charge you. They are oversimplified, non-transparent and static, and they usually end up being feedback engines that help the rich get richer and discriminate against the poor. Not that humans before them weren't terribly biased and greedy in their decision making process, but now it happens on a larger scale without us necessarily noticing, because everyone trusts algorithms, because algorithms are fair, right?
Decisions outsourced to big data will never be completely fair, the same way humans can never be completely fair. But raising awareness and having these discussions now is super important, so we learn how to fine-tune these tools so they'll be as fair and transparent as they can be.
O'Neil's chapter on micro-targeting of citizens with political ads on Facebook is very on-point for these days.
An interesting topic which deserves better treatment than a collection of Vox-style op-eds. This is not a book that wants to teach you how mathematical models can fail, it's a book that wants you to feel OUTRAGED about UNFAIRNESS.
Here's how it works. There's some area that's supposed to be improved by using a mathematical model (say, teacher evaluation in public schools). But after implementing this system there are some casualties (say, an unfairly fired teacher who was well-liked and respected both by students and parents), which is bad and leads to a lengthy discussion of the perils of capitalism.
Don't get me wrong, all things discussed in the book (which include recidivism, future job performance, and insurance) are indeed hard to model, but that's not a good way to discuss these models. One of the book's ideas is that you should forgo some of the model's accuracy to make it more fair. However, it's hard to talk about trade-offs without talking about how much accuracy and utility we have to begin with. Did this teacher evaluation model improve overall school performance? If it did, would it be fair to students to make them go back to their horribly unimproved previous school performance? Or was it actually not that bad, and their test results improved simply because of better lunches (or even less lead in water)?
The chapter on credit scores grudgingly admits that human curation wasn't perfect (painting the expected picture of a banker discussing credit with his golf partners). Skip ten pages, and there's a friendly woman who helps to clean up the mess made by an automated system that confused a client with a criminal namesake. Humans are winning again!
Except that they still have their own models, which are also bad (albeit in a different way). However, it is much easier to fix biases in algorithms and data if you're dealing with computers. One of the common complaints of the book is that computers can only project past data on the future, saving all those biases. It's not a problem that can't be fixed. Humans are.
More generally, it may be fun to complain about the issues of the model, but it's only useful to compare it to the alternatives. An implicit message of the book seems to be that we should ban usage of some algorithms and data (as expected, there's no discussion of second-order effects: if credit becomes more expensive, what will happen to the economy? Is this trade-off useful?). However, we can't simply ban things and forget about them; we can only replace them with something else.
I don't think that a book that is strictly about negative sides of something should necessarily strive to be objective. However, I would like to see fewer diatribes against greediness and more interviews with people who designed the models. What do they think about these problems?
(By the way, if you explain something by greediness, you're already wrong.)
Some quotes are amazing, though.
fairness is squishy and hard to quantify. It is a concept. And computers, for all of their advances in language and logic, still struggle mightily with concepts. They “understand” beauty only as a word associated with the Grand Canyon, ocean sunsets, and grooming tips in Vogue magazine. They try in vain to measure “friendship” by counting likes and connections on Facebook. And the concept of fairness utterly escapes them. Programmers don't know how to code for it, and few of their bosses ask them to.
But I would argue that the chief reason has to do with profits. If an insurer has a system that can pull in an extra $1,552 a year from a driver with a clean record, why change it?
I love how in-depth Cathy O'Neil goes about her journey from working in academia as a professor of mathematics to working at a hedge fund, and then leaving after the 2008 recession. I love how accessible the book is to a wide variety of audiences.
This, I think, is one of the most important books I've read this year. For, one cannot expect to grasp even the most sketchy outline of our socio-economic reality if one is not familiar with the now-prevailing currency, namely data.
A computer is good at doing things fast, really fast. So, when it errs, it errs like the flash, resulting in a gigantic accumulation of errors. It shouldn't be surprising that big data (a match made between statistics and computer science), with its inbuilt measures of inaccuracy paired with shortcomings in creating mathematical models that sufficiently mirror reality, will create tools of horrible injustice.
However, it is not always easy to notice. Technical difficulties and self-fulfilling feedback loops can deceive us quickly.
However, the writer herself has been deep in these systems and has seen these things up close. With her deep knowledge and a very conscientious mind, she is well equipped to discuss the matter in great depth and honesty.