2 Chapter Summaries
Chapter 1 introduces models as heuristic tools that formulate responses from data inputs. The author illustrates the idea with baseball, where managers, statisticians, and scouts feed hundreds if not thousands of variables into computer algorithms. The algorithms then fit linear regressions to find which variables are most strongly related to success, whether that is measured in strikeouts, home runs, on-base percentage, or sheer wins. She then uses her own life as an example of both the efficiency of models and their drawbacks. When it comes to making dinner, there is simply too much information for a human brain to hold when building a suitable model for a task as dynamic as feeding children. Many of the inputs can also be shaped subconsciously by personal biases, which not only make the model less accurate but in certain settings can even be illegal. These include, but are not limited to, racial, sexual, socioeconomic, ethnic, and religious biases. It is therefore important to be cognizant of such biases and to refine models continually so that they remain accurate and effective. She states that:
“Models are opinions embedded in mathematics.” This is important because it summarizes both the advantages and disadvantages of model building: models can be highly accurate and mathematically rigorous, but they can also be skewed by the human choices involved in their creation.
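The baseball example can be made concrete with a short sketch. The code below is only an illustration under stated assumptions: the team statistics and win totals are made-up numbers, not real data, and the function names are hypothetical. It ranks candidate variables by how strongly they correlate with wins and then fits a straight line to the strongest one. Note where the "opinion" enters: someone had to choose which statistics to include and to decide that wins are the measure of success.

```python
from math import sqrt


def _centered_sums(xs, ys):
    """Return the means and the centered sums needed for correlation and OLS."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return mean_x, mean_y, sxy, sxx, syy


def correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    _, _, sxy, sxx, syy = _centered_sums(xs, ys)
    return sxy / sqrt(sxx * syy)


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mean_x, mean_y, sxy, sxx, _ = _centered_sums(xs, ys)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical season statistics for five teams (invented for illustration).
candidate_stats = {
    "on_base_pct": [0.300, 0.310, 0.320, 0.330, 0.340],
    "home_runs": [150, 210, 140, 200, 170],
}
wins = [68, 77, 79, 86, 90]  # the modeler's chosen definition of "success"

# Pick the variable most strongly related to wins, then fit a line to it.
best_stat = max(
    candidate_stats,
    key=lambda name: abs(correlation(candidate_stats[name], wins)),
)
best_r = correlation(candidate_stats[best_stat], wins)
slope, intercept = fit_line(candidate_stats[best_stat], wins)

print(best_stat, round(best_r, 3), round(slope, 1), round(intercept, 1))
```

A real front office would use far more variables and far more seasons, but the structure is the same: the data determine the fit, while people determine what is measured and what counts as winning.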
There are many examples of flawed models that work against their own goals. One example is the LSI-R, or Level of Service Inventory-Revised. It consists of a questionnaire, filled out by inmates, that aims to collect data for predicting the likelihood of recidivism. Its results also aid judges in determining sentences, among other things. Unfortunately, many of the questions do not measure the individual’s own likelihood of repeating the crime. Instead, the questionnaire classifies inmates by demographic and circumstantial factors such as the number of siblings in prison, the number of police encounters, drug and alcohol usage, and socioeconomic status. While the model is correct that these variables are associated with crime rates, classifying people by these trends and then imposing harsher penalties on that basis only strengthens the relationship. This creates a positive feedback loop that perpetuates structural violence, primarily against African-American men.
Chapter 2 provides a brief synopsis of the career choices that led the author to where she is today. After teaching at Barnard College for many years, she left to work at a hedge fund where she could use her skills in the real world. It was there that she was exposed to her first WMDs. Models built to analyze the risk of mortgage defaults and translate it into the viability of mortgage-backed securities failed because they could not anticipate future events (the sudden, simultaneous collapse of the housing market, as opposed to sparse foreclosures here and there). The models also failed to take into consideration the blatant corruption of the banking system: rating agencies selling AAA ratings in exchange for fees, Wells Fargo dealing high-interest loans to minority populations in exchange for fees, and brokers selling a mortgage to any person who wanted one and lying about the ability to refinance in exchange for fees (there seems to be a pattern). All of the practices centered on the mortgage bond market seemed to assume that the industry was “too big to fail,” and the participants thus did anything to reap a profit from it. However, when millions of homeowners defaulted on loans and everything went belly-up, the entire system collapsed. It was then that Cathy O’Neil realized her role in the financial system and decided to leave it for good to uncover the WMDs that led to its demise.
Chapter 3 provides an empirical example of the damage a WMD can inflict at scale. The model U.S. News uses to rank colleges across the United States shows that its results (whether right or wrong) led to a nationwide transformation in the goals and agenda-setting of college administrators. Officials began investing millions of dollars in top professors and building entire university halls for rather small departments; some even went as far as falsifying data such as SAT scores, freshman retention and graduation rates, and alumni donations. All of this led to a hyper-competitive atmosphere in which colleges (many of them prestigious) contended with one another to climb a ranking that in no way, shape, or form accurately portrayed the success or prestige of the university. The chapter introduces the idea of building a model from proxies: identifying variables that are correlated with the quantity of interest, which cannot be measured directly, and using them to determine the model’s output. The result is that institutions game the proxy to receive better outcomes, which in itself reduces the effectiveness of the proxy. Regardless of the metric used in these rankings, it can almost always be gamed, which undermines the effectiveness of the model.
Chapter 4 reveals the harsh world of for-profit colleges and their reliance on the predatory practices of online advertising. Institutions like the University of Phoenix spend hundreds of millions of dollars annually targeting low-income and minority students. Using the internet as a data mine, they can obtain vast amounts of information about an individual’s likelihood of accepting government loans or attending college, as well as socioeconomic status, self-esteem, and so on. With these data points they can decide whether a prospect is worth the money it would take to recruit. These universities spend an average of over $2,000 per student on marketing alone, while that same student receives the equivalent of only $800 in spending on the quality of their education. These practices are predatory in nature and are fueled by the sheer size of the population. Enrolling even just 0.001% of the student population translates into outrageous profits when each student is spending upwards of $70,000 annually in tuition and fees. These machine learning algorithms are not efficient by nature, but they have the potential to be when coupled with the vast amounts of internet data they have at their fingertips to learn from. O’Neil says it best when she claims that:
“Math, in the form of complex models, fuels the predatory advertising that brings in prospects for these colleges.”
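The arithmetic behind the Chapter 4 figures can be worked through in a few lines. This is a rough sketch, not a calculation from the book: the per-student spending figures, the 0.001% enrollment rate, and the $70,000 tuition come from the summary above, while the size of the prospect pool is a hypothetical assumption chosen only to make the numbers concrete.

```python
from fractions import Fraction

# Figures taken from the summary above (treated as round numbers).
MARKETING_PER_STUDENT = 2_000       # dollars spent on marketing per student
INSTRUCTION_PER_STUDENT = 800       # dollars spent on instruction per student
ENROLL_RATE = Fraction(1, 100_000)  # 0.001% expressed exactly
TUITION_PER_YEAR = 70_000           # dollars per enrolled student per year

# ASSUMPTION: the pool size is hypothetical and does not appear in the source.
PROSPECT_POOL = 20_000_000

# How many marketing dollars are spent for every instruction dollar.
marketing_to_instruction = Fraction(MARKETING_PER_STUDENT, INSTRUCTION_PER_STUDENT)

# Students enrolled and yearly tuition revenue under the assumed pool size.
enrolled = PROSPECT_POOL * ENROLL_RATE
yearly_revenue = enrolled * TUITION_PER_YEAR

print(float(marketing_to_instruction), int(enrolled), int(yearly_revenue))
```

Under these assumptions, a conversion rate that looks negligible still yields a nine-figure prospect list's worth of targeting paying for itself many times over, which is why scale matters so much to this kind of advertising.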
Chapter 5 describes the prison-industrial complex and the WMDs that support it. The over-policing of Black and Latino men in poor neighborhoods has been fueled by predictive policing models like PredPol. The intentions behind these models are not bad, but in practice they tend to undercut the original goal of decreasing crime rates. The reasoning behind their implementation is to decrease crime, mainly in high-crime neighborhoods. They do this by mapping where crime happens geographically and then placing more patrol units nearby to deter it. This technique has proven successful in many cities, where there has been a considerable decrease in violent crime. However, Cathy O’Neil points out that it has led to a tendency to over-police certain areas and to cross boundaries in ways that increase arrests. Michael Bloomberg’s “Stop and Frisk” policies led to a substantial reduction in homicide, but they also led to an explosion in arrests among Black and Latino populations. These models, when coupled with human judgment, tend to result in the use of force to eliminate rather than reduce crime. This is effective in a narrow sense, since arresting people who have a non-zero chance of committing a crime does reduce future crime, but it also leads to an increase in recidivism down the road. Larger prison populations mean more people are stigmatized by society, which prevents them from accessing adequate jobs. This in turn blocks social mobility and forces them to resort to crime to survive day to day. While many models are efficient at predicting future crime waves, the implementation of policing measures almost always relies on human judgment, which clouds effectiveness through cognitive biases. Thus, O’Neil advocates a process of building trust between police and the policed. By decreasing tensions between the groups, one can hope that crime rates will decrease.
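The feedback loop described in this chapter can be sketched as a toy simulation. This is a deliberately simplified illustration and not PredPol’s actual algorithm: the neighborhood names, the starting counts, and the rule of sending the single patrol wherever recorded crime is highest are all assumptions made for the example. Both neighborhoods have the same true crime rate, but crime is only recorded where police are present to observe it.

```python
def simulate_patrols(recorded, true_rate, rounds):
    """Greedy toy model of data-driven patrol allocation.

    Each round, the one available patrol goes to the neighborhood with the
    most recorded crime. Only the patrolled neighborhood adds new incidents
    to the record, at its true rate. Returns the final recorded counts.
    """
    recorded = dict(recorded)  # copy so the caller's data is not modified
    for _ in range(rounds):
        target = max(recorded, key=recorded.get)
        recorded[target] += true_rate[target]
    return recorded


# Hypothetical starting data: neighborhood A has one more recorded incident.
start = {"A": 11, "B": 10}
# Both neighborhoods have the SAME underlying crime rate per round.
rate = {"A": 5, "B": 5}

result = simulate_patrols(start, rate, rounds=10)
print(result)
```

In this sketch a one-incident difference in the historical record sends every patrol to neighborhood A, so A’s recorded crime keeps climbing while B’s record never changes. The data then appear to confirm that A is the more dangerous area, even though the two are identical by construction.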
Chapter 6 highlights the unseen prejudices that certain algorithms may carry when it comes to job applications. Recently, there has been extensive research on emotional intelligence and personality, particularly in the field of industrial/organizational psychology. Big companies are paying top dollar for models that can predict some of these traits in job prospects because they believe that traits like empathy, teamwork, self-control, and social awareness can determine an applicant’s likelihood not only of being successful at the company but also of staying there for a while. Many of the models these companies rely on, however, can be predatory and prone to racial and sexual discrimination. Having algorithms weed out applicants who are deemed a liability or a red flag on the basis of personality fit could amount to using a medical examination to screen candidates, which is a violation of the Americans with Disabilities Act of 1990. Many of these models have been adapted from human decision-making to streamline hiring. That technique alone reproduces human discrimination, but with breathtaking efficiency. Despite this, many companies are creating models that account for human discrimination and implement fairness methods to prevent the exclusion of certain populations.
Chapter 7 is about the grim practice of businesses using data modeling to create hyper-efficient schedules for their employees. Accurately predicting the optimal demand for labor under given conditions allows businesses to reduce costs and maximize profits. This process hurts employees the most because it often results in last-minute scheduling changes as new data becomes available. O’Neil uses the term “clopening,” which refers to an employee closing the shop at night and opening it again the next morning. This can lead to sleep deprivation and a litany of other negative side effects, but it is efficient and profitable for the firm. The pressure often begins at the corporate level. Managers at store locations can be reprimanded if labor budgets exceed certain thresholds, which forces their hands to create grueling schedules for their workers. This often results in employees being worked every minute of a shift to maximize revenue per worker. The chapter even carries an underlying critique of capitalism writ large. It claims we need countervailing forces willing to expose corporate injustices related to worker discrimination and exploitation, of which WMDs are just one aspect.
Chapter 8 illustrates a problem that many Americans face each day: acquiring credit. Many companies collect large amounts of consumer information and sell it to larger data corporations. There are multiple examples of flawed data collection in which people with the same name were conflated. This has resulted in people being rejected for job interviews, credit applications, federal housing assistance, and more. To make matters worse, corporations have done little to correct misleading information, which forces the person affected (the product in all of these transactions) to sit down with another human (usually the person running the background check for a loan, FHA, or job application) and sort all of this out. This takes time and money and can often resolve the issue with only one data company, leaving many of the others with the same faulty information. When automated systems are tasked with collecting data to complete a demographic profile of someone, they often classify that person in a way that can create a positive feedback loop. People who often shop for used cars online may be the same people who receive online advertisements for payday loans, which only fuels the poverty cycle. Algorithms have been credited with automatically adjusting credit rates based on shopping patterns, which can directly lead to someone having greater trouble paying bills on time, thus making them poorer.
Chapter 9 begins with a short anecdote about Frederick Hoffman’s potent WMD mistake in 1896, when he published a paper analyzing the risk of insuring African Americans on the grounds that their lives were precarious. He failed to account for the structural factors that contributed to this, such as education, poor housing, sanitation, and crime rates in these neighborhoods. This gave rise to the practice of redlining, which deeply segregated America for the next century. It is an example of statisticians failing to distinguish causation from correlation in their models. Insurance companies often use consumer data to influence premiums for car insurance. At first, this may seem ridiculous, given that insurance companies should be focused on how safe a driver someone is and not on what they buy at the supermarket. Yet when insurers can use consumer data to justify higher premiums, they are able to squeeze every dollar from the consumer and maximize profits; paired with the fact that consumer data is readily accessible and cheap, it is a no-brainer. Employers are now even beginning to roll out wellness plans. These consist of mandatory health goals (punishable by increased premiums if one does not participate) aimed at improving the overall risk of the pool and thus lowering healthcare costs for the firm. These plans have become WMDs in their own right: they are not correlated with decreased health expenses for the company, and they rest on data points of little statistical significance, including Body Mass Index and the number of daily steps.
Chapter 10 discusses the implications of WMDs for civic life and the devastation they can wreak on the U.S. political system. Facebook has proven its ability to influence political elections simply by prioritizing different content in the feeds of its users. This can make users feel more inclined to vote in an election and can even affect whom they vote for. Machine learning systems have also been used to gather data about potential voters in order to extract as much money from them as possible, which is especially useful with those who have already decided whom they are voting for. There is even a growing use of WMDs to spread misinformation. The book references an incident in 2015 in which an antiabortion group spread doctored images of a stillborn to portray Planned Parenthood as a terrorist organization. All of this shows that certain individuals (and a small number of them, too) can be targeted by models to influence national elections. When most elections come down to a few key swing states, the voters who determine those outcomes can be worth tens of thousands of dollars to these data collection services, especially in the next election, so it is of utmost importance for their services to be correct and efficient. Yet this in itself poses a major threat to the legitimacy of our democracy.