3 Final Review
While the actual coding of the software may not be inherently flawed, the data that goes into the software may be flawed leading to skewed outcomes or WMDs that can lead to exclusionary practices. This book was an interesting read through every chapter. I particularly enjoyed the separation in issues among each chapter. Ranging from problems in higher education to the 2008 financial crisis, it truly shows that big data plays a major role in all of our lives. I most enjoyed chapter 9 which focused on the use of WMDs in the insurance industry. It blew my mind when I read:
“And in Florida, adults with clean driving records and poor credit scores paid an average of $1,552 more than the same drivers with excellent credit and a drunk driving conviction.”
I think that it is absolutely ridiculous that insurance companies can get away with such arbitrary practices in their businesses. An insurance company charging extravagant prices for the coverage of something that is in no way tied to the individual’s behavior seems outrageous and counterintuitive. However, it makes sense when you realize that they are using these models solely as a means to milk every dollar they can out of the consumer. I agree with her analysis in the conclusion of the book where she states that it is difficult to combat WMDs when they all feed off of each other. Poor people have bad credit in bad neighborhoods and are shown ads for bad schools and can’t get good jobs which all lead to increased crime and recidivism rates. It is a positive-feedback loop that results in a perpetuation of structural violence. Despite this, I do not think that the system is irredeemable. I think that it is surely possible to refine models to remove discriminatory conclusions on people and instead seek answers through an objective lens. The solution to many of these WMDs is the objective of the problem. Instead of using models to push poor people out of good jobs, use the algorithms to identify the poorer areas that need job growth. The same could be said for auto insurance. Instead of increasing premiums to cover for losses, invest money into cleaning up the streets or rebuilding the roads to allow for safer transit. There are ways to go about using large machine learning systems for the good without the negative drawbacks of flawed assumptions. When we were working with models in class we often would attempt to boost the model to improve efficiency. However, many times we did not go back to analyze the actual data in the spreadsheets to identify potential cases of multicollinearity, serial correlations, or flawed variables that did not belong in our model. Going back to chapter one, Cathy O’Neill claims that:
“Models are opinions embedded in mathematics”, this is more true than ever before and it is up to data scientists to make those opinions as objective as possible.