Recently, the Illinois Department of Children and Family Services (DCFS) decided to discontinue a predictive analytics tool meant to predict whether children were likely to die from abuse, neglect, or other physical threats within the next two years. Director BJ Walker reported that the program was both 1) failing to predict actual child deaths, including the homicide of 17-month-old Semaj Crosby in April of this year, and 2) flagging far too many children, assigning some a probability of death at or above 100%. In matters of abuse, neglect, and the deaths of children, false positives (Type I errors) and false negatives (Type II errors) carry enormous consequences, which makes putting these tools into production with unstable error rates all the more fraught.
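The tension between the two failure modes can be made concrete with a small confusion-matrix calculation. All of the counts below are invented purely for illustration; they are not DCFS or Eckerd data.

```python
# Hypothetical confusion matrix for a risk-screening tool.
# Every count here is invented for illustration only.
true_positives = 40     # flagged, and harm actually occurred
false_positives = 3000  # flagged, but no harm occurred (Type I errors)
false_negatives = 10    # not flagged, but harm occurred (Type II errors)
true_negatives = 96950  # not flagged, and no harm occurred

# Precision: of the children flagged, how many were real cases?
precision = true_positives / (true_positives + false_positives)

# Recall (sensitivity): of the real cases, how many did the tool catch?
recall = true_positives / (true_positives + false_negatives)

print(f"precision = {precision:.3f}")  # ≈ 0.013: most alerts are false alarms
print(f"recall    = {recall:.3f}")     # 0.800: yet 20% of real cases are missed
```

Because the harmful outcome is (thankfully) rare, even a tool that catches most real cases can bury caseworkers in false alarms, which is exactly the dynamic described above.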
The algorithm used by the DCFS is based on the Rapid Safety Feedback program, the brainchild of Will Jones and Eckerd Connects, a Florida-based non-profit dedicated to helping children and families. First applied in Hillsborough County in 2012 by what was then Eckerd Youth Alternatives, the predictive software ingested data and records about children’s parents, family history, and other agency files. Factors fed into the algorithm included whether a new boyfriend or girlfriend was in the house, whether the child had previously been removed for sexual abuse, and whether the parent had previously been a victim of abuse or neglect. From these and many other factors, the software assigned each child a score from 0 to 100 representing how likely it was that a death would occur in the next two years. Caseworkers were alerted to children with high risk scores and, with proper training and knowledge, could intervene with the family. In Hillsborough County, the program appeared to be a success: child deaths fell after its implementation. The program’s author and director acknowledged that they cannot attribute the decrease to the program with certainty, as other factors could be at play, but the county did see fewer child deaths, and that is a good outcome.
Since then, the program has drawn attention from other states and agencies looking to improve child welfare. One of them was Illinois. There, however, the program assigned more than 4,000 children a greater-than-90% probability of death or injury, including 369 children under the age of 9 with a 100% probability. Caseworkers are trained not to act solely on these numbers, but the count was plainly, unreasonably high. This flood of positive matches brings the conversation to the impact of false positives on welfare and families. A false positive means a caseworker could intervene and potentially remove the child from the parents. If abuse or neglect were not actually happening and the algorithm was wrong, the mental and emotional impact on the family can be devastating: the intervention not only tears the family apart physically but can traumatize it emotionally. In addition, trust in child services and the government agencies involved would deteriorate rapidly.
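A quick sanity check shows why those reported numbers should have raised alarms on their own. If the scores were well-calibrated probabilities, the expected number of deaths or serious injuries among high-scoring children would have to roughly match reality. The two counts below are the figures reported from Illinois; the expected-events arithmetic is a sketch of the calibration argument, not any actual DCFS analysis.

```python
# Reported Illinois figures (per the article above):
flagged_over_90 = 4000  # children scored above a 90% probability
flagged_at_100 = 369    # children under 9 scored at a 100% probability

# If a 90%+ score were a calibrated probability, each such child would
# contribute at least 0.9 expected events over the two-year window:
min_expected_events = flagged_over_90 * 0.9

print(min_expected_events)  # 3600.0
```

A calibrated model would thus imply thousands of deaths or serious injuries statewide within two years, far beyond any plausible count, so the scores could not have been meaningful probabilities.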
On top of the high false positives, the program also failed to predict two high-profile child deaths this year, those of 17-month-old Semaj Crosby and 22-month-old Itachi Boyle. As Director Walker said, the predictive algorithm wasn’t predicting things. The impact of these Type II errors, real dangers the software never flagged, hardly needs to be spelled out.
Beyond the dilemmas with the algorithm itself, DCFS caseworkers also complained about the language of the alerts, which was harsh: “Please note that the two youngest children, ages 1 year and 4 years have been assigned a 99% probability by the Eckerd Rapid Safety Feedback metrics of serious harm or death in the next two years.” Eckerd acknowledged that the language could have been better, which raises another topic of discussion: communicating data science results well. The model may spit out a 99% probability, but when we are dealing with such sensitive and emotional subjects, the wording of the alerts matters. Even if the numbers were entirely accurate, figuring out how to apply such technology within the actual practice of child protective services is another problem altogether.
When it comes to deploying data science tools like predictive software in government agencies, how small must these error rates be? How big is too big to implement at all? Is failing to predict one child death enough to render the software a failure, or is it better than having no software at all and missing more? Is devastating some families by mistake, in the search for those who are actually mistreating their children, worth saving those in need? Do the financial costs of the software outweigh the benefit of some predictive assistance, and if not, how do you measure the cost of losing a child? Is having the software helpful at all? As data science and analytics become more widely applied in social services, these are the questions many agencies will be trying to answer. And as more and more agencies look to take proactive, predictive steps to better protect their children and families, these are the questions that data scientists should be tackling in order to better integrate these products into society.