Who should hold the reins?

Recently I went to my first meetup (a group activity among strangers, organized through the Meetup website). It focused on data science in Chicago and was organized by Mike Stringer from Datascope Analytics. Brett Goldstein, the Chief Data Officer of Chicago, presented, and I have to say it was a wonderful talk and experience.

The talk primarily focused on how the City of Chicago is harnessing the vast amount of data available in a big city, and on its efforts to open that data up to everyone in the city. As an almost data scientist (it seems as appropriate a descriptor as biologist, four years into my PhD. Ah, how things change; just no one tell the granting agencies.), I greatly appreciate the city’s commitment not only to analyzing real data to help improve services but also to making it publicly available. I appreciate it both as a scientist who relies on the data of others and as a Chicago resident.

I find it odd that, given how government has been going, I immediately question whether citizens have the right to this data, but I quickly remind myself that as a City of Chicago taxpayer this is more or less my data to begin with. I think that this alone is an important and healthy step in the relationship between citizens and government. It enables us to use this data not only to help ourselves (the parking reminders to move your car for street cleaning are an excellent example) but also to keep watch over how services, and therefore tax money, are utilized (snowplow trackers). The push for participation from citizens is also great, because the accessibility of this data without any prior requirements enables projects that benefit everyone living in Chicago.

However, the biggest question that I have, after the warm fuzzies from government turning over a new leaf fade away, is this: at what point should this data analysis dictate policy decisions? Can it even do that? It should come as no surprise that I am far more in favor of giving weight to data analysis in making decisions than I am to legislators advancing their own political agendas. But a world where everything that is predicted to improve life in the city, and is capable of being funded, actually gets done will never come to fruition.

For me, it’s simple to say that if there is negligible monetary impact, data analysis should take charge of decision making once the method has been vetted. This could be as simple as the case of the snow plows, where the most trafficked streets and surrounding areas are done first. This requires time and effort in the optimization (since it is a balance between coverage of important streets and efficient use of plow time), but it should be doable. The harder part comes when there are large monetary costs associated with a choice and the modeling efforts are not as clear cut. In the snow plow case, if a new set of plow routes is deployed when it snows and it turns out not to be better for some unforeseen reason, then it isn’t a big deal to change the routes the next time it snows.

However, if 200 million dollars are spent expanding the CTA train lines and it turns out that adding yet another spoke to our existing hub-and-spoke model is ineffective (ask any Chicago resident: about three weeks after moving here, everyone concludes that this is a ridiculous setup), then that is a sunk cost that is impossible to reclaim. I think the best thing to do is to not let the analysis dictate policy, but to still let it have a voice. We make many assumptions that we think are best but that may not be that good (that highways should go directly into city centers, a paradigm of the 1950s that now results in obscene traffic), and data analysis serves as a powerful check against them. It also raises the possibility that, with increased use, our knowledge and experience will advance and allow us to rely on this analysis more and more.

Despite this limitation, I truly do believe that it is in our best interests as a society to move away from the model where policy decisions are dictated as much by whim as by (in)accurate information, toward one where objective analysis plays an important role. The transition to objective analysis influencing decisions may be imperfect and marked with missteps, but that should not deter us from the path.

— Adam Pah