David Brooks on Big Data… Again!

Dear Commons Community,

David Brooks devoted his column today to a discussion of the limits of big data and analytics. He had a similar column in February of this year. He discusses big data’s reliance on correlations and cautions that correlation is not causality. More specifically:

“In my columns, I’m trying to appreciate the big data revolution, but also probe its limits. One limit is that correlations are actually not all that clear. A zillion things can correlate with each other, depending on how you structure the data and what you compare. To discern meaningful correlations from meaningless ones, you often have to rely on some causal hypothesis about what is leading to what. You wind up back in the land of human theorizing.

Another obvious problem is that unlike physical objects and even animals, people are discontinuous. We have multiple selves. We are ambiguous and ambivalent. We get bored, and we self-deceive. We learn and mislearn from experience. Thus, the passing of time can produce gigantic and unpredictable changes in taste and behavior, changes that are poorly anticipated by looking at patterns of data on what just happened.

Another limit is that the world is error-prone and dynamic. I recently interviewed George Soros about his financial decision-making. While big data looks for patterns of preferences, Soros often looks for patterns of error. People will misinterpret reality, and those misinterpretations will sometimes create a self-reinforcing feedback loop. Housing prices skyrocket to unsustainable levels.

If you are relying just on data, you will have a tendency to trust preferences and anticipate a continuation of what is happening right now. Soros makes money by exploiting other people’s misinterpretations and anticipating when they will become unsustainable.”

Brooks concludes:

“One of my take-aways is that big data is really good at telling you what to pay attention to. It can tell you what sort of student is likely to fall behind. But then to actually intervene to help that student, you have to get back in the world of causality, back into the world of responsibility, back in the world of advising someone to do x because it will cause y.

Big data is like the offensive coordinator up in the booth at a football game who, with altitude, can see patterns others miss. But the head coach and players still need to be on the field of subjectivity.

Most of the advocates understand data is a tool, not a worldview. My worries mostly concentrate on the cultural impact of the big data vogue.”

As someone who has been following the evolution of big data and learning analytics, I am glad to see that Brooks is bringing some of the discussion to the popular press. I know that my colleagues here at the Graduate Center and at other colleges and universities are likewise struggling to understand the potential of big data analysis.

