Dear Commons Community,
Marc Parry has an article in The Chronicle of Higher Education examining the ups and downs of big data research. While there are various definitions of “big data”, the term generally assumes that the information or database system(s) used as the main storage facility can store large quantities of data longitudinally, down to very specific transactions, for subsequent use in studying human behavior. Depending upon the application, big data could involve capturing every keystroke relating to the focus or behavior of an inquiry.
Parry starts with David Lazer:
“In 2009, David Lazer sounded the call for a fresh approach to social science. By analyzing large-scale data about human behavior—from social-network profiles to transit-card swipes—researchers could “transform our understanding of our lives, organizations, and societies,” Mr. Lazer, a professor of political science and computer science at Northeastern University, wrote in Science. The professor, joined by 14 co-authors, dubbed this field “computational social science.”
This month Mr. Lazer published a new Science article that seemed to dump a bucket of cold water on such data-mining excitement. The paper dissected the failures of Google Flu Trends, a flu-monitoring system that became a Big Data poster child. The technology, which mines people’s flu-related search queries to detect outbreaks, had been “persistently overestimating” flu prevalence, Mr. Lazer and three colleagues wrote. Its creators suffered from “Big Data hubris.” An onslaught of headlines and tweets followed. The reaction, from some, boiled down to this: Aha! Big Data has been overhyped. It’s bunk.”
Big data is surely being hyped, as many other new technological approaches have been, only to come down to earth at some point. My opinion is that big data is a natural evolution of the decision support systems that have been developing for the last fifty years. Database technology has been evolving steadily as the use of the Internet has greatly expanded hardware and software data-capturing facilities. For researchers using big data techniques, one of the major problems, as Parry astutely mentions, is:
“The emerging problems highlight another challenge: bridging the “Grand Canyon,” as Mr. Lazer calls it, between “social scientists who aren’t computationally talented and computer scientists who aren’t social-scientifically talented.” As universities are set up now, he says, “it would be very weird” for a computer scientist to teach courses to social-science doctoral students, or for a social scientist to teach research methods to information-science students. Both, he says, should be happening.”
The disconnect between disciplines is a major issue, and there needs to be an integration of skills in order to realize the potential of big data research in human behavior. In addition, the cultural differences among disciplines have to be recognized. The latter is much more difficult to achieve.