Big Data Meet the Humanities!

Dear Commons Community,

The New York Times has an article today describing the use of big data approaches to support research in the humanities and the social sciences. It specifically describes the use of big data to do large-scale literary analysis. The article opens by asking who the leading (English-language) novelists of the 19th century are. Would they be Charles Dickens, Thomas Hardy, Herman Melville, Nathaniel Hawthorne and Mark Twain? The article then goes on to describe a study by Matthew Jockers using big data techniques.

“… a recent study has found, Jane Austen, author of “Pride and Prejudice,” and Sir Walter Scott, the creator of “Ivanhoe,” had the greatest effect on other authors, in terms of writing style and themes.

These two were “the literary equivalent of Homo erectus, or, if you prefer, Adam and Eve,” Matthew L. Jockers wrote in research published last year. He based his conclusion on an analysis of 3,592 works published from 1780 to 1900. It was a lot of digging, and a computer did it.

The study, which involved statistical parsing and aggregation of thousands of novels, made other striking observations. For example, Austen’s works cluster tightly together in style and theme, while those of George Eliot (a k a Mary Ann Evans) range more broadly, and more closely resemble the patterns of male writers. Using similar criteria, Harriet Beecher Stowe was 20 years ahead of her time, said Mr. Jockers, whose research will soon be published in a book, “Macroanalysis: Digital Methods and Literary History” (University of Illinois Press).

… At this stage, this kind of digital analysis is mostly an intriguing sign that Big Data technology is steadily pushing beyond the Internet industry and scientific research into seemingly foreign fields like the social sciences and the humanities. The new tools of discovery provide a fresh look at culture, much as the microscope gave us a closer look at the subtleties of life and the telescope opened the way to faraway galaxies.

“Traditionally, literary history was done by studying a relative handful of texts,” says Mr. Jockers, an assistant professor of English and a researcher at the Center for Digital Research in the Humanities at the University of Nebraska. “What this technology does is let you see the big picture — the context in which a writer worked — on a scale we’ve never seen before.”

Interesting and worth a read!!

Tony

 
