Dear Commons Community,
Randall Stross, professor of business at San Jose State University, had a piece in yesterday’s New York Times on computer scoring of essays, specifically the kind used on standardized tests in K-12 schools. He wrote:
“This spring, the William and Flora Hewlett Foundation sponsored a competition to see how well algorithms submitted by professional data scientists and amateur statistics wizards could predict the scores assigned by human graders. The winners were announced last month — and the predictive algorithms were eerily accurate.
The competition was hosted by Kaggle, a Web site that runs predictive-modeling contests for client organizations — thus giving them the benefit of a global crowd of data scientists working on their behalf…
The essay-scoring competition that just concluded offered $60,000 as a first prize and drew 159 teams. At the same time, the Hewlett Foundation sponsored a study of automated essay-scoring engines now offered by commercial vendors. The researchers found that these produced scores effectively identical to those of human graders.
Barbara Chow, education program director at the Hewlett Foundation, says: “We had heard the claim that the machine algorithms are as good as human graders, but we wanted to create a neutral and fair platform to assess the various claims of the vendors. It turns out the claims are not hype.”
If the thought of an algorithm replacing a human causes queasiness, consider this: In states’ standardized tests, each essay is typically scored by two human graders; machine scoring could replace one of the two. And humans are not necessarily ideal graders: they provide an average of only three minutes of attention per essay, Ms. Chow says.
We are talking here about providing a very rough kind of measurement, the assignment of a single summary score on, say, a seventh grader’s essay, not commentary on the use of metaphor in a college senior’s creative writing seminar.”
The article goes on to make the case that, if nothing else, the software could be a cost-effective way of scoring these types of essays by eliminating at least one of the human scorers.