Harvard and M.I.T. Release De-identified Data on MOOC Students!

Dear Commons Community,

As reported in The Chronicle of Higher Education, “de-identified” records of more than a million people who took part in the first year of massive open online courses offered by Harvard University and the Massachusetts Institute of Technology have been released to researchers, the two institutions said on Friday.

The institutions said the records had been “subjected to a careful process of de-identification: removing personally identifiable information, using best practices including aggregation, anonymization via random identifiers, and blurring to reduce individuality of sensitive data fields, among other techniques.”

“By sharing these de-identified data, we hope to show that we can protect information about individuals while still enabling replicable research about what works in online learning,” said Andrew Ho, an associate professor at the Harvard Graduate School of Education. Mr. Ho and Isaac Chuang, a professor in MIT’s departments of electrical engineering and computer science and of physics, were the lead researchers in the effort to release data from the courses, offered through the two institutions’ edX platform.


I have just downloaded the file and accompanying documentation, and it appears relatively easy to use. It is in Excel spreadsheet format and can easily be converted to SPSS. There are about twenty variables for each student record, including country of origin, gender, and course enrollment.
This might be fertile ground for researchers interested in MOOC technology.


On the surface, this appears to be a good move on the part of Harvard and M.I.T.

