Purdue Libraries and School of Information Studies News

Publishing Open Access and Transformed Datasets

October 24th, 2019

Oct. 21-27, 2019, is International Open Access Week. This is part of a series — written by Purdue faculty and staff — that demonstrates the benefits of open access scholarly publishing. For the entire series, visit

by Sandi Caldrone, Data Repository Outreach Specialist

Publishing open access data requires imagination. When I review datasets submitted for publication in the Purdue University Research Repository (PURR), I try to put myself in the shoes of a scholar hoping to reuse this dataset, and I try to imagine every question the scholar might have. When you share your data with the world, you open it up to new possibilities—possibilities that are hard to anticipate.

On November 10, 1981, French philosopher Gilles Deleuze gave a lecture on cinema at a Paris university. When he prepared his notes for class that day, he could not have known that a student’s audio recording of that lecture, along with dozens of his other lectures, would eventually find its way to the French National Library, and from there to PURR, where anyone can download it to hear his words or text mine the transcriptions.

When Deleuze gave this lecture a little less than 40 years ago, that tape recorder was the most advanced technology in the room. Now, digital humanities students can plug his words into online tools that spin out word clouds, bubble charts, and network graphs. That’s why data curators are always pushing for richer descriptions of data. We want to give future researchers everything they might need to conduct analyses we can’t even imagine yet.

The cycle of imaginative reuse doesn’t have to take 40 years. In PURR, we’re already starting to see second-generation open access data: open access data that has been combined, transformed, and republished as a new open access dataset.

As it was in Deleuze’s classroom, it is students who are in the vanguard.

In 2019, PURR has started to see examples of student-faculty collaborations in which students collect data from various open access datasets and put in the labor required to prepare those data for analysis. By publishing their transformed data, they give other researchers the opportunity to pick up where they left off and push scholarship forward, instead of reinventing the wheel. See two excellent examples:

It’s hard to imagine what students might do with data 40 years from now, but I’m really looking forward to finding out.

Explore the Purdue University Research Repository at

Learn more about Purdue’s Open Access resources, including Purdue e-Pubs, Purdue’s open access digital repository, at