In January 2011, the National Science Foundation began requiring that all grant proposals include two-page plans that describe what data will be generated in the research and how the data will be managed and shared. Other funding agencies such as the National Institutes of Health, NASA, and the National Endowment for the Humanities soon followed suit with their own requirements.
The Purdue University Research Repository (PURR) was created to help researchers meet these requirements by providing a platform for collaborating on research and for publishing and archiving datasets.
Examples of research data include software source code, output from sensors and instruments, interview transcripts, observation logs, spreadsheets, databases, scientific images and video, and more.
Purdue faculty, graduate students, and staff can create projects on the PURR website, invite others to join their projects, and receive a free allocation of storage and tools for helping them collaborate and manage their research data.
“Scholars often publish their findings in conference and journal papers, but without the supporting data, the research can’t be reproduced and verified by others,” says Courtney Matthews, Digital Data Repository Specialist at the Purdue Libraries. “PURR gives Purdue researchers a platform for managing and publishing their datasets in a way that meets funder requirements and enables the reuse of data that gives credit to the researcher.”
PURR also provides boilerplate text that can be pasted into grant proposals, as well as tutorials and support for developing effective data management plans.
Since its launch, PURR has been included in over 500 grant proposals that have originated from Purdue.
Datasets that are published and archived in PURR are assigned Digital Object Identifiers (DOIs) that uniquely identify them and make them easier to track and cite. David Gleich, an assistant professor of Computer Science, recently used PURR to publish a dataset for testing algorithms in social network analysis. “DOIs make it easy to track citations, usage, and other metrics,” says Gleich. “It’s always important to be able to demonstrate [research] impact.”
For eight years, agronomy Professor Jeffrey Volenec and colleagues collected data from ninety-six farm plots to better understand how potassium and phosphorus levels influence the growth of alfalfa. With the study over, the question became what to do with all that data.
That concern prompted Volenec to be one of the first users of PURR. “It’s unlikely to be done anytime soon by anyone else so we thought this type of data ought to be preserved,” Volenec says. “It was bought mainly with tax dollars. The data, the numbers, belong to the people.”
Datasets are archived for a minimum of ten years, after which time they are managed as a collection of the university’s libraries. PURR was designed to implement open standards and best practices such as the ISO 16363 certification of trustworthy digital repositories, for which an audit process is currently underway.
PURR was jointly developed by the Purdue Libraries, the Office of the Vice President for Research, and Information Technology at Purdue. The service is based on HUBzero™, which was also developed at Purdue.