Purdue Libraries and School of Information Studies News

PURR Integration with Globus: A Big Opportunity for Big Data at Purdue

May 9th, 2023

Purdue University Libraries, School of Information Studies, and Purdue IT’s Rosen Center for Advanced Computing (RCAC) are pleased to announce that Globus, a fast and reliable service for large-scale data transfer, is now integrated into the Purdue University Research Repository (PURR). The integration is a significant step forward in supporting larger-scale research data sharing at Purdue. It allows data to be transferred from Data Depot to PURR, reliably facilitating larger-scale data sharing and publication. It provides greater transfer capacity and a connection between the high-capacity, secure data storage service and Purdue’s institutional data repository. It also helps Purdue’s many grant-funded research projects comply with federal mandates for sustainable data sharing. The Data Depot and PURR/Globus integration are available to researchers at all Purdue University campuses.

Over the past decade, the U.S. Office of Science and Technology Policy (OSTP) has released two memos requiring federal funding agencies to implement policies specifying that extramural researchers supported through grants must maximize data sharing and sustainability, specifically by providing free and open public access. The most recent OSTP memo, from August 2022, states that funding agencies must “update their public access policies…no later than December 31st, 2025, to make publications and their supporting data resulting from federally funded research publicly accessible without an embargo on their free and public release.” This requirement will significantly affect researchers, who are encouraged to plan proactively for how they will satisfy these mandates. When submitting proposals for federal grant awards, researchers must include a data management and sharing plan that addresses how they will handle, disseminate, and sustain access to their data and related materials. The PURR/Globus transfer option gives Purdue researchers a reliable, fast, and secure way to publish their data with PURR to fulfill this new federal requirement.

Here’s how. Globus is faster than Secure File Transfer Protocol (SFTP), and it monitors the data transfer process to mitigate problems when network strength is limited. Because Data Depot is widely used by Purdue researchers and their collaborators, Globus lets them transfer data through a web interface, which offers users more flexibility. PURR, in turn, provides a publishing platform for Purdue researchers and their collaborators, allowing them to share their data openly and at no cost to the user.

Because PURR is an institutional research core with local support, researchers can easily use it to publish and share data. PURR is compliant with federal funder requirements for open sharing. In fact, compliance is central to PURR’s continued mission to provide researchers with support throughout the research process and a trusted means to share research data, with the added advantage of PURR’s high-quality preservation methods.

PURR began publishing data in 2012. Over ten years later, it continues to grow as an integral service for Purdue researchers. PURR currently hosts over 1,525 published data sets and 2,223 research projects, and provides sharing and preservation support to many grant-funded projects, most commonly those funded by the National Science Foundation (NSF). PURR is also working toward CoreTrustSeal certification, a global certification based on a universal catalog of requirements that reflect the characteristics of trustworthy data repositories. Currently, PURR is forming partnerships with additional units across campus to develop a comprehensive suite of support services for Purdue researchers facing new challenges arising from the NIH Data Management and Sharing Policy and the August 2022 OSTP memo. The PURR/Globus integration is an important step.

“Globus is already available to the Purdue community in conjunction with the Rosen Center for Advanced Computing,” Reid Boehm, research data system manager at PURR, says. “Together, we have opened a door to Data Depot users who wish to deposit data in PURR for publication and sharing. We’ve started in connection with Data Depot, but we hope to continue partnering with RCAC to enable future options for transfer.”

Boehm says researchers interested in taking advantage of the new integration will find it fairly easy, though getting started involves some requirements and collaboration. First, researchers must already be using Data Depot, and they must already have a project created in PURR. Once both conditions are met, they submit a ticket to PURR for the large file transfer. More information on this process is available online.

“To get help creating a project, researchers can visit our guide,” Boehm says, “or get in touch with us at purr@purdue.edu. The more we are able to raise awareness of the repository and demonstrate its potential benefits to researchers, the more we can share the overarching possibilities for data sharing with our diverse communities.”