Bringing Bioinformatics to Boilermakers

Biochemistry and bioinformatics expert Dr. Peter Pascuzzi teaches Purdue researchers how to use open-source and web-based tools to dig deeper into their research data.

When hearing about what Purdue University Libraries Assistant Professor Peter Pascuzzi does at Purdue and how he helps graduate-student and faculty researchers, one is reminded of the inventive plot twists often found in episodes of “The Simpsons.” His career chronicle has many intriguing turns like those portrayed in each tidy tale of Homer’s world. And then, so satisfyingly, it all makes perfect sense when you get to the episode’s end.

Purdue Libraries Assistant Professor Peter Pascuzzi

Pascuzzi presenting to students at a Fall 2018 orientation in the Wilmeth Active Learning Center (Home of the Library of Engineering and Science). Photo by Lindsey Organ

Pascuzzi, who studied biology and chemistry as an undergraduate, earned his Ph.D. in biochemistry at Cornell. Naturally, as a Libraries faculty member in life sciences, his subject areas at Purdue include biochemistry, bioinformatics, medicinal chemistry, molecular biosciences, and molecular pharmacology.

His expertise is not only in biochemistry, but also in bioinformatics. He teaches researchers how to use web-based and open-source tools to better analyze and understand their research data. His CellMiner Companion application is an example of this. According to Omicstools, where his application is available, the tool “enables researchers to explore the output of CellMiner queries. The data from multiple files is summarized, assembled into a single data matrix, z-score normalized, clustered, and visualized both as a heatmap and dendrogram.” [See image below.]
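
For readers curious what those steps involve, the assembly and z-score normalization stages can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the CellMiner Companion code: the function names, the gene and cell-line labels, and the expression values are all hypothetical, and the clustering and heatmap stages are left out.

```python
from statistics import mean, stdev


def assemble_matrix(tables):
    """Merge per-file tables of the form {gene: {cell_line: value}}
    into one matrix with a row per gene and a column per cell line.

    Assumes every gene ends up with a value for every cell line;
    a missing value raises KeyError rather than being guessed.
    """
    merged = {}
    for table in tables:
        for gene, row in table.items():
            merged.setdefault(gene, {}).update(row)
    genes = sorted(merged)
    cell_lines = sorted({c for row in merged.values() for c in row})
    matrix = [[merged[g][c] for c in cell_lines] for g in genes]
    return genes, cell_lines, matrix


def zscore_rows(matrix):
    """Z-score each row: subtract the row mean, divide by the row's
    sample standard deviation. Constant rows become all zeros."""
    normalized = []
    for row in matrix:
        m, s = mean(row), stdev(row)
        normalized.append([(x - m) / s if s else 0.0 for x in row])
    return normalized


# Hypothetical stand-ins for two downloaded query files.
file_a = {"GENE_A": {"line1": 1.0, "line2": 2.0, "line3": 3.0}}
file_b = {"GENE_B": {"line1": 10.0, "line2": 10.0, "line3": 40.0}}

genes, cell_lines, matrix = assemble_matrix([file_a, file_b])
z = zscore_rows(matrix)
for gene, row in zip(genes, z):
    print(gene, [round(v, 2) for v in row])
```

Normalizing each gene across cell lines puts genes with very different absolute expression levels on a common scale, which is what makes the later clustering and heatmap comparison meaningful.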

“With the work I do here at Purdue, I really want to make an impact on the science, but more importantly, having been a graduate student, I have a lot of empathy with people who get stuck in a place because they don’t have the data skills they need. So I have always made a lot of effort to understand what the graduate students need, and that is what motivates a lot of the teaching I do,” Pascuzzi explained. “Many people I teach are new to bioinformatics. But I can look back over a few years now, and from that experience, I can surmise about half of them will go on to do their own bioinformatics, and it will really help them in their research. I’m not saying they wouldn’t publish without it, but I know they are doing more of the work themselves and that they are more qualified because of what they learned.”

Image Courtesy of Peter Pascuzzi. “A gene expression pattern for 21 transporter genes was retrieved from CellMiner and visualized with CellMiner Companion.” Image is figure from “CellMiner Companion: an interactive web application to explore CellMiner NCI-60 data.” (Image Link to NIH website)

Plants, Paths, Plots, and Projects

Pascuzzi, who started college at 26, began his academic career studying to be a plant scientist.

“I always tended to make these weird angles. I fell in with a great botany professor and then good chemistry professors. Later, I transferred to a new school and got in with a good genetics professor. I went to Cornell for plant science—I was in the biochemistry program, but working in a plant pathology lab at the Boyce Thompson Institute, which surprised my department,” he explained. “Then I collaborated on a structural biology project, which involved working with another lab at Cornell. After that, I went to N.C. State for a post-doc in plant genomics, but I had to learn bioinformatics to understand our data. The interest in bioinformatics brought me to Purdue Libraries.”

While here, his work has been varied, too. He recalls one project on which he worked with a faculty member in the vet school, helping her take data from the National Center for Biotechnology Information and get it into a format she could use.

“For me, it was very simple; for her it seemed impossible. In many ways, how I helped her is, 100 percent, the work that libraries do. There is a public resource out there, the resource is information or data, and I show them how to work with it,” Pascuzzi said.

While people have complimented Pascuzzi on his CellMiner Companion tool, which he used to develop a visualization plot, or heatmap, of drug-treatment data on cancer cells, he points out that what really matters is what individual researchers can do with data.

“Generating a plot, from publicly available data on cancer cells, isn’t revolutionary. What is revolutionary is that I was able to do it myself—and I am able to teach just about anybody how to do something like that,” Pascuzzi said. “The technology has moved so quickly, data access has moved so quickly, that projects like that have become trivial. A decade ago, that would have been a major project. You would have approached computer science students, and then write to someone to get access to the data. Now it is just all out there.”

Purdue University’s Elizabeth Tran, an associate professor in biochemistry, is another faculty member whom Pascuzzi has helped over the years. She said his expertise has contributed to the continued funding of her work, as well as critical bioinformatics training and instruction for her and her graduate students.

“He taught my students how to code, both from his R/Bioconductor course [BCHM 695 Introduction to Bioconductor and R] and through one-on-one assistance,” she noted.

Tran added that she and Pascuzzi have collaborated on several research projects since he has been at Purdue.

“Our research is focused on the role of RNA unwinding enzymes in gene expression. Not surprisingly, we were faced with the challenge of needing genome-wide studies of gene expression differences between the pathways we were investigating. I reached out to Pete, and he was able to help us use published data sets to compare to results we had generated with RNA sequencing. This resulted in a grant renewal for my laboratory, with Pete as an essential collaborator, and a publication for one of my graduate students,” she explained.

Pascuzzi has continued working on projects in Tran’s laboratory, including helping another one of her graduate students with cutting-edge “next gen” studies to identify binding sites for the RNA helicases on RNAs.

Distinguished Professor in the Purdue University Department of Nutrition Science James Fleet pointed out that Pascuzzi’s unique perspective and skill set bridge the traditional roles of the library, “i.e. information management and analysis, with an important area of modern biology, bioinformatics and big data analysis.”

“I came to know Pete through an educational program funded by the National Institutes of Health’s ‘Big Data to Knowledge’ program. This program funded projects to provide data-analysis training to traditional biomedical researchers. (This was a unique, nationally competitive grants program, and Purdue was one of only about a dozen places to receive funding from the program.) I had heard about Pete’s skills as a bioinformatician and an educator, and I knew that he was the piece we needed to round out our team,” Fleet explained. “His contribution to our course was necessary for its success. In addition, he has been instrumental in establishing core bioinformatics and data management/analysis courses for the Biochemistry Department.”

Biochemistry Assistant Professor Vikki Weake said Pascuzzi’s influence on student learning and success is clear.

“Pete and I worked together on some RNA-seq studies in Drosophila, and he helped mentor one of my graduate students, Jingqun Ma, so that she could learn how to analyze her data. These studies were published in the journal G3. Jingqun is now a bioinformatician,” she said.

When it comes to research data, Weake not only touted Pascuzzi’s bioinformatics expertise, but she also noted that Purdue Libraries’ Purdue University Research Repository, or PURR, is a tremendous resource for Purdue faculty.

“Data management and archiving are becoming increasingly important in the life sciences, and my lab team members have used PURR extensively to archive data sets associated with our published studies,” she said. “This is really important, as other researchers have access to the raw data, so they can replicate our analyses and results. The National Institutes of Health have recognized that we need efforts to improve rigor and reproducibility in biomedical science, and services that make raw data freely available are a great way for labs to be transparent about the work that they are doing. Ideally, other groups should be able to take our data and replicate our findings, or, if new knowledge becomes available, they might use our data to gain novel insight into a biological process.”

Pascuzzi, like his faculty colleagues in the Purdue Libraries, provides Purdue faculty with invaluable instructional and research support, often supplying key resources, tools, and insights that help them make great leaps in their learning, information discovery, and research studies.

“My niche has always tended to be helping others. Libraries are highly service, education, and learning oriented,” Pascuzzi said. “I have tried to go all in on that. It’s what we do.”