Despite COVID-19 restrictions on in-person campus operations, undergraduate students continue to further research in Biochemistry faculty labs. In particular, computational work is well-suited to being conducted remotely.
This brand of telecommuting, coined “tele-science” by Associate Professor Aaron Hoskins and carried out largely by undergraduate students across the department, is vital to research in biochemistry.
“All of my undergraduates are given their own, independent research projects – all of which are exciting and make contributions to my lab and the field at large,” said Hoskins. “I don’t think COVID-19 has changed this; however, it has allowed us to prioritize computational projects for the moment.”
Students conducting this type of research play a key role in taking large quantities of data and making that data accessible to scientists and researchers who need the information, but may not have the computer programming background necessary to gather and analyze it effectively.
Yichen Sun, a rising senior majoring in statistics, works in the Hoskins lab. Members of the lab, in particular Drs. Harpreet Kaur and Clarisse van der Feltz, have developed a new way to evaluate the structures of complex cellular machines like the ribosome, spliceosome, or even the COVID-19 RNA-dependent RNA polymerase. But to make their method more generally useful, they needed a way for it to be used by people who are not computer programming experts.
“That’s where Yichen’s summer tele-science project comes in,” said Hoskins. “Using Python and his computer programming skills, he has created a graphical user interface (GUI) that allows anyone to carry out these evaluations.”
Sun is writing a computer program that calculates the surface areas of intermolecular interactions between proteins from Protein Data Bank (PDB) files. These files contain information comprising RNA chains, and provide calculations on interactions within those chains. The aggregate information is then used to create network-theory-based models of the structure.
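To give a rough sense of what a network-theory-based model can look like, here is a minimal Python sketch. It is not the Hoskins lab's program: the function names, the `min_area` cutoff, and the idea of storing interface areas in a dictionary keyed by chain pairs are all assumptions made for illustration, and the step that computes areas from a PDB file is left out entirely.

```python
from collections import defaultdict


def build_contact_network(interface_areas, min_area=0.0):
    """Build an undirected, weighted graph from pairwise interface areas.

    interface_areas: dict mapping (chain_a, chain_b) -> interface area.
    min_area: hypothetical cutoff; pairs at or below it are treated as
              not interacting.
    """
    network = defaultdict(dict)
    for (chain_a, chain_b), area in interface_areas.items():
        if area > min_area:
            # Store the edge in both directions so the graph is undirected.
            network[chain_a][chain_b] = area
            network[chain_b][chain_a] = area
    return dict(network)


def degree(network):
    """Count how many partners each chain contacts in the network."""
    return {chain: len(partners) for chain, partners in network.items()}
```

In a model like this, each chain is a node and each interface is an edge weighted by its area, so simple graph measures such as the number of partners per chain can be read off directly.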
The Hoskins lab is also interested in understanding how mutations in the spliceosome can change its function and cause diseases like cancer. “DNA sequencing of tens of thousands of tumor samples have revealed many different mutations in the spliceosome that are frequently associated with one cancer or another,” said Hoskins. “What we need is an easy way to map these mutations onto cryo-EM structures of spliceosomes.”
And that’s where Lukas Voigts comes in. Voigts, a third-year student majoring in biochemistry, is working on a project mapping these mutations. Specifically, he’s mapping thousands of these mutations – found in 33 different types of cancers – onto spliceosome structures. He then uses PyMOL to create a viewing platform that allows anyone to easily see how these mutations can impact spliceosome structure.
Sarah Fahlberg is a rising sophomore studying biochemistry and computer science. She works in Assistant Professor Phil Romero’s lab, which studies the design principles of proteins and how they can be applied to engineer new molecular functions. Her project aims to understand how different methods of encoding protein sequences can be used for machine-learning-driven protein engineering.
Better methods mean more efficient processes for researchers studying the data. Protein sequences are long chains of amino acids, each of which is indicated by a letter. Computers need numbers to perform computations, so the protein sequences need to be encoded accordingly. Fahlberg’s project examines how well different protein encoding methods perform so that researchers can use machine learning to better predict sequences, and ultimately learn more about their questions by doing fewer experiments.
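As an illustration of what turning letters into numbers can mean, the Python sketch below shows two common textbook encodings: integer encoding and one-hot encoding. The article does not say which methods Fahlberg compares, so these are assumed examples, and the function names are hypothetical.

```python
# One-letter codes for the 20 standard amino acids, in alphabetical order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def integer_encode(sequence):
    """Replace each amino acid letter with its position in AMINO_ACIDS."""
    return [AA_INDEX[residue] for residue in sequence.upper()]


def one_hot_encode(sequence):
    """Replace each letter with a 20-element vector containing a single 1."""
    encoded = []
    for index in integer_encode(sequence):
        vector = [0] * len(AMINO_ACIDS)
        vector[index] = 1
        encoded.append(vector)
    return encoded
```

For example, `integer_encode("MKV")` gives `[10, 8, 17]`, while `one_hot_encode("MKV")` gives three vectors of length 20. One-hot encoding avoids implying that amino acids with nearby index numbers are similar, at the cost of a much larger representation.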
When campus shut down most on-site research operations earlier this year, these students, and their peers across the department, were able to steadily continue contributing to research.
The arrangement is mutually beneficial, providing the lab with important research, and the students with hands-on experience, professional connections, and continued learning opportunities. Both technical and more nuanced types of learning have continued to progress for students off-site during the pandemic, thanks in part to the attentive support of mentors and colleagues, with whom students connect formally and informally through virtual lab meetings and via chat platforms.
Fahlberg in particular cited enthusiasm for her developing skills in navigating the programming side of software development, as well as considering the needs of the end user; the constituents on these opposite ends of the software spectrum often speak different technical languages and exist in two different technical worlds, which she now bridges.
Sun has been learning Python, a programming language critical for a data scientist; the experience of working with Python has broadened his thinking about potential careers: “I’d been thinking of only data science or consulting, but now bioinformatics looks pretty interesting.”
Voigts’s work with PyMOL and R “will be incredibly useful regardless of the area of study” he pursues.
And each of these projects continues to move research forward in the department, pandemic or no. Sun’s GUI will be part of the research manuscript the Hoskins lab will submit on this project later this summer. Voigts’s work will be released as a resource for the RNA community at the end of the summer.
“These computational projects are just as critical and important for our scientific research as would be purifying a protein or synthesizing a new ligand,” said Hoskins. “Overall, I think our dynamic is as good as can be expected, but we all eagerly await the day when we can be in lab together!”