This guest post was written by Ann Brown, director of strategic communications at Virginia Tech University Libraries.
Virginia Tech undergraduate students are getting a taste of real-world data analysis that makes a difference as the globe grapples with the COVID-19 pandemic.
The White House Office of Science and Technology Policy (OSTP) issued a call to action asking data scholars across the country to use their expertise in artificial intelligence and data mining to help COVID-19 researchers keep up with the rapidly growing body of research on the pandemic.
Anne M. Brown and Jonathan Briganti, faculty members at University Libraries at Virginia Tech, challenged undergraduate data students in the library’s DataBridge program and the Bevan & Brown Lab to jump in, build tools, and look for patterns and trends in the research data using machine learning and molecular modeling.
The OSTP call to action said the purpose is “to develop new text and data mining techniques that can help the science community answer high-priority scientific questions related to COVID-19.”
Brown and Briganti tapped Daniel Chen, a Ph.D. student in genetics, bioinformatics, and computational biology who works under Brown, to lead the undergraduates as they work remotely yet collaboratively on this challenge. The students involved range from first-year students to seniors and come from the departments of biochemistry, biological sciences, computational modeling and data analytics, and geography.
One team of students is exploring how molecular modeling can be applied to understanding the biology and druggability of proteins associated with COVID-19. These students are searching the Protein Data Bank for all protein structures associated with COVID-19 and annotating a database of potential structures for future experiments. The team is then optimizing a protocol to screen those drug targets against hundreds of known drug molecules.
The other team is mining text from literature sources, popular media, and other outlets to connect the research questions they have generated to the coronavirus pandemic. Most of the team’s work involves natural language processing of published texts. Their research questions and tasks include examining the simultaneous presence of two chronic diseases or conditions alongside risk factors for COVID-19, and identifying best practices and challenges in medical care to prevent the spread. Team members are also using natural language processing to examine changes in sentiment in published news articles, the state of the economy, and how quarantine measures are affecting air quality.
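For readers curious what looking for two conditions appearing together can mean in practice, here is a minimal sketch in Python. It is not the students' code. The condition list, the sample sentences, and the function names are all hypothetical, and simple word matching stands in for the fuller natural language processing the team describes.

```python
from collections import Counter
from itertools import combinations
import re

# Hypothetical vocabulary of chronic conditions (an assumption, not the project's list).
CONDITIONS = ("diabetes", "hypertension", "asthma", "obesity")


def conditions_in(text):
    """Return the set of listed conditions mentioned in one text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return {c for c in CONDITIONS if c in words}


def cooccurrence_counts(texts):
    """Count how many texts mention each pair of conditions together."""
    counts = Counter()
    for text in texts:
        found = sorted(conditions_in(text))  # sorted so each pair has one key
        counts.update(combinations(found, 2))
    return counts


# Made-up sentences standing in for published abstracts.
SAMPLE_TEXTS = [
    "Patients with diabetes and hypertension had longer hospital stays.",
    "Hypertension was common among those with diabetes.",
    "Asthma alone was not linked to worse outcomes.",
    "Obesity, diabetes, and hypertension were reported together.",
]

if __name__ == "__main__":
    for pair, n in cooccurrence_counts(SAMPLE_TEXTS).most_common():
        print(pair, n)
```

Counting pairs per document is about the simplest way to surface conditions that tend to be discussed together. A real analysis would likely also need to handle synonyms, negation, and much larger collections of text.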
Brown said this work is about bringing what students have learned from all of their academic experiences to bear on a global challenge and seeing how their knowledge and teamwork can make a difference.
“Our first outcomes and purpose are really student-based—how can our students take what they are learning in their classes, in their research experiences in our group, and apply it to something happening in real-time,” said Brown. “Students are discussing their results across discipline boundaries, which is important for developing transdisciplinary collaboration skills for the future.”
Collaboration is key while working remotely during the quarantine.
“The students are learning how to work in a collaborative environment where data and code can be shared across everyone in the team,” said Chen. “In some sense, the quarantine measures enforce these collaboration techniques since we can no longer hold in-person office hours. The work we’re doing is all open and public, and the skills they’re using for this work are the same set of skills employed in other open-source projects, big and small, as well as companies.”
Chen said the mechanics of the work are of the greatest value to the data students.
“Working openly, and using GitHub as a focal point for project management, and working asynchronously with data and code updates are the biggest skills the students are learning that will carry on with them as this project continues and for their future careers,” said Chen.
Brown said that the White House call to action was irresistible to their team. They couldn’t pass up the opportunity.
“We saw the opportunity to use the White House call to action and all of the publicly available data as both a way to engage students in experiential learning while remote, while also working on something extremely relevant,” said Brown. “Hopefully, they take this experience and skill set with them in their careers and can be the change makers the world needs.”