High-performance computing (HPC) resources are helping a University of Iowa researcher understand how geography and evolution interact.
The Helium computing cluster, administered by Information Technology Services at the UI, is enabling evolutionary biologist Trina Roberts to analyze how geographic factors affect the process of evolution – for example, how living on an island versus a mountain, or a warm place instead of a cold one, could over time influence specific characteristics of a species, such as size.
“Knowing how geographic factors impact evolution could help scientists predict how human changes to a landscape will affect evolution, and also what effect global climate change might have,” said Roberts, associate director of the UI Museum of Natural History.
Roberts collected DNA samples from 150 treeshrews from Southeast Asia – some from museum collections, and some through field work in Cambodia. (For the record, treeshrews aren’t true shrews; they’re one of primates’ closest relatives, and they resemble squirrels with pointy noses.)
With collaborators at Yale University and the University of Alaska, Roberts sequenced the DNA samples. Now, with HPC, they are using the assembled DNA data to determine evolutionary relationships. Those relationships help them understand patterns in geography and morphology.
“To me, it’s an interesting basic science question because we don’t know how many species there are, or what they are,” Roberts said. “Estimates on the number of species in one species complex have ranged from three to 60. We believe the reality is somewhere in between. The fact that there has been so much confusion was a good clue that there is an interesting evolutionary story behind this.”
Roberts feeds the genetic data into the high-performance computing cluster, which uses modern algorithms to determine the evolutionary relationships among the treeshrews. With specialized software, it matches up DNA sequences to determine where each treeshrew fits into the family tree. Roberts uses the information to create maps and identify patterns in the animals’ characteristics and habitats.
“This kind of analysis is not something you would want to do by hand,” she said.
The number of possible relationships among 50 individuals (or species) is greater than 10 to the 70th power, and the number among 150 is so large that it is effectively infinite for practical purposes. Roberts can’t look at all the possibilities, but she needs to search through enough of them to have a good chance of finding the right answer, and then double-check it. With multiple processors, HPC allows her to run many computations in parallel.
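To give a sense of where numbers like these come from, here is a minimal Python sketch. It assumes that "possible relationships" means distinct unrooted, strictly branching (bifurcating) trees, which is a standard way of counting in phylogenetics but is not spelled out in the article. Under that assumption, the count for n individuals is the product of the odd numbers from 1 up to 2n − 5. The function name is illustrative only and does not come from any software Roberts uses.

```python
def count_unrooted_trees(n_taxa: int) -> int:
    """Count distinct unrooted bifurcating trees for n_taxa labeled tips.

    Assumption (not from the article): trees are unrooted and strictly
    bifurcating, so the count is (2n - 5)!! = 1 * 3 * 5 * ... * (2n - 5).
    """
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa for an unrooted tree")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5 (empty range when n == 3).
    for odd in range(3, 2 * n_taxa - 4, 2):
        count *= odd
    return count


if __name__ == "__main__":
    for n in (5, 10, 50, 150):
        trees = count_unrooted_trees(n)
        # Report the number of decimal digits rather than the full integer.
        print(f"{n} taxa: a tree count with {len(str(trees))} digits")
```

Python integers have no fixed size limit, so the counts are exact. The count grows faster than exponentially as individuals are added, which is why an exhaustive search is out of reach and a sampling-style search on many processors is the practical option.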
With HPC, Roberts can also store vast amounts of data. She also appreciates the speed and reliability. More than once in her career, she has experienced the frustration of a traditional computer crashing several days into an analysis.
“I’ve done a lot of analyses that would have taken months on a laptop. It’s hard to keep your computer on for that long, and it really ties up and slows down the machine,” she said. “Having HPC that you can trust to keep running for the entire analysis is great, and you can do the analysis in hours rather than weeks.”
Illustration: Trina Roberts is pictured with a small fruit bat (genus Cynopterus) during field research in Cambodia. Roberts is an evolutionary biologist whose research involves treeshrews and bats. The map, which illustrates possible treeshrew species and climate variables, was created with the help of information generated by high-performance computing.
Treeshrew photo provided by Anthony Cramp/Creative Commons.