A Wayne State University School of Medicine-Perinatology Research Branch researcher won an international computational biology competition. The models he developed to win the challenge demonstrated how transcriptomics information present in the blood can predict whether people have been exposed to specific toxicants.
Adi Tarca, Ph.D., associate professor of Obstetrics and Gynecology and director of the Bioinformatics and Computational Biology Unit at the National Institutes of Health's Eunice Kennedy Shriver National Institute of Child Health and Human Development Perinatology Research Branch at WSU, won the Systems Toxicology Computational Challenge. A field of 135 scientists from around the world took part in the competition.
The Systems Toxicology Computational Challenge is the latest to run under the Systems Biology Verification IMPROVER umbrella, a crowdsourcing initiative led and funded by Philip Morris International and designed to test and verify scientific methods and results.
This year's IMPROVER challenge was designed to determine whether blood mRNA expression can be used to assess exposure to smoking in humans, and whether a common set of genes can predict this outcome in both human and mouse models.
Participating teams around the world were provided with gene expression microarray data generated by the organizers and their collaborators. The prediction performance of the models was evaluated on data from a new set of subjects whose exposure to smoking was concealed from the participants, while the organizers were blind to the identity of the participants during the ranking of the teams.
The models developed by the Perinatology Research Branch team, led by Dr. Tarca, ranked first in the first sub-challenge and tied for second in the second sub-challenge. The team's model displayed 100 percent sensitivity and 93 percent specificity when discriminating between smokers and non-current smokers, and 65 percent sensitivity and 57 percent specificity when discriminating between former smokers and those who never smoked.
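For readers unfamiliar with the two metrics, sensitivity is the share of actual smokers a model correctly identifies, and specificity is the share of non-smokers it correctly identifies as such. The minimal Python sketch below shows the arithmetic. The helper name `sensitivity_specificity` and the labels are made up for illustration; they are not the challenge data or the team's code.

```python
def sensitivity_specificity(y_true, y_pred, positive="smoker"):
    """Return (sensitivity, specificity) for a two-class prediction.

    y_true and y_pred are equal-length sequences of class labels;
    any label other than `positive` is treated as the negative class.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative sample")

    sensitivity = tp / (tp + fn)  # true positives among actual positives
    specificity = tn / (tn + fp)  # true negatives among actual negatives
    return sensitivity, specificity


# Illustrative labels only (hypothetical, not from the challenge):
# four smokers, all detected; five non-smokers, one misclassified.
truth = ["smoker"] * 4 + ["non-smoker"] * 5
predicted = ["smoker"] * 4 + ["smoker"] + ["non-smoker"] * 4

sens, spec = sensitivity_specificity(truth, predicted)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy case every smoker is detected, so sensitivity is 1.0, while one of five non-smokers is mislabeled, so specificity is 0.8.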
"The predictive modeling strategy developed by the WSU PRB team has direct applications in predicting any clinical outcome from high-dimensional data, and the main computational lessons will be published in collaboration with other best performing teams," Dr. Tarca said. "In addition, the 12-gene model developed by the team can be used in any transcriptomics studies to provide a granular measure of exposure to smoking, and demonstrated that exposure to smoking needs to be considered in the design of transcriptomics studies to ensure drawing accurate inferences."
The challenge sought to verify whether markers could be extracted from blood gene expression data that would distinguish tobacco smokers from non-smokers, and then separate non-smokers into former smokers and those who never smoked. This question was addressed in two sub-challenges, the first looking at human data only and the second investigating human and mouse data together. Participants' anonymized submissions were scored against a gold-standard dataset. Final results and team rankings were reviewed and approved by an independent expert scoring review panel.
Participants were successful in developing models with a high level of predictive performance in distinguishing current tobacco smokers from non-smokers. Predicting whether non-smokers were former smokers or never smokers was more challenging, suggesting that these two groups are likely to have similar gene expression profiles.
The challenge gave participants the opportunity to rigorously and objectively test their methodologies.
"We were driven by the desire to create a model that can both lead to valuable biological insight, and be implemented in practice at the lowest possible cost," Dr. Tarca said. "The Systems Toxicology Computational Challenge has allowed us to test the quality of our research, and I'm delighted that our approach has proved to be robust."