Scientists at the University of California, Irvine have developed a new deep-learning framework that predicts gene regulation at the single-cell level.
Deep learning, a family of machine-learning methods based on artificial neural networks, has revolutionized applications such as image interpretation, natural language processing and autonomous driving. In a study published recently in Science Advances, UCI researchers describe how the technique can also be successfully used to observe gene regulation at the cellular level. Until now, that process had been limited to tissue-level analysis.
According to co-senior author Xiaohui Xie, UCI professor of computer science, the framework enables the study of transcription factor binding at the cellular level, which was previously impossible due to the intrinsic noise and sparsity of single-cell data. A transcription factor (TF) is a protein that controls the transcription of genetic information from DNA to RNA; TFs regulate genes to ensure they’re expressed in proper sequence and at the right time in cells.
“The breakthrough was in realizing that we could leverage deep learning and massive datasets of tissue-level TF binding profiles to understand how TFs regulate target genes in individual cells through specific signals,” Xie said.
By training a neural network on large-scale genomic and epigenetic datasets, and by drawing on the expertise of collaborators across three departments, the researchers were able to identify novel gene regulatory relationships in individual cells or cell types.
“Our capability of predicting whether certain transcriptional factors are binding to DNA in a specific cell or cell type at a particular time provides a new way to tease out small populations of cells that could be critical to understanding and treating diseases,” said co-senior author Qing Nie, UCI Chancellor’s Professor of mathematics and director of the campus’s National Science Foundation-Simons Center for Multiscale Cell Fate Research, which supported the project.
He said that scientists can use the deep-learning framework to identify key signals in cancer stem cells — a small cell population that is difficult to specifically target in treatment or even quantify.
“This interdisciplinary project is a prime example of how researchers with different areas of expertise can work together to solve complex biological questions through machine-learning techniques,” Nie added.