To date, much has been said about the promise of quantum computing for a myriad of applications, but there have been few examples of a quantum advantage for real-world problems of practical interest. This might change with a new study from the USC Center for Quantum Information Science & Technology at the Viterbi School of Engineering and the USC Dana and David Dornsife College of Arts, Letters and Sciences. Researchers Richard Li, Rosa Di Felice, Remo Rohs, and Daniel Lidar have demonstrated how a quantum processor could be used as a predictive tool to assess a fundamental process in biology: the binding of gene regulatory proteins to the genome. This is one of the first documented examples in which a physical quantum processor has been applied to real biological data. The research was conducted on a D-Wave Two X machine at the USC Information Sciences Institute.
Certain sequences of DNA make up genes, which are the “instructions” for making proteins that do most of the heavy lifting within a cell. However, in response to its molecular environment, a cell may need to have more or less of a certain protein to carry out its function. This complex process of controlling the production of proteins is known as gene regulation. The proteins that regulate which genes are expressed are known as transcription factors (TFs). To carry out their function, TFs need to be able to find and attach themselves to specific locations in the genome.
Overall, it is not yet entirely clear how TFs identify the small fraction of functional binding sites in the genome among many almost identical but non-functional sites. More comprehensive knowledge of DNA transcription and protein formation is critical for scientists to better understand how mutations in proteins, the building blocks of our bodies, lead to disease.
“Quantum computers might help shed light on this process,” said the study’s co-corresponding author Daniel Lidar.
“We chose to attack the problem using machine learning implemented on a D-Wave quantum annealer, in order to test our ability to translate complicated real-life biology problems to the setting of quantum machine learning, and to look for any advantages this approach might offer over more conventional, yet state-of-the-art classical machine learning techniques,” Lidar added.
A key step in the transcription of DNA is the binding of a protein. However, the binding event happens only when certain conditions are met: a particular sequence of the letters of the DNA alphabet (adenine, thymine, guanine and cytosine) must be present, and it must sit at the right location on a strand of DNA, known as a binding site. A possible binding site is functional in less than one percent of cases, says the study’s other co-corresponding author, Rohs, a professor of biological sciences, chemistry, physics, and computer science who is also a faculty member in the new USC Michelson Center for Convergent Bioscience.
Chemistry PhD candidate Richard Li, computational nano/bio physicist Rosa Di Felice, quantum computing expert and Viterbi Professor of Engineering Daniel Lidar, and computational biologist Remo Rohs sought to apply machine learning to derive models from biological data that predict whether certain DNA sequences represent strong or weak binding sites for a particular set of transcription factors. The patterns and models learned by the quantum processor were then applied to estimate the binding strength of a series of sequences for which it was unknown whether a protein would bind to them. The algorithm they developed specifically for the D-Wave Two X quantum annealing machine led to predictions that were in agreement with real-world experimental data.
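To give a sense of what classifying sequences as strong or weak binders looks like in code, here is a minimal classical sketch. It is not the researchers' algorithm and does not run on a quantum annealer. The one-hot encoding, the perceptron learner, and every sequence and label below are assumptions made up for illustration.

```python
# Illustrative only: the sequences, labels, and feature scheme are
# hypothetical and are not taken from the study.
BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a flat list of 0/1 features, four per base."""
    features = []
    for base in seq.upper():
        if base not in BASES:
            raise ValueError(f"unexpected base: {base!r}")
        features.extend(1 if base == b else 0 for b in BASES)
    return features

def train_perceptron(samples, labels, epochs=20, lr=1.0):
    """Fit a linear classifier; labels are +1 (strong) or -1 (weak)."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            if y * score <= 0:  # misclassified: nudge the decision boundary
                weights = [w + lr * y * xi for w, xi in zip(weights, x)]
                bias += lr * y
    return weights, bias

def predict(weights, bias, seq):
    """Label a sequence 'strong' or 'weak' from the sign of its linear score."""
    score = sum(w * xi for w, xi in zip(weights, one_hot(seq))) + bias
    return "strong" if score > 0 else "weak"

# Hypothetical training data: two "strong" and two "weak" six-letter sites.
train_seqs = ["ACGTGA", "ACGTGC", "TTTAAA", "GGTAAT"]
train_labels = [1, 1, -1, -1]
weights, bias = train_perceptron([one_hot(s) for s in train_seqs], train_labels)
```

Calling `predict(weights, bias, "ACGTGT")` then labels a sequence that was not in the training set. The pattern is the same one the paragraph describes: learn a model from labeled examples, then apply it to sequences whose binding behavior is unknown.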
Mapping of a real biological problem to a quantum computer
For this study, the quantum D-Wave Two X processor appeared to have the ability to classify the binding sites as strong or weak. One novelty of the study was the mapping of a biological problem, using actual protein-DNA binding data, to a quantum chip. The quantum machine was also able to generate conclusions that were consistent with a biologist’s current understanding of gene regulation. In this case, the quantum mapping identified the correct binding site for selected proteins.
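The article does not spell out how the mapping works. In general, quantum annealers accept problems written as an energy function over binary variables, often called a QUBO (quadratic unconstrained binary optimization), and search for low-energy assignments. The sketch below is an assumption-laden illustration of that general idea: the coefficients in `Q` are invented, and the exhaustive search is a classical stand-in for the annealer, not the study's method.

```python
from itertools import product

def qubo_energy(Q, bits):
    """Energy of a 0/1 assignment: sum of Q[i, j] * bits[i] * bits[j]."""
    return sum(coeff * bits[i] * bits[j] for (i, j), coeff in Q.items())

def brute_force_minimum(Q, n):
    """Try all 2**n assignments; feasible only for very small n."""
    best = min(product((0, 1), repeat=n), key=lambda b: qubo_energy(Q, b))
    return best, qubo_energy(Q, best)

# Hypothetical 3-variable objective: each variable is rewarded for being 1,
# but neighboring variables are penalized for both being 1.
Q = {(0, 0): -1.0, (1, 1): -1.0, (2, 2): -1.0, (0, 1): 2.0, (1, 2): 2.0}

assignment, energy = brute_force_minimum(Q, 3)
```

The exhaustive search stops being practical as the number of variables grows, since the number of assignments doubles with each added variable.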
“The ability to do this work on a quantum computer is an important step forward and suggests future applications of a convergence of biology and quantum information,” said Rohs.
The researchers stress that in its current form, the study uses a simplified version of biological data and has a “proof-of-principle nature.” They believe that once the quantum processors known as annealers gain more qubits and processing power, more complex cellular determinants of gene regulation that Rohs is currently studying could be encoded into new models that use quantum computers.
The study also points to a future in which quantum information may converge with other disciplines that rely strongly on computational strategies, such as materials science and nanotechnology.
The results of this study are published in “Quantum annealing versus classical machine learning applied to a simplified computational biology problem” in the Nature Partner Journal Quantum Information.