Innovative algorithm could help advance single-cell genomics

Researchers have developed a novel algorithm, “scArches”, that compares new single-cell genomics datasets with existing reference atlases to help better understand diseases.

Researchers at Helmholtz Zentrum München and the Technical University of Munich (TUM), both in Germany, have created a new algorithm called “scArches”, short for single-cell architecture surgery. The algorithm compares new single-cell genomics datasets with existing references and could thus advance the study of certain diseases.

The team stated that large single-cell atlases are now routinely generated to serve as references for the analysis of smaller-scale studies. The Human Cell Atlas is the world’s largest single-cell reference atlas and is still growing; it contains reference data on millions of cells across tissues, organs and developmental stages. These references help physicians to understand the influences of ageing, environment and disease on a cell and ultimately to diagnose and treat patients better.

However, according to the team, single-cell datasets may contain measurement errors, the global availability of computational resources is limited and the sharing of raw data is often legally restricted. The researchers therefore developed scArches, which uses transfer learning and parameter optimisation to enable efficient, decentralised, iterative reference building and the contextualisation of new datasets with existing references, all without sharing raw data.
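The general idea of mapping a new dataset onto a reference without exchanging raw data can be illustrated with a deliberately simplified sketch. It is not the scArches implementation: the linear (PCA) reference model, the per-study offset, the simulated data and all function names below are assumptions made purely for illustration. The reference site fits a model and shares only its parameters; the query site keeps those parameters frozen and optimises a small set of study-specific parameters on its own data.

```python
import numpy as np

def fit_reference(X_ref, n_latent):
    """Reference site: fit a linear encoder (PCA) on its own raw data."""
    mu = X_ref.mean(axis=0)
    _, _, Vt = np.linalg.svd(X_ref - mu, full_matrices=False)
    # Only these parameters leave the reference site, never X_ref itself.
    return {"mu": mu, "W": Vt[:n_latent]}

def encode(model, X, offset=None):
    """Map cells into the reference latent space, optionally removing a study offset."""
    if offset is None:
        offset = np.zeros_like(model["mu"])
    return (X - offset - model["mu"]) @ model["W"].T

def fit_query_offset(model, X_query, lr=0.1, steps=300):
    """Query site: keep mu and W frozen and optimise only a per-study offset b.

    Toy objective (an assumption): mean squared distance between the
    offset-corrected query cells and the reference mean.
    """
    b = np.zeros_like(model["mu"])
    for _ in range(steps):
        r = (X_query - b - model["mu"]).mean(axis=0)  # mean residual per gene
        b -= lr * (-2.0 * r)                          # gradient of the objective w.r.t. b
    return b

# Simulated data (hypothetical): 20 "genes" driven by 3 latent factors.
rng = np.random.default_rng(0)
n_genes = 20
loadings = rng.normal(size=(3, n_genes))
X_ref = rng.normal(size=(500, 3)) @ loadings + 0.1 * rng.normal(size=(500, n_genes))
batch_shift = rng.normal(scale=2.0, size=n_genes)  # technical shift of the new study
X_query = (rng.normal(size=(200, 3)) @ loadings
           + 0.1 * rng.normal(size=(200, n_genes))
           + batch_shift)

model = fit_reference(X_ref, n_latent=3)      # trained where the reference data live
b = fit_query_offset(model, X_query)          # trained where the query data live
z_raw = encode(model, X_query)                # query mapped without correction
z_mapped = encode(model, X_query, offset=b)   # query mapped with the learned offset

print("latent mean norm without offset:", np.linalg.norm(z_raw.mean(axis=0)))
print("latent mean norm with offset:   ", np.linalg.norm(z_mapped.mean(axis=0)))
```

In this toy, aligning the means removes any shift between the studies, including a biological one. A practical method has to remove technical differences while preserving biological signal such as disease state, which this sketch does not attempt.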

“Instead of sharing raw data between clinics or research centres, the algorithm uses transfer learning to compare new datasets from single-cell genomics with existing references and thus preserves privacy and anonymity. This also makes annotating and interpreting of new data sets very easy and democratises the usage of single-cell reference atlases dramatically,” explained Mohammad Lotfollahi, the lead scientist behind the algorithm.

To test their algorithm, the researchers applied scArches to study COVID-19 in several lung bronchial samples. They compared the cells of COVID-19 patients to healthy references using single-cell transcriptomics. The algorithm was able to separate diseased cells from the references and thus enabled the user to pinpoint the cells in need of treatment, for both mild and severe COVID-19 cases. Biological variation between patients did not affect the quality of the mapping process.

“Our vision is that in the future we will use cell references as easily as we nowadays do for genome references,” said researcher Fabian Theis. “In other words, if you want to bake a cake, you usually do not want to try coming up with your own recipe – instead you just look one up in a cookbook. With scArches, we formalise and simplify this lookup process.”

Their study was published in Nature Biotechnology.