AI learns chemistry language to predict how to make medicines

An algorithm that predicts the outcomes of complex chemical reactions with over 90 percent accuracy has been developed and could be applied to drug development.

Researchers at the University of Cambridge, UK, have shown that an algorithm can predict the outcomes of complex chemical reactions with over 90 percent accuracy, outperforming trained chemists. The algorithm also shows chemists how to make target compounds for drug development, providing the chemical ‘map’ to the desired destination.

The algorithm, which was developed by Dr Alpha Lee from Cambridge’s Cavendish Laboratory and his group, uses pattern recognition tools to learn how chemical groups in molecules react. The model was trained on millions of reactions published in patents.

The researchers treated chemical reaction prediction as a machine translation problem. The reacting molecules are considered as one ‘language,’ while the product is considered as a different language. The model then uses the patterns in the text to learn how to ‘translate’ between the two languages.
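
To make the translation framing concrete, here is a minimal sketch of how a reaction might be turned into ‘source’ and ‘target’ token sequences. It assumes the molecules are written as SMILES strings, a common text notation for molecules; the article does not say which notation the Cambridge group used. The regular expression, the `tokenize` helper and the esterification example are all illustrative assumptions, not the researchers’ code.

```python
import re
from typing import List

# Hypothetical tokenizer (an assumption for illustration): splits a SMILES
# string into bracket atoms, two-letter halogens, single-letter atoms,
# ring-closure digits and bond/branch symbols.
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]|Br|Cl|[BCNOSPFIbcnosp]|%\d{2}|\d|[()=#+\-\\/.:~@>*$]"
)

def tokenize(smiles: str) -> List[str]:
    """Split a SMILES string into tokens, failing loudly on unknown characters."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"could not tokenize: {smiles!r}")
    return tokens

# Illustrative reaction: acetic acid + ethanol -> ethyl acetate.
# The '.' token separates the two reactant molecules.
source_sentence = tokenize("CC(=O)O.OCC")  # the 'reactant language'
target_sentence = tokenize("CC(=O)OCC")    # the 'product language'

print(" ".join(source_sentence))
print(" ".join(target_sentence))
```

A sequence-to-sequence translation model would then be trained on many such source and target pairs, in the same way it would be trained on paired sentences in two human languages.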

Using this approach, the model achieves 90 percent accuracy in predicting the correct product of unseen chemical reactions, whereas the accuracy of trained human chemists is around 80 percent.

The researchers say that the model is accurate enough to detect errors in the data and correctly predict a plethora of difficult reactions. The model also produces an uncertainty score, which eliminates incorrect predictions with 89 percent accuracy.
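
One way such an uncertainty score could be used is sketched below. The article does not describe how the score is computed, so this example assumes, purely for illustration, that the confidence of a prediction is the product of the probabilities the model assigned to each output token, and that predictions below a chosen threshold are set aside. The function names, the threshold and the log-probability values are all made up.

```python
import math
from typing import Dict, List

def sequence_confidence(token_log_probs: List[float]) -> float:
    """Assumed score: product of per-token probabilities, via summed logs."""
    return math.exp(sum(token_log_probs))

def keep_confident(predictions: List[Dict], threshold: float = 0.5) -> List[Dict]:
    """Keep only predictions whose assumed confidence reaches the threshold."""
    return [
        p for p in predictions
        if sequence_confidence(p["log_probs"]) >= threshold
    ]

# Made-up model outputs for illustration only (not real results).
predictions = [
    {"product": "CC(=O)OCC", "log_probs": [-0.01, -0.02, -0.01]},
    {"product": "CCO", "log_probs": [-0.9, -1.2]},
]

for p in keep_confident(predictions):
    print(p["product"])
```

In this sketch a chemist would only be shown the predictions the model is reasonably sure about, while low-confidence ones could be sent for manual review.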

In a second study, Lee and his group demonstrated the practical potential of the method in drug discovery. They showed that the model can make accurate predictions of reactions based on lab notebooks, indicating that it has learned the rules of chemistry and can apply them in drug discovery settings.

The team also showed that the model can predict sequences of reactions that would lead to a desired product. They applied this methodology to diverse drug-like molecules, showing that the steps that it predicts are chemically reasonable.

This technology could significantly reduce the time needed for pre-clinical drug discovery because it provides medicinal chemists with a blueprint of where to begin.

The Cambridge researchers are currently using this reaction prediction technology to develop a complete platform that bridges the design-make-test cycle in drug discovery and materials discovery: predicting promising bioactive molecules, finding ways to make those complex organic molecules and selecting the most informative experiments.

The researchers are now working on extracting chemical insights from the model, attempting to understand what it has learned that humans have not.

The results are reported in two studies in the journals ACS Central Science and Chemical Communications.