
Drug discovery from the age of information to the age of intelligence

Data drives drug discovery, yet data continues to be among the biggest challenges faced by the industry.1 Experiments are often not repeatable, and data interpretation is subject to the biases and limitations of human beings.

There are many reasons that molecules fail to become marketed drugs. To help avoid failure, it is imperative that decisions are made from the most accurate representation of the science; otherwise, the risk only increases. Evolving technology presents many opportunities not only to turn the tide, but to leverage data to unprecedented benefit.

The information age has seen an explosion of data over the last few decades,2 but more data does not inherently yield better understanding.3 It has been shown that the human brain is limited in its capacity to process data4 and that biases further complicate the reliability of the conclusions drawn.5 In addition, the human brain often does not recall all data after a certain time has elapsed, and thus retrospective analyses over time become challenging, impossible or completely inaccurate.

While these difficulties are not easily solved, technology may hold some answers. Artificial intelligence (AI) is software technology that enables machines to think and learn. Machines, unlike humans, can perform complex calculations on large data sets quickly, can perform tasks consistently, and are not influenced by emotion. Historically, however, machines have relied on direct, explicit instructions to perform tasks. Typing a single letter into primitive word-processing software simply displays it on the screen: a direct response that reflects nothing beyond the single input provided. Artificial intelligence opens the door to leveraging the strengths of machines to perform complex analysis where humans fall short.

Typing the same letter into a search engine produces a list of suggested search terms tailored to the user. These suggestions are generated by data intelligence technology that interprets the letter alongside many secondary data points, called metadata, such as the individual's previous browsing history and current events. If a group of people were each asked to make a list of search suggestions from a single letter and a small set of metadata, they might produce similar lists. As the data volume grows, however, human capacity becomes saturated and their predictions would likely diverge. Without data intelligence, the letter is just a letter on a screen, and large data sets may contribute more noise than knowledge. With intelligent software, the analysis gains consistency and completeness, and far greater relevance when metadata are included.
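To make the contrast concrete, the toy sketch below ranks candidate suggestions for a single letter by combining a prefix match with two pieces of metadata, a browsing history and a set of trending topics. It is purely illustrative; real search engines use far richer models, and every name and weight in the snippet is hypothetical.

```python
# Toy sketch of metadata-aware suggestion ranking (illustrative only).
from collections import Counter

CANDIDATES = ["drug discovery", "data science", "dog breeds", "deep learning"]

def suggest(letter, history, trending, top_n=3):
    """Rank candidate terms that start with `letter`, boosting those that
    appear in the user's browsing history or in current trending topics."""
    history_counts = Counter(history)
    scores = {}
    for term in CANDIDATES:
        if not term.startswith(letter):
            continue
        score = 1.0                               # base score for a prefix match
        score += 2.0 * history_counts[term]       # personal history carries weight
        score += 1.0 if term in trending else 0.0 # current events add relevance
        scores[term] = score
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

print(suggest("d", history=["drug discovery", "drug discovery"],
              trending={"deep learning"}))
# -> ['drug discovery', 'deep learning', 'data science']
```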

Interpreting data

In drug discovery, data intelligence technology has numerous applications. One of them is data interpretation, where context provides a crucial backdrop to the experimental data, offering insight that the experimental data alone may not capture. For instance, the fluorescence signal of a biological assay may be associated with a known interaction of interest. Perhaps, however, the signal is sensitive to temperature, or a test substance is itself fluorescent, and neither dependency is known. An interpretation of the experimental data alone would then be misleading, and potentially disastrous, if the context of temperature and substance properties is not accounted for. An inactive compound could be promoted, wasting valuable resources on a dead end; or, worse, a would-be lead compound could be discarded and forgotten. Every experiment has hundreds, if not thousands, of these contextual data points complementing each experimental data point, and they extend beyond a single experiment to a full lineage of movements, conditions and processes for each component in the sample. Most of these data points are not currently used, or even collected; instead, researchers rely on controls to adjust for variability and unknowns. Data intelligence creates an opportunity to use these data points and explore what may be hiding behind experimental data: perhaps assay variability can be explained, or new insight about a target can be elucidated. Just as a search engine does so much with a single letter, intelligence technologies can give experimental data far richer meaning by including metadata for a fully contextual analysis.
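As a concrete illustration, the short sketch below shows how metadata might be used to triage raw assay readings, flagging wells where a temperature excursion or a compound's own fluorescence could undermine the result. The field names and thresholds are hypothetical, and the logic is deliberately minimal compared with a production assay pipeline.

```python
# Minimal sketch of metadata-aware assay triage; field names and thresholds
# are hypothetical examples, not a real assay schema.
from dataclasses import dataclass

@dataclass
class Well:
    compound_id: str
    fluorescence: float            # raw assay signal
    plate_temp_c: float            # metadata: temperature when the plate was read
    compound_is_fluorescent: bool  # metadata: known autofluorescence

ASSAY_TEMP_C = 25.0
TEMP_TOLERANCE = 2.0
ACTIVITY_THRESHOLD = 1000.0

def interpret(well: Well) -> str:
    """Label a well as active, inactive, or suspect using metadata context."""
    if well.compound_is_fluorescent:
        return "suspect: compound autofluorescence may inflate the signal"
    if abs(well.plate_temp_c - ASSAY_TEMP_C) > TEMP_TOLERANCE:
        return "suspect: read outside validated temperature window"
    return "active" if well.fluorescence >= ACTIVITY_THRESHOLD else "inactive"

print(interpret(Well("CMPD-001", 1500.0, 31.0, False)))
# -> suspect: read outside validated temperature window
```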

While artificial intelligence can provide new insights from data after an experiment, it can also assist with conducting science in the lab. Lead optimisation is a late-stage drug discovery process that refines molecules for safety and efficacy against a disease; it works on a cycle of designing molecules based on biological data, synthesising them, and then testing their biological activity. Many biological assays can test thousands of compounds in a single day, but each molecule designed by medicinal chemists must be synthesised and purified, which can take days or weeks. The chemistry may be unknown, multistep and time consuming. Automated synthesis and artificial intelligence technologies are being developed to help solve some of these challenges. Retrosynthesis is the process of solving each step of a compound's synthesis, starting from the product and working back to commercial or readily available reagents. There are millions of known single-step chemical reactions,6 so no chemist can know them all. Instead, it is routine practice for chemists to parse the literature in search of reported chemistry that could solve the problem; this search is time consuming and well suited to intelligent software. Retrosynthesis software applies artificial intelligence to reference known chemical reactions and solve reaction schemes, much as chemists do. The difference is that the software can access all known chemical reactions and perform complex analysis to predict a route, whereas a chemist is limited by their own knowledge and by how many references they can find and read. Development of such software has been ongoing for many years, and significant recent advances have been reported.6 As retrosynthetic software improves, solving synthetic problems may take just a few clicks.
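The sketch below illustrates the core idea of template-based retrosynthesis as a recursive search over a toy reaction table. It is a conceptual simplification: state-of-the-art systems6 use neural networks and tree search over millions of reaction templates and real molecular structures, and all molecule and reaction names here are placeholders.

```python
# Highly simplified retrosynthesis search over a toy reaction table.
PURCHASABLE = {"A", "B", "C"}

# Toy single-step "reactions": product -> possible sets of precursors.
REACTIONS = {
    "E": [{"A", "B"}],   # E can be made from A + B
    "F": [{"E", "C"}],   # F can be made from E + C
}

def retrosynthesise(target, depth=5):
    """Return a list of steps (product, precursors), or None if no route is found."""
    if target in PURCHASABLE:
        return []                                # nothing to make; it can be bought
    if depth == 0 or target not in REACTIONS:
        return None                              # dead end within the search depth
    for precursors in REACTIONS[target]:
        route = []
        for p in precursors:
            sub_route = retrosynthesise(p, depth - 1)
            if sub_route is None:
                break                            # this disconnection fails
            route.extend(sub_route)
        else:
            return route + [(target, precursors)]
    return None

print(retrosynthesise("F"))
# -> [('E', {'A', 'B'}), ('F', {'E', 'C'})]  (set element order may vary)
```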

The role of automation

While software may shorten the time needed to work out how to make a molecule, the reaction must still be carried out in the lab. A chemist is usually assigned multiple compounds at a time and must be able to make them efficiently. Difficult chemistry often requires running many test reactions across multiple conditions and reagents to find one that works, and conventional synthesis is slow and labour intensive, requiring each reaction to be set up and run individually. Several automated approaches to synthesis that can remove these bottlenecks have been reported and are continuously being developed.7,8 Automation need only be set up once to run multiple reactions without further effort from the chemist; it can run continuously to generate compounds of interest and methodically iterate through many conditions to solve challenging transformations. Automated synthesis can free the chemist from repetitive, time-consuming tasks and reduce the time to access and test new chemical matter.
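The snippet below sketches how such a condition screen might be enumerated for an automated platform, expanding a handful of catalysts, bases, solvents and temperatures into a full grid of small test reactions. The reagents and parameters are hypothetical examples, not a recommended screen.

```python
# Sketch of enumerating a reaction-condition screen for an automated platform;
# the reagents and temperatures are hypothetical examples.
from itertools import product

catalysts = ["Pd(PPh3)4", "Pd(dppf)Cl2", "XPhos Pd G3"]
bases = ["K2CO3", "Cs2CO3", "Et3N"]
solvents = ["dioxane", "DMF", "MeCN"]
temperatures_c = [60, 80, 100]

# Every combination becomes one small test reaction for the robot to set up.
screen = [
    {"catalyst": c, "base": b, "solvent": s, "temp_c": t}
    for c, b, s, t in product(catalysts, bases, solvents, temperatures_c)
]

print(len(screen))   # 81 reactions, far more than a chemist would run by hand
print(screen[0])     # {'catalyst': 'Pd(PPh3)4', 'base': 'K2CO3', 'solvent': 'dioxane', 'temp_c': 60}
```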

When taken individually, applications for artificial intelligence in drug discovery have immense potential, but machine intelligence will enable researchers to go one step further. Robots in drug discovery have been widely applied to biological assays that test molecules.9 Advances in automated synthesis, retrosynthesis and data intelligence create a perfect storm for combining biology and chemistry into a fully automated, closed-loop process, and efforts are underway to achieve it.10 Synthetic methods are determined by retrosynthesis software and sent to automated synthesis devices, where molecules are synthesised, purified and transferred to the assay. Once compounds are transferred, robotics are automatically triggered to execute the assay and collect data on each sample. Intelligent software interprets the experimental data and metadata in the context of not just the experiment but all the data in the organisation. The software then designs the next round of molecules, and the cycle repeats, with the software learning after each round. It may sound like science fiction, but many of these capabilities are developing rapidly or already exist, though significant development in artificial intelligence and automation certainly remains before a truly automated drug discovery system as described here can be created. While such a system in its entirety may still lie on the fringes, it provides a template for organisations to solve problems today with available technology and to prepare for a future that will look very different.
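A structural skeleton of that closed loop is sketched below. Each function is a stand-in for a real subsystem, namely the retrosynthesis software, synthesis hardware, assay robotics and learning models described above; the toy implementations exist only so the loop runs end to end.

```python
# Conceptual skeleton of the design-make-test-analyse loop; every function is
# a placeholder for a real subsystem, and all values are toy data.
import random

def design_molecules(model, all_data, batch_size=8):
    # Placeholder: an AI designer would propose structures from the model.
    return [f"CMPD-{len(all_data) + i:04d}" for i in range(batch_size)]

def plan_routes(molecules):
    # Placeholder for retrosynthesis software.
    return {m: ["step 1", "step 2"] for m in molecules}

def synthesise_and_purify(routes):
    # Placeholder for automated synthesis and purification hardware.
    return list(routes)

def run_assay(samples):
    # Placeholder for assay robotics; returns a potency value plus metadata.
    return [{"compound": s, "ic50_nM": random.uniform(1, 10000),
             "plate_temp_c": 25.0} for s in samples]

def update_model(model, results):
    # Placeholder: the model re-learns from the newest results in context.
    model["rounds"] += 1
    return model

def discovery_loop(n_cycles=3):
    model, all_data = {"rounds": 0}, []
    for _ in range(n_cycles):
        molecules = design_molecules(model, all_data)
        samples = synthesise_and_purify(plan_routes(molecules))
        results = run_assay(samples)
        all_data.extend(results)
        model = update_model(model, results)
    return model, all_data

model, data = discovery_loop()
print(model["rounds"], len(data))   # -> 3 24
```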

Intelligent automated systems could help address many of the challenges faced in drug discovery. Lead optimisation cycle times could be reduced from weeks to days or even hours. Human limitations and biases could be removed from data analysis. Robotics could practically eliminate experimental inconsistencies, and metadata could provide insight into the cause of any that remain while offering a more complete context to experimental data. In other words, better decisions could be made faster, leading to better treatments reaching patients sooner. Only selected considerations and solutions have been discussed here; many more details and options exist. The frontiers at the interface of technology and science are exciting and promising. Organisations that execute well have an opportunity to prosper in the age of information intelligence and, most importantly, to advance patient treatment with extraordinary speed and benefit.

Biography

Jeff Martin leads the compound management function and is responsible for developing and deploying technologies throughout the research organisation at Alkermes. He has over a decade of experience applying robotics to science, and a passion for solving hard scientific problems with technology and creativity. Jeff joined Vertex in 2015 where he was responsible for compound management and automation in Boston. Prior to joining Vertex, he worked in a number of roles at Biosero, ultimately leading custom robotics projects. Jeff earned a Bachelor of Arts from Assumption College and Master of Science from the University of Massachusetts Amherst, both in chemistry.

References

  1. Begley CG, Ellis LM. Drug development: Raise standards for preclinical cancer research. Nature. 2012 Mar 28;483(7391):531.
  2. Dobre C, Xhafa F. Intelligent services for big data science. Future Generation Computer Systems. 2014 Jul 1;37:267-81.
  3. Kaplan RM, Chambers DA, Glasgow RE. Big data and large sample size: a cautionary note on the potential for bias. Clinical and Translational Science. 2014 Aug 1;7(4):342-6.
  4. Wu T, Dufford AJ, Mackie MA, Egan LJ, Fan J. The capacity of cognitive control estimated from a perceptual decision making task. Scientific Reports. 2016 Sep 23;6:34025.
  5. Silberzahn R, Uhlmann EL. Crowdsourced research: Many hands make tight work. Nature News. 2015 Oct 8;526(7572):189.
  6. Segler MH, Preuss M, Waller MP. Planning chemical syntheses with deep neural networks and symbolic AI. Nature. 2018 Mar;555(7698):604.
  7. Sanderson K. March of the synthesis machines. Nature Reviews Drug Discovery. 2015 Apr;14:299.
  8. Perera D, Tucker JW, Brahmbhatt S, Helal CJ, Chong A, Farrell W, Richardson P, Sach NW. A platform for automated nanomole-scale reaction screening and micromole-scale synthesis in flow. Science. 2018 Jan 26;359(6374):429-34.
  9. Janzen WP. Screening technologies for small molecule discovery: the state of the art. Chemistry & Biology. 2014 Sep 18;21(9):1162-70.
  10. Fleming GS, Beeler AB. Integrated drug discovery in continuous flow. Journal of Flow Chemistry. 2017 Nov;7(3-4):124.