Artificial Intelligence and Machine Learning in Drug Discovery (Part I)

In a recent webinar, we surveyed our audience and were surprised to see that a significant majority of attendees considered the application of artificial intelligence and machine learning (AI/ML) methods the most exciting area in drug discovery, ahead of even degraders and molecular glues. Machine learning methods have been appearing increasingly often in our Molecules of the Month research, such as Lilly’s application of an ML model to mitigate DDI risk and Psychogenics’ use of ML to characterize CNS drug mechanisms of action. Countless readers have asked for more coverage of AI/ML companies, and the excitement goes beyond R&D: commercial reviewer Anthony Vaganos recently highlighted AI/ML as one of the areas he’s most excited about as well. So here we are!

The application of AI/ML methods to drug discovery has been an area of significant investment for at least 10 years now. Exscientia, one of the best-known companies in this area, was founded in 2012. With so much news covering the space, it’s easy to lose track of where exactly the field is today. This resource provides a brief history of AI/ML-based drug discovery, an overview of the state of the field, and an introductory reading list. Part II of this series covers AI/ML-focused companies with clinical small molecules, and Part III paints a broader picture of the landscape of AI in small molecule drug discovery.

A Brief Introduction to AI/ML in Drug Discovery

AI/ML approaches are to big datasets as linear regressions are to small ones. Just as a linear regression builds a predictive model by fitting a line to a small set of initial data, such as the points on a scatterplot, AI/ML approaches build predictive models from massive datasets, for example predicting which cat video you will watch next based on which cat videos were watched by billions of people.
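To make the small-data half of this analogy concrete, here is a minimal sketch of a linear regression in plain Python. The data points and the function names (fit_line, predict) are hypothetical and chosen only for illustration; real AI/ML models differ mainly in having far more parameters and far more data, but the fit-then-predict pattern is the same.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Spread of x around its mean, and co-variation of x with y.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(model, x):
    """Use the fitted line to predict y for a new x."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical toy "scatterplot": five points lying on y = 2x + 1.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]

model = fit_line(xs, ys)      # "training" on the small dataset
prediction = predict(model, 6.0)  # prediction for an unseen input
print(model, prediction)
```

The model here is just two numbers (a slope and an intercept) learned from five points. An AI/ML model plays the same role as that fitted line, only with many more learned parameters and vastly larger training sets.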

Hence, where there is a massive dataset to be found, there is likely a startup or division trying to apply AI/ML to extract more value from it. For example:

  • Genomic and biomedical datasets to identify biomarkers (e.g., DNAnexus)
  • Scientific articles for target identification (e.g., Causaly)
  • Tumor sample data to identify prognostic markers (e.g., Predictive Oncology)
  • Medical literature and symptom data for drug repurposing (e.g., Healx)
  • Cellular fluorescent imaging data for richer in vitro pharmacology (e.g., Recursion)
  • Protein structural data to identify peptide binders (e.g.,
  • Protein structural data and compound binding data for small molecule hit generation (e.g., Atomwise)

How AI/ML is applied to big datasets, and its application to drug discovery

There are now well over 200 independent companies applying AI/ML to drug discovery, without including internal R&D teams in pharma and larger, existing computational chemistry leaders like Schrödinger. More than anything, this shows how much data goes into drug discovery already, as you need a dedicated company to tackle each niche in this scientifically rich industry.

The prospect of augmenting resource-intensive screening campaigns with comparatively cheaper and faster in silico campaigns is a reason for justifiable excitement. However, while AI has progressed by leaps and bounds in arenas like speech recognition, image recognition, and online shopping predictions, it has made only modest progress in drug discovery so far, given the complexity of the drug discovery process. Nevertheless, the recent clinical and preclinical candidates discovered and developed using AI-centered technologies validate the promise that such technologies offer for the future of drug discovery.

Where are AI/ML Drug Discovery Companies Now?

We have compiled two molecule-focused reviews on the state of AI/ML in small molecule drug discovery today, with a focus on the structures, targets, and mechanisms of action of their lead molecules:

  • SHP-1971 (SHP2 inhibitor; Relay Therapeutics; oncology): an often-cited example of AI/ML applied to drug discovery; in Ph. I for treating solid metastatic tumors; licensed by Genentech ($75M upfront)
  • "Example 151" (MALT1 inhibitor; Schrödinger; oncology/hematology): from machine-learning-powered prioritization of molecules; Schrödinger’s MALT1 candidate (SGR-1505) is in Ph. I; structure of SGR-1505 (clinical candidate) not disclosed

In these reviews, we have outlined some of the major players in the field of AI in drug discovery, with a special focus on those with small molecule clinical and preclinical candidates. The participants in AI-augmented drug discovery are diverse in both the approaches they use and the size of the startups. Drug discovery involves interpreting significant amounts of data at many points, and with the rapid advancement of data-generating technology, the amount of data for any specific project or even molecule will only get larger. It’s not surprising, then, that entire companies have been formed to address each of these needs. There are 200+ startups because there are literally hundreds of areas where the development of focused new data curation and cleaning tools could add value.

A specific example of a promising area is the augmented characterization of cells in vitro (e.g., Recursion). In many disease areas, drug hunters are frequently stumped by the effects molecules have when many cell populations are at play (e.g., characterization of gross mixtures of immune cells, cancer microenvironments, stromal cells, etc.). This type of data enrichment has applications in assays, screening strategies, translational biology, in vitro and in vivo pharmacology, clinical biomarkers, and beyond, and will likely be employed universally by drug hunters of the future, whether they realize it or not.

The Future of AI/ML in Drug Discovery

Everyone participating in drug discovery is likely to be impacted by applications of AI/ML, though the effects are likely to be gradual and subtle, in the way Excel has gotten extraordinarily powerful or cloud platforms like Google Drive have become universally adopted. Many repetitive tasks requiring expertise are likely to become automatable, such as categorizing molecules within large libraries or triaging HTS hits. While the media enjoys conjuring images of robot overlords replacing humans, the reality, as with any tool, is more likely that medicinal chemists using AI/ML tools will replace medicinal chemists who don’t. With more and more companies dedicated to bringing AI/ML tools into the hands of expert scientists, this prospect should be more exciting than scary. After all, who still wants to run manual columns now that we have automated chromatography tools?

A strategy that will likely work well for companies in the space is to focus on a niche application that they excel at, building a flywheel effect from economies of scale and from experience gained through collaborations with a broad range of partners (think the WuXi or Charles River of AI). In the last several years, the high valuations given to biotechs with their own pipelines have contributed to the rapid growth in companies using AI/ML to develop their own assets. As biotech valuations have cooled, we may see a shift back from asset-focused AI/ML biotechs to AI/ML companies with fee-for-service or software models.

For now, there’s still plenty of work to be done. In our automatically generated webinar transcripts, we’ve seen Drug Hunter interpreted as “Duck Hunter,” “Dragon Hunter,” and our favorite, “Drunk Hunter.” Just for fun, we punched “Molecules of the Month” and “Drug Hunter” into an AI image generator, and this was what we got:

AI-generated images using the inputs: “Molecules of the Month” and “Drug Hunter” (source: DeepAi.Org)

If translation and image generation are this difficult, one can only imagine how challenging applying AI to drug discovery is. Kudos to the brilliant teams trying!

Further Reading

We hope this resource is helpful for getting caught up in this space. To look at more specific examples of molecules emerging from AI/ML companies, the targets and approaches being used, and the mechanisms of action, continue to Parts II and III of this series.

For more industry perspectives from experts on the impact of AI/ML on drug discovery and how the field should be evaluated, see:
