By Cyril Bucher
You’ve probably lived this story before: you’re working on your favorite kinase, YFK1, and are looking for inhibitors that are selective for it over YFK2. The problem is, the two kinases are virtually identical near the active site, save for a single residue sitting in the back pocket that your amazing structural biologist has identified. The team thinks you have a better chance of getting selectivity for YFK1 over YFK2 with Type I½ or Type II kinase inhibitors, which both bind to the back pocket (Figure 1). Is there any way to pick the Type I½ and Type II inhibitors out of the 250,000 hinge-binding compounds in your kinase-focused library? This task sounds like a perfect application for machine learning, which a team from Boehringer Ingelheim and the Rheinische Friedrich-Wilhelms-Universität Bonn recently reduced to practice. (Miljković, F., Rodríguez-Pérez, R., Bajorath, J. “Machine Learning Models for Accurate Prediction of Kinase Inhibitors with Different Binding Modes.” J. Med. Chem. 2019. doi: 10.1021/acs.jmedchem.9b00867)

The authors first assembled a dataset of 1425 Type I, 394 Type I½, and 190 Type II inhibitors with binding modes confirmed by X-ray co-crystal structures in the PDB.[note]Interestingly, they found that 3.6% of these inhibitors adopted multiple binding modes against different kinases, a feature associated with inhibitor promiscuity.[/note] They divided the inhibitors into evenly sized training and test sets, and then tested whether several machine learning algorithms, once trained on the training set, could distinguish the three inhibitor classes in the test set. The team ultimately found a model that distinguished the three classes of inhibitors with remarkably high sensitivity and specificity while starting from an impressively small training set (Figure 2). From this proof of concept, one can imagine a future where libraries are computationally enriched even further before screening (think “focused focused” libraries), leading to smaller screens and higher quality hits.
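For readers who want to see what that evaluation setup looks like in practice, here is a minimal sketch in plain Python. It is not the authors' code: the helper names (`stratified_half_split`, `one_vs_rest_metrics`) are hypothetical, and splitting each class in half separately (stratification) is an assumption, since this post does not describe the exact splitting protocol. Only the three class sizes come from the paper as quoted above. Sensitivity and specificity are computed one class at a time, treating that class as "positive" and the other two as "negative".

```python
import random

# Class sizes quoted in the post; the label strings are placeholders.
CLASS_COUNTS = {"I": 1425, "I1/2": 394, "II": 190}
LABELS = [label for label, n in CLASS_COUNTS.items() for _ in range(n)]


def stratified_half_split(labels, seed=0):
    """Split indices into train/test halves, class by class.

    Assumption: each class is halved separately so that the rare
    Type II class is represented in both sets.
    """
    rng = random.Random(seed)
    by_class = {}
    for i, label in enumerate(labels):
        by_class.setdefault(label, []).append(i)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        half = len(indices) // 2
        train.extend(indices[:half])
        test.extend(indices[half:])
    return train, test


def one_vs_rest_metrics(y_true, y_pred, positive):
    """Return (sensitivity, specificity) for one class versus the rest."""
    tp = fn = fp = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1  # correctly flagged as the class of interest
            else:
                fn += 1  # missed member of the class
        else:
            if pred == positive:
                fp += 1  # wrongly flagged as the class of interest
            else:
                tn += 1  # correctly left out
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


TRAIN_IDX, TEST_IDX = stratified_half_split(LABELS)
```

With predictions from any classifier in hand, calling `one_vs_rest_metrics(y_true, y_pred, "II")` on the test-set labels would report how well Type II inhibitors are recovered (sensitivity) and how rarely other types are mislabeled as Type II (specificity).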

Take a look at the following 8 kinase inhibitors, and try to classify each one by its kinase inhibitor type (I or II; answers are in the caption below). How long did it take you? Now imagine doing it 100,000 times to curate a library! This illustrates how much time computational tools like machine learning could save scientists in the future.

Props to the authors for putting together a highly useful dataset, developing an excellent classification algorithm, and making both freely available to the community.