Artificial Intelligence in Drug Discovery
eBook - ePub

Artificial Intelligence in Drug Discovery

Nathan Brown

  1. 406 pages
  2. English
  3. ePUB (mobile friendly)
  4. Available on iOS & Android

About This Book

Following significant advances in deep learning and related areas, interest in artificial intelligence (AI) has rapidly grown. In particular, the application of AI in drug discovery provides an opportunity to tackle challenges that have previously been difficult to solve, such as predicting properties, designing molecules and optimising synthetic routes. Artificial Intelligence in Drug Discovery aims to introduce the reader to AI and machine learning tools and techniques, and to outline specific challenges including designing new molecular structures, synthesis planning and simulation. Providing a wealth of information from leading experts in the field, this book is ideal for students, postgraduates and established researchers in both industry and academia.

Section 1: Introduction to Artificial Intelligence and Chemistry
Chapter 1
Introduction
Nathan Brown
BenevolentAI, 4-8 Maple Street, London, W1T 5HD, UK. Email: [email protected]
The resurgence in artificial intelligence research and its successful applications over the past decade has led to renewed interest in applying it to drug discovery, particularly in how best to model drug properties and design molecular structures that satisfy desired profiles. This book is one of the first to combine the learning from applying multiple methods over many decades with the most recent algorithmic advances in artificial intelligence, and to demonstrate their value to the drug discovery process. As time progresses, the new artificial intelligence methods and algorithms will become more commonplace and eventually routine, as they combine with and adapt to the existing software ecosystem of chemoinformatics, molecular modelling, and computational chemistry. These innovations will give rise to a new and revised toolkit from which we can select the most promising algorithms and approaches to help design and validate the next compounds to make and test, and eventually the new drugs that will reach the market as approved novel therapeutics for unmet human needs.

1.1 Introduction

Drug discovery applies a vast range of technologies in the interest of ushering new chemical entities of disease relevance into the clinic to meet, as yet, unmet patient needs. While many of these technologies rely on experiments in so-called “wet” laboratories, computational methods, often termed in silico as opposed to in vitro or in vivo, have been in wide use for many decades. Recently, however, a renaissance in Artificial Intelligence, and specifically Machine Learning, approaches has led the vanguard of novel applications that speed drug discovery, bringing not only efficiency gains but also the prospect of improved medications for patients.
This book covers a number of new and different approaches to applying artificial intelligence and machine learning methods in drug discovery, sometimes with entirely new algorithms, but more often building on years of research applied anew, aided by concomitant algorithmic improvements as well as significant advances in software and hardware that allow access to hitherto inaccessible quantities of computational power.
The first chapters of this book consider chemical data and learning from unstructured data sources, such as those found in the literature and in patents. One of the most challenging aspects of using Artificial Intelligence in drug discovery is teasing out insights that have been discovered but published in a way that is not amenable to further analysis – essentially obfuscating the data through publication in formats that cannot easily be read by machines. The first of these chapters looks at chemical topic modelling, an unsupervised learning method that extracts meaning from natural language using natural language processing. The extracted data permit the clustering of documents based on the co-occurrences of words in these publications, thereby offering an easier approach to grouping documents by subject and potentially obviating the need for manual curation.
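The general idea can be illustrated with a minimal sketch, assuming scikit-learn is available: a bag-of-words count matrix is built from a handful of invented abstract-like strings, and a small latent Dirichlet allocation (LDA) model groups them by word co-occurrence. The documents, topic count and settings below are purely illustrative, not those used in the chapter.

```python
# Minimal topic-modelling sketch over invented chemistry "abstracts";
# assumes scikit-learn is installed.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

documents = [
    "kinase inhibitor binding affinity assay",
    "palladium catalysed cross coupling reaction yield",
    "kinase selectivity profiling against a panel",
    "suzuki coupling of aryl halides under mild conditions",
]

# Bag-of-words counts: topic models work on word co-occurrence statistics.
vectorizer = CountVectorizer(stop_words="english")
counts = vectorizer.fit_transform(documents)

# Fit a two-topic LDA model and inspect the top words per topic.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(counts)

terms = vectorizer.get_feature_names_out()
for topic_idx, weights in enumerate(lda.components_):
    top_words = [terms[i] for i in weights.argsort()[::-1][:4]]
    print(f"topic {topic_idx}: {top_words}")

# Documents can then be grouped by their dominant topic.
print(doc_topics.argmax(axis=1))
```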
Later chapters of the book cover predictive modelling, making use of ligand and protein structure data, respectively. However, it can often be challenging to entirely deconvolute these two data types, since aspects of each are at least implicit, if not explicit, in the other. Ligand-based predictive modelling has a long and distinguished history in drug discovery, going back to the pioneering work of Corwin Hansch and Toshio Fujita on Quantitative Structure–Activity Relationships (QSARs) in the 1950s and 1960s. These QSAR equations were originally derived manually and used to predict certain molecular properties with mathematical models. In more recent decades, the field has moved towards large-scale data sources and libraries of molecular descriptors to automate the generation of predictive models using modern machine learning algorithms. It should be recognised, however, that the field of Machine Learning arose contemporaneously with QSARs in the late 1950s.
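As a rough, hypothetical sketch of this descriptor-plus-model workflow, the example below computes a handful of RDKit descriptors and fits a scikit-learn regressor. The SMILES strings and activity values are invented for illustration, and the descriptor set is deliberately tiny compared with a real QSAR study.

```python
# Minimal QSAR-style sketch: simple RDKit descriptors plus a regression model.
# SMILES and activity values below are illustrative only.
from rdkit import Chem
from rdkit.Chem import Descriptors
from sklearn.ensemble import RandomForestRegressor

smiles = ["CCO", "c1ccccc1O", "CC(=O)Oc1ccccc1C(=O)O", "CCN(CC)CC"]
activity = [5.2, 6.1, 4.8, 5.5]  # e.g. pIC50 values (made up)

def featurise(smi):
    """A tiny descriptor vector; real QSAR models use far richer features."""
    mol = Chem.MolFromSmiles(smi)
    return [
        Descriptors.MolWt(mol),
        Descriptors.MolLogP(mol),
        Descriptors.TPSA(mol),
        Descriptors.NumHDonors(mol),
    ]

X = [featurise(s) for s in smiles]
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, activity)

# Predict the activity of a new, unseen structure.
print(model.predict([featurise("CCCCO")]))
```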
One of the most important aspects of predictive modelling is not to use the predicted values of an endpoint in isolation, but to consider alongside them whether the prediction is likely to be reliable given the data that are available – or whether the available data are simply insufficient to make any prediction at all, in which case the best recommendation may be to generate new data to inform our data space. Krstajic offers some important advice in this area in his chapter on the importance of defining the “don't know” answer. This advice is doubly important. Firstly, it encourages diligence in the application of our predictive models to drug discovery. Secondly, it brings honesty to modelling, ensuring that scientists in medicinal chemistry and computational methods understand more clearly where the data are limited, and offering new insights and priorities for chemical synthesis and testing to bolster those datasets.
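One common way to implement a “don't know” answer is an applicability-domain check; the hedged sketch below abstains when a query molecule's nearest-neighbour Tanimoto similarity to the training set falls below a threshold. The helper names, threshold and data are illustrative assumptions of ours, not the specific procedure described in Krstajic's chapter.

```python
# Sketch of a "don't know" wrapper around a predictive model: if the query is
# too dissimilar from all training molecules, decline to predict.
# Threshold, data and helper names are illustrative only.
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

def fingerprint(smi):
    return AllChem.GetMorganFingerprintAsBitVect(Chem.MolFromSmiles(smi), 2, 2048)

training_smiles = ["CCO", "c1ccccc1O", "CC(=O)Oc1ccccc1C(=O)O"]
training_fps = [fingerprint(s) for s in training_smiles]

def predict_or_abstain(query_smiles, model_predict, threshold=0.3):
    fp = fingerprint(query_smiles)
    nearest = max(DataStructs.TanimotoSimilarity(fp, t) for t in training_fps)
    if nearest < threshold:
        return "don't know"            # outside the model's applicability domain
    return model_predict(query_smiles)  # otherwise trust the underlying model

# Example usage with a dummy model that always returns 5.0.
print(predict_or_abstain("CCCO", lambda s: 5.0))
print(predict_or_abstain("FC(F)(F)c1ccncc1", lambda s: 5.0))
```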
As protein structure data become more prevalent, with protein crystallography now a routine assay type in drug discovery, new methods are arising that incorporate these more sizable datasets to offer contextual information about the protein binding site, rather than information merely implied by ligand structure data alone. Methods covered in the structure-based predictive modelling chapters include predicting protein–ligand molecular recognition, using convolutional neural networks in virtual screening, and applying machine learning methods to enhance the impact of molecular dynamics simulations.
One of the most significant challenges in synthetic organic chemistry, let alone medicinal chemistry and drug discovery, is the design and planning of new chemical syntheses. Given a molecular target, what series of reactions, and indeed conditions, can be optimised to minimise materials, effort and time, and produce the desired product in an appropriate yield for its intended purpose in the laboratory? Synthesis planning has been something of a holy grail in applying artificial intelligence methods to chemistry and drug discovery over many decades, going back to the work of E. J. Corey and others on retrosynthetic planning: working backwards from the desired product to decide which steps should make up the synthesis. In recent years, however, modern deep learning methods, in addition to symbolic artificial intelligence methods, have come to the fore. These new methods take advantage of the vast repositories of reaction data held in public databases, in proprietary data held by publishers, and in internal data sources at chemical companies to rapidly propose synthetic routes, and they have been demonstrated to be competitive with human experts. While a significant amount of research remains to develop these methods into routine, reliable tools, this is one of the most active areas of computational science applied to chemistry in recent years.
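The underlying idea of template-based retrosynthesis can be caricatured in a few lines: apply known disconnections backwards from the target until every precursor is a purchasable building block. The “templates” and building-block set below are invented placeholders standing in for the large reaction databases mentioned above; real systems operate on reaction templates over molecular structures and use guided search rather than naive recursion.

```python
# Toy sketch of recursive retrosynthetic search. Each "template" maps a product
# label to possible precursor sets; the tiny template and building-block sets
# are made up for illustration.
purchasable = {"amine", "acid_chloride", "aryl_halide", "boronic_acid"}

# product -> list of possible precursor sets (one per known disconnection)
templates = {
    "amide": [["amine", "acid_chloride"]],
    "biaryl": [["aryl_halide", "boronic_acid"]],
    "target_drug": [["amide", "biaryl"]],
}

def retrosynthesise(compound, route=None):
    """Return one route (a list of disconnections) ending in purchasable materials."""
    route = route or []
    if compound in purchasable:
        return route
    for precursors in templates.get(compound, []):
        candidate = route + [(compound, precursors)]
        for p in precursors:
            candidate = retrosynthesise(p, candidate)
            if candidate is None:
                break
        else:
            return candidate
    return None  # no route found with the available templates

for product, precursors in retrosynthesise("target_drug"):
    print(f"{product} <= {' + '.join(precursors)}")
```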
The last chapters of this book bring together many of the topics covered earlier and bring this combination of methods to bear on the more holistic challenge of molecular design. Drug discovery is itself a multi-objective, or multi-parametric, challenge. Approved drugs must satisfy requirements of safety and efficacy at the intended dose, but this is itself a convolution of many different requirements and vast arrays of assays that must be considered in the eventual recommendation not only of a drug for human benefit, but also in the nomination of a clinical candidate. While methods for molecular design incorporate many different approaches, at the heart of the challenge is molecular graph generation: being able to recommend what to make, and how to make it, to satisfy the various constraints identified as important for the disease area being considered. A number of artificial intelligence and deep learning approaches have been investigated in recent years, including recurrent neural networks, junction tree variational autoencoders and other deep generative models. Riding this resurgent wave of artificial intelligence methods, it is important to reflect on the challenges involved in molecular design, independent of the method used to suggest the designs. Medicinal chemistry and drug discovery have a long history of designing new molecules, both manually and through automated or semi-automated ways of suggesting novel designs. A significant approach is matched molecular pair analysis, historically a relatively manual approach investigating molecular replacements around a chemical series of interest. In more recent years, however, computational methods have been developed to abstract the wealth of information generated in various organisations and automate a data-driven matched molecular pair process, giving rise to vast databases of molecular transforms with concomitant levels of statistical significance in terms of modulating certain properties of interest.
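The data-driven matched molecular pair idea can be sketched roughly as follows: compounds sharing a core but differing in one substituent form a pair, and the measured property change is aggregated per transform across all cores. The cores, R-groups and logD values below are invented, and the core/substituent split is hand-written; production pipelines derive it by systematic fragmentation of real structures.

```python
# Toy sketch of data-driven matched molecular pair (MMP) analysis: compounds
# that share a core but differ by one substituent form a pair, and property
# differences are aggregated per transform. All data are illustrative.
from collections import defaultdict
from statistics import mean

# (core, r_group, measured_logD) -- invented values
measurements = [
    ("core_A", "H", 2.1), ("core_A", "F", 2.3), ("core_A", "OMe", 1.6),
    ("core_B", "H", 3.0), ("core_B", "F", 3.1), ("core_B", "OMe", 2.4),
]

by_core = defaultdict(dict)
for core, r, logd in measurements:
    by_core[core][r] = logd

# Collect the property change for every transform observed within each core.
deltas = defaultdict(list)
for core, groups in by_core.items():
    for r1, v1 in groups.items():
        for r2, v2 in groups.items():
            if r1 != r2:
                deltas[(r1, r2)].append(v2 - v1)

# Average effect of each transform across all cores where it was observed.
for (r1, r2), values in sorted(deltas.items()):
    print(f"{r1} -> {r2}: mean delta logD = {mean(values):+.2f} (n={len(values)})")
```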
As research progresses in molecular design, many of the emerging methods, together with older and more established approaches, will likely combine into hybrid systems that permit reliable and reproducible molecular design at scale. Some of the most significant challenges include optimising not only potency but also ADMET properties, in addition to designing molecular structures that can actually be made effectively and efficiently in the laboratory to test the virtual hypotheses being asked. As the theoretical world of design and the practical efforts of synthesising and testing the designed compounds draw closer together, the modern laboratory will change and adapt to facilitate more closely coupled drug discovery and development, thereby enhancing efficiency and optimisation processes. This is touched on in the last main chapter of this book, with a practical vision of how we can apply these methods to democratise the discovery of new chemicals.
Section 2: Chemical Data
Chapter 2
The History of Artificial Intelligence and Chemistry
Nathan Brown
BenevolentAI, 4-8 Maple Street, London, W1T 5HD, UK. Email: [email protected]
Artificial intelligence today is a very broad church, covering many different areas and disciplines, and it has had impact over a number of years. Although there have been very many recent innovations in applying artificial intelligence to challenges in drug discovery and in chemistry in general, methods from AI have been applied in various forms over decades, contributing to progress on the challenges of developing new therapies and new scientific methods.

2.1 Artificial Intelligence in History

Although relatively new in name, the concepts of artificial intelligence have had a long history throughout millennia of mythology and human progress.1 The concepts of artificial intelligence and autonomous robots were highlighted by Greek poets almost three thousand years ago, such as Homer and Hesiod. Talos, mentioned by Hesiod, is an early example of a robotic creature made of bronze, created by Hephaestus, the Greek god of blacksmiths and metalworking, amongst other trades. Other examples include those of Galatea and Pandora. The sacred mechanical statues built in Egypt and in Greece were thought to have the capacity for emotion and wisdom. Trismegistus wrote: “they have sensus and spiritus … by discovering the true nature of the gods, man has been able to reproduce it”.
Automata and mechanical beings continued to be developed for many centuries, from Yan Shi presenting King Mu of Zhou with mechanical men in around the tenth century BCE, to Al-Jazari creating an orchestra of mechanical human beings that could be programmed to play music. In the 1500s, the Swiss physician Paracelsus claimed to have created an artificial man from magnetism, sperm, and alchemy, though his claims are somewhat in dispute. Paracelsus was, of course, also the scientist who coined the phrase “Sola dosis facit venenum” (only the dose makes the poison), the foundation of toxicology in drug research. Paracelsus described his creation of a homunculus, or small human being, as follows:
That the sperm of a man be putrefied by itself in a sealed cucurbit for forty days with the highest degree of putrefaction in a horse's womb, or at least so long that it comes to life and moves itself, and stirs, which is easily observed. After this time, it will look somewhat like a man, but transparent, without a body. If, after this, it be fed wisely with the Arcanum of human blood, and be nourished for up to forty weeks, and be kept in the even heat of the horse's womb, a living human child grows therefrom, with all its members like another child, which is born of a woman, but much smaller.
In the early 17th century, Descartes suggested that animals were complex machines, but claimed that mental phenomena were of a different substance.2 Hobbes, however, countered this by suggesting a mechanical and combinatorial theory of cognition, which led him to conclude that “… reason is nothing but reckoning”.
Mathematical calculators were subsequently invented and developed by the likes of Pascal and Leibniz, first handling addition and subtraction, with multiplication and division added later by Leibniz. Leibniz also developed binary representations around that time, which became the foundation of modern computing systems.
Of course, theories of artificial intelligence were not confined to science, technology, and philosophy, but also appeared in the arts and literature. Swift's Gulliver's Travels of 1726 imagined a world that included the Engine, which could assist people in writing “Books in Philosophy, Poetry, Politicks (sic), Law, Mathematicks (sic), and Theology, with the least Assistance from Genius or study”. The Engine itself was a parody of the Ars Magna, an inspiration for Leibniz's mechanism. One of the most significant cultural references to artificial intelligence and artificial l...
