Machine Learning in Chemistry

The Impact of Artificial Intelligence

Hugh M Cartwright

546 pages, English

About This Book

Progress in the application of machine learning (ML) to the physical and life sciences has been rapid. A decade ago, the method was mainly of interest to those in computer science departments, but more recently ML tools have been developed that show significant potential across wide areas of science. There is a growing consensus that ML software, and related areas of artificial intelligence, may, in due course, become as fundamental to scientific research as computers themselves.

Yet a perception remains that ML is obscure or esoteric, that only computer scientists can really understand it, and that few meaningful applications in scientific research exist. This book challenges that view.

With contributions from leading research groups, it presents in-depth examples to illustrate how ML can be applied to real chemical problems. Through these examples, the reader can both gain a feel for what ML can and cannot (so far) achieve, and also identify characteristics that might make a problem in physical science amenable to a ML approach.

This text is a valuable resource for scientists who are intrigued by the power of machine learning and want to learn more about how it can be applied in their own field.

CHAPTER 1
Computers as Scientists
TIMOTHY E. H. ALLEN (a, b)
(a) MRC Toxicology Unit, Hodgkin Building, Lancaster Road, Leicester LE1 7HB, UK
(b) Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK

1.1 What Is Computational Science?

Computers excel at many tasks that humans find very difficult (see Figure 1.1). For example, if you have a smartphone with a voice assistant, try asking it for the square root of 32 761. For the phone, this problem is straightforward, and it almost immediately returns the correct answer of 181. If we were to ask a person the same question, they would probably find it very challenging; if they produced the correct answer as quickly as the phone, we would think them a genius (or, perhaps more likely, suspect them of cheating!). Now, for comparison, try to engage your voice assistant with a knock-knock joke: it doesn't work. In contrast to the mathematical challenge, the machine finds this social interaction impossibly complicated, yet we would think a person a simpleton if they were unable to take part in it. Children learn how to make these jokes before they learn to add. This illustrates why computers make such good scientific companions: they help scientists with the tasks at which humans are naturally much worse.
Figure 1.1 Strengths and weaknesses of humans and computers.
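The arithmetic half of this comparison is easy to reproduce. A minimal Python sketch, using the standard-library function `math.isqrt` (the variable names are illustrative and not from the text), checks that 32 761 is a perfect square with root 181:

```python
import math

n = 32761
root = math.isqrt(n)                 # integer square root, rounded down
is_perfect_square = root * root == n

print(root, is_perfect_square)       # prints: 181 True
```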
Computational science is a scientific field that uses computational methods to answer scientific questions. Computational scientists are most often trying to answer questions in fields such as chemistry, physics or biology using algorithms developed by computer scientists. This is, in some respects, analogous to an experimental chemist who uses a reaction to make a molecule in the lab, as opposed to the chemist who first proposed that reaction, or to a scientist who determines the chemical composition of a biological sample on an NMR machine using a procedure devised by an NMR expert. This is not to say that the scientists deploying these techniques, whether algorithmic, experimental or otherwise, are any less important than their discoverers: the most impactful scientific techniques are those which have long-term applications across multiple fields.
The use of artificial intelligence (AI) and machine learning (ML) algorithms, as described later in this book, can be considered computational science, but chemistry also has a long track record in other computational methods. Numerous computational techniques do not require AI but instead use the power of computers to solve mathematical equations. Many of the most complex of these problems arise in quantum mechanics.
The field of quantum mechanics is just over 100 years old. In 1900, Max Planck [1] proposed that energy is absorbed and radiated in discrete quanta (packets of energy). In 1905, Albert Einstein [2] used quantum theory to explain the photoelectric effect, which led to the theory that light acts as both a wave and a particle. Louis de Broglie [3] extended wave–particle duality to all types of matter, including, importantly, electrons. Once it was accepted that electrons behave as waves, the hunt began for a way of describing these waves mathematically. A mathematical model was proposed by Erwin Schrödinger [4] in 1925, and it is shown below in its time-independent form:
\[ \hat{H}\Psi = E\Psi \tag{1.1} \]
where Ĥ is the Hamiltonian operator, Ψ is the wavefunction and E is a constant equal to the energy of the system. The mathematics of this equation is far beyond the scope of this chapter, but it is worth noting that solving the Schrödinger equation is not trivial for any system. At present, the equation cannot be solved exactly for species with multiple electrons. However, with suitable assumptions and high-powered computers, approximate solutions can be found. These give probability distributions that tell us where electrons are likely to be (for quantum systems, it is impossible to know exactly where an electron is), together with their associated energy levels. Such details are useful in chemistry because they help explain how, and why, reactions happen.
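To give a feel for what solving this equation numerically involves, here is a toy sketch and not one of the many-electron methods used in practice. It assumes a single particle in a one-dimensional box of length 1, with zero potential inside the box and units chosen so that ħ = m = 1; all function names are illustrative. In that setting the equation reduces to ψ″ = −2Eψ, and the code searches for the energy at which the wavefunction vanishes at both walls (a "shooting" approach). This particular system can also be solved exactly (E = π²/2 for the lowest state), which gives a check on the numerical answer.

```python
import math


def psi_at_far_wall(energy, length=1.0, steps=2000):
    """Integrate psi'' = -2*E*psi across the box and return psi at x = length.

    Assumes hbar = m = 1 and zero potential inside the box (toy example).
    """
    h = length / steps
    psi_prev, psi = 0.0, h  # psi(0) = 0 at the wall; the initial slope is arbitrary
    for _ in range(steps - 1):
        # Central-difference update for psi'' = -2*E*psi
        psi_prev, psi = psi, 2.0 * psi - psi_prev - h * h * 2.0 * energy * psi
    return psi


def ground_state_energy(lo=4.0, hi=6.0, tol=1e-10):
    """Bisect on E until psi also vanishes at the far wall."""
    f_lo = psi_at_far_wall(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = psi_at_far_wall(mid)
        if f_lo * f_mid > 0:
            lo, f_lo = mid, f_mid  # sign unchanged: the root lies above mid
        else:
            hi = mid               # sign changed: the root lies below mid
    return 0.5 * (lo + hi)


energy = ground_state_energy()
exact = math.pi ** 2 / 2  # analytic ground-state energy for this box
print(f"numerical E = {energy:.6f}, exact E = {exact:.6f}")
```

The bracket of 4.0 to 6.0 was chosen by hand to enclose the lowest energy level; a real molecule offers no such analytic check, which is why approximations and substantial computing power are needed.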
Because of the wealth of detail that quantum calculations provide, the use of calculations to explain, predict and even mimic chemical reactions is an extremely important field. Ken Houk's group at UCLA, for example, has been exploring asymmetric chemical catalysis for many years using computational calculations [5]. Such reactions are crucial in the manufacture of molecules with chiral centres, many of which have important biological properties. Chemicals can have handedness, or chirality, which can lead to important biological consequences, as some molecules fit into spaces that others cannot. To manufacture such molecules, we must consider which product(s) might be generated by a chemical reaction that could follow several different pathways. Computers can help us predict why certain reactions are more favourable. To do this, we need, among other things, to calculate the energy barrier that exists between the reactants and the products of a reaction. Computers can consider the shapes of molecules in 3D, their interactions with one another and the energies of these systems, allowing the calculation of these so-called activation energies (see Figure 1.2).
Figure 1.2 A common procedure used in the Houk group for the computational investigation of chemical reactions. Reproduced from ref. 5 with permission from American Chemical Society, Copyright 2016.
From among all the chemically reasonable routes from reactants to products, how can we select the one with the lowest activation energy? This low-energy pathway will often dominate the reaction, and its product should therefore be the one that is observed experimentally. Calculated activation barriers can be considered alongside experimental observations to evaluate the hypothesis: does what the calculations predict match what is observed? If so, the model can provide useful insight into the chemistry by providing information about the structure and dynamics of the transition state as the chemicals move from reactants to products. Is there, perhaps, an unfavourable interaction between two large groups in the disfavoured pathway that raises its energy? Or is there a favourable electronic interaction in the favoured pathway that lowers the barrier? Such insights can subsequently be used to design new and more efficient reactions.
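To make the link between barrier height and observed product concrete, the sketch below compares two competing pathways. It assumes that the pathways share the same pre-exponential factor and that the product ratio is set purely by the relative rates, so that each pathway is weighted by exp(−ΔG‡/RT). The pathway names and barrier values are invented for illustration and are not taken from ref. 5.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ/(mol K)


def product_fractions(barriers_kj_mol, temperature_k=298.15):
    """Estimate product fractions from activation barriers (in kJ/mol).

    Assumes equal pre-exponential factors and kinetic control, so each
    pathway is weighted by exp(-barrier / RT).
    """
    lowest = min(barriers_kj_mol.values())
    rt = R_KJ_PER_MOL_K * temperature_k
    # Subtracting the lowest barrier leaves the ratios unchanged
    # and keeps the exponentials well scaled.
    weights = {name: math.exp(-(barrier - lowest) / rt)
               for name, barrier in barriers_kj_mol.items()}
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


# Hypothetical barriers for two competing pathways
barriers = {"pathway_A": 60.0, "pathway_B": 70.0}
favoured = min(barriers, key=barriers.get)
fractions = product_fractions(barriers)

print(favoured)
for name, fraction in fractions.items():
    print(f"{name}: {fraction:.3f}")
```

Under these assumptions, a barrier difference of only 10 kJ/mol at room temperature is enough for the lower-energy pathway to account for more than 98% of the product, which is why the lowest barrier so often determines what is observed.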

1.2 What Is Artificial Intelligence?

Defining AI is surprisingly challenging. When trying to do so, I often think of the idea of a generally intelligent machine (a machine able to do a wide range of different tasks, as a human does) and would, therefore, define an intelligent machine as “a machine capable of doing things that humans consider to be intelligent”.
Although this definition may not be entirely satisfactory, it sums up the goal of AI science quite well, as humans are the most intelligent species on earth. It also aligns with the well-known Turing test, developed by Alan Turing in 1950. The standard version of this test requires a human interrogator to converse with, and subsequently judge, two hidden players, one of which is another human and the other a machine. If, after conversing with both players, the interrogator cannot distinguish between the machine and the human, the machine is deemed to have passed the test and is therefore considered “intelligent” (within the boundaries of Turing's test, of course) (see Figure 1.3) [6]. In this context, the test is often referred to as “the imitation game”.
Figure 1.3 The Turing test.
Tasks that we might consider to require a degree of intelligence include speech and vision, scientific and medical decision making, and strategic gameplay. We will consider some examples of these tasks in the rest of this section.
AI has been a pursuit of scientists since at least the 1950s. During that decade, there were three important meetings: a “Session on Learning Machines” in Los Angeles in 1955, a “Summer Research Project on Artificial Intelligence” at Dartmouth College in 1956, and a symposium on the “Mechanization of Thought Processes” at the National Physical Laboratory in the United Kingdom [6]. These early meetings tackled many of the problems that still occupy AI researchers today, including tasks such as pattern recognition [7], understanding of language [8], and playing chess [9]. Other, more mechanistic, issues were also discussed, including the imitation of the human brain and central nervous system as the basis for learning machines [7], the use of iterative learning to improve computer performance in learning tasks [10], and the realisation that powerful machines would be required to address many of these challenging topics.
Let us consider again the definition of AI given earlier. What is it about humans that makes them intelligent? And does answering this help us understand what kinds of things machines need to be able to do to also be considered intelligent?
Humans sense the world around them through vision. They explore the world by moving around it. They can speak to one another and recognise incoming speech. They can also recognise patterns in objects they encounter and events they experience. These goals are all currently being explored in AI, in the fields of computer vision, robotics, speech recognition, natural language processing, and pattern recognition. Let's cover some examples from the history of AI in these different research areas.
In the late 1950s and early 1960s, a number of projects attempted to use pattern recognition algorithms to aid in the identification of target objects in aerial photographs (aerial reconnaissance). Laveen N. Kanal, Neil C. Randall, and Thomas Harley at the Philco Corporation, for example, attempted to screen aerial photographs for military tanks [6,11]. A small section of the film was processed to enhance any edges, and the result was presented to the target detection system as a 32×32 array of 1s and 0s. This array was segmented into 24 overlapping 8×8 “feature blocks”, each of which was then evaluated using a statistical test to establish if the block contained a part of ...