eBook - ePub

Machine Learning in Chemistry

Name: Machine Learning in Chemistry
ISBN: 9781839160240

The Impact of Artificial Intelligence

Hugh M Cartwright,

546 pages
English

About this book

Progress in the application of machine learning (ML) to the physical and life sciences has been rapid. A decade ago, the method was mainly of interest to those in computer science departments, but more recently ML tools have been developed that show significant potential across wide areas of science. There is a growing consensus that ML software, and related areas of artificial intelligence, may, in due course, become as fundamental to scientific research as computers themselves.

Yet a perception remains that ML is obscure or esoteric, that only computer scientists can really understand it, and that few meaningful applications in scientific research exist. This book challenges that view.

With contributions from leading research groups, it presents in-depth examples to illustrate how ML can be applied to real chemical problems. Through these examples, the reader can gain a feel for what ML can and cannot (so far) achieve, and identify characteristics that might make a problem in physical science amenable to an ML approach.

This text is a valuable resource for scientists who are intrigued by the power of machine learning and want to learn more about how it can be applied in their own field.

Information

Publisher: Royal Society of Chemistry
Year: 2020
eBook ISBN: 9781839160240
Topic: Physical Sciences
Subtopic: Physical & Theoretical Chemistry

CHAPTER 1

Computers as Scientists

TIMOTHY E. H. ALLEN^a,b

^a MRC Toxicology Unit, Hodgkin Building, Lancaster Road, Leicester LE1 7HB, UK

^b Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK

Email: [email protected]

1.1 What Is Computational Science?

Computers excel at many tasks that humans are very bad at (see Figure 1.1). For example, if you have a smartphone with a voice assistant, try asking it for the square root of 32 761. For the phone, this problem is straightforward, and it almost immediately returns the correct answer of 181. If we were to ask a person the same question, they would probably find it very challenging; if they produced the correct answer as quickly as the phone, we would think them a genius (or, perhaps more likely, suspect them of cheating!). Now, for comparison, try to engage your voice assistant with a knock-knock joke: it doesn't work. In contrast to the mathematical challenge, the machine finds this social interaction impossibly complicated, yet we would think a person a simpleton if they were unable to take part in it. Children learn how to make these jokes before they learn to add. This illustrates why computers make such good scientific companions: they help scientists with the tasks at which humans are naturally much worse.

Figure 1.1 Strengths and weaknesses of humans and computers.

Computational science is a scientific field that uses computational methods to answer scientific questions. Computational scientists are most often trying to answer questions in fields such as chemistry, physics or biology using algorithms developed by computer scientists. This is, in some respects, analogous to an experimental chemist who uses a reaction to make a molecule in the lab, as opposed to the one who first proposed that reaction, or to a scientist who uses an NMR machine to determine the chemical composition of a biological sample by following a procedure devised by an NMR expert. This is not to say that the scientists deploying these techniques, whether algorithmic, experimental or otherwise, are any less important than their discoverers. The most impactful scientific techniques are those which have long-term applications across multiple fields.

The use of artificial intelligence (AI) and machine learning (ML) algorithms, as described later in this book, could be considered as computational science, but chemistry has a long track record in other computational methods too. Numerous computational techniques do not require AI but rather use the power of computers to solve mathematical equations. Many of the most complex problems are in quantum mechanics.

The field of quantum mechanics is just over 100 years old. In 1900, Max Planck¹ proposed that energy is absorbed and radiated in discrete quanta (or packets of energy). Albert Einstein² used quantum theory to explain the photoelectric effect in 1905, which led to the theory that light acts as both a wave and a particle. Louis de Broglie³ extended wave–particle duality to include all types of matter, including, importantly, electrons. Once it was accepted that electrons behave as waves, the hunt began to find a way of describing these waves mathematically. A mathematical model was proposed by Erwin Schrödinger⁴ in 1925, and this is shown below in its time-independent form:

$$\hat{H}\Psi = E\Psi \qquad (1.1)$$

where Ĥ is the Hamiltonian operator, Ψ is the wavefunction and E is a constant equal to the energy of the system. The mathematics of this equation is far beyond the scope of this chapter, but it is worth noting that finding solutions to the Schrödinger equation is not trivial for any system. At present, the equation cannot be solved exactly for species with multiple electrons. However, with suitable assumptions and high-powered computers, approximate probability distributions can be found; these tell us where electrons are likely to be (only approximately, since for quantum systems it is impossible to know exactly where an electron is), together with their associated energy levels. Such details are useful in chemistry because they help explain how, and why, reactions happen.
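To give a feel for what "finding approximate solutions with a computer" can mean in the simplest possible case, the following sketch solves the time-independent Schrödinger equation numerically for a single particle in a one-dimensional box. It is an illustration only, not a method taken from this chapter: the choice of system, the finite-difference grid, the units (ħ = m = 1) and the function name `solve_particle_in_box` are all assumptions made for the example. Real chemical systems with many electrons need far more sophisticated approximations.

```python
import numpy as np


def solve_particle_in_box(n_points=500, length=1.0):
    """Approximate solutions of H psi = E psi for a particle in a 1D box.

    Assumptions (illustrative only): hbar = m = 1, zero potential inside
    the box, and psi = 0 at both walls.
    """
    # Interior grid points only; the wavefunction vanishes at the walls.
    x = np.linspace(0.0, length, n_points + 2)[1:-1]
    dx = x[1] - x[0]

    # Kinetic energy operator -1/2 d^2/dx^2 as a central-difference matrix.
    main_diag = np.full(n_points, 1.0 / dx**2)
    off_diag = np.full(n_points - 1, -0.5 / dx**2)
    hamiltonian = (
        np.diag(main_diag) + np.diag(off_diag, 1) + np.diag(off_diag, -1)
    )

    # Eigenvalues are the energies E; eigenvector columns are the wavefunctions.
    energies, states = np.linalg.eigh(hamiltonian)

    # Rescale so that the sum of |psi|^2 * dx equals 1 for every state.
    states = states / np.sqrt(dx)
    return x, energies, states


if __name__ == "__main__":
    x, energies, states = solve_particle_in_box()
    exact_ground = np.pi**2 / 2  # analytical E_1 for length = 1, hbar = m = 1
    print("numerical ground-state energy:", energies[0])
    print("analytical ground-state energy:", exact_ground)
    # |psi|^2 is the probability density for finding the particle at x.
    density = states[:, 0] ** 2
    print("most probable position:", x[np.argmax(density)])
```

The squared wavefunction plays the role of the probability distribution mentioned above, and the eigenvalues are the associated energy levels. For this toy problem an exact answer is known, which makes it a convenient check on the numerical approximation.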

Because of the wealth of detail that quantum calculations provide, the use of computation to explain, predict and even mimic chemical reactions is an extremely important field. Ken Houk's group at UCLA, for example, have been exploring asymmetric chemical catalysis for many years using computational calculations.⁵ Such reactions are crucial in the manufacture of molecules with chiral centres, many of which have important biological properties. Chemicals can have handedness, or chirality, which can lead to important biological consequences, as some molecules fit into spaces that others cannot. To manufacture such molecules, we must consider which product(s) might be generated by a chemical reaction that could follow several different pathways. Computers can help us predict why certain reactions are more favourable. To do this, we need, among other things, to calculate the energy barrier that exists between the reactants and the products of a reaction. Computers can consider the shapes of molecules in 3D, their interactions with one another and the energies of these systems, allowing the calculation of these so-called activation energies (see Figure 1.2).

From among all the chemically reasonable routes from reactants to products, how can we select the one that has the lowest activation energy? This low-energy pathway will often dominate the reaction, and therefore its product should be the one that is experimentally observed. Calculations of the activation barrier can be considered alongside the experimental observations to evaluate the hypothesis and gain insight: does what is expected from the calculations match the observation? If so, the model can provide useful insight into the chemistry of the interaction by providing information about the structure and dynamics of the transition state as chemicals move from reactants to products. Is there, perhaps, an unfavourable interaction between two large groups in the unfavoured pathway that raises its energy? Or is there a favourable electronic interaction in the favoured pathway that lowers the barrier? Such insights can be used subsequently to design new and more efficient reactions.
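The claim that the lowest-barrier pathway "will often dominate" can be made concrete with a small calculation. The sketch below assumes an Arrhenius-type dependence of rate on activation energy, with identical pre-exponential factors for every pathway and no interconversion of products (kinetic control). These assumptions, the function name `pathway_fractions` and the two barrier heights are hypothetical choices made for illustration; they are not values from the chapter or from any published calculation.

```python
import math

GAS_CONSTANT = 8.314462618  # J mol^-1 K^-1


def pathway_fractions(barriers_kj_mol, temperature_k=298.15):
    """Estimate the share of product formed through each competing pathway.

    Assumes rate is proportional to exp(-Ea / RT) with the same
    pre-exponential factor for every pathway (an illustrative simplification).
    """
    lowest = min(barriers_kj_mol.values())
    # Weights relative to the lowest barrier keep the exponentials well scaled.
    weights = {
        name: math.exp(-(ea - lowest) * 1000.0 / (GAS_CONSTANT * temperature_k))
        for name, ea in barriers_kj_mol.items()
    }
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


if __name__ == "__main__":
    # Hypothetical activation energies in kJ/mol for two competing routes.
    barriers = {"pathway_A": 60.0, "pathway_B": 70.0}
    for name, fraction in pathway_fractions(barriers).items():
        print(f"{name}: {fraction:.1%}")
```

Under these assumptions, a difference of only 10 kJ/mol at room temperature is enough for the lower-barrier route to account for roughly 98% of the product, which is why modest computed energy differences can explain strong experimental selectivity.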

1.2 What Is Artificial Intelligence?

Defining AI is surprisingly challenging. When trying to do so, I often think of the idea of a generally intelligent machine (a machine able to do a wide range of different tasks, as a human does) and would, therefore, define an intelligent machine as “a machine capable of doing things that humans consider to be intelligent”.

Although this may not be entirely satisfactory, it sums up the goal of AI science quite well, as humans are the most intelligent species on earth. It also aligns with the well-known Turing test, developed by Alan Turing in 1950. The standard version of this test requires a human interrogator to converse with, and subsequently judge, two other hidden players, one of which is another human and one a machine. If, after having conversations with both players, the interrogator cannot distinguish between the machine and the human, the machine would be deemed to have passed the test and therefore be considered “intelligent” (within the boundaries of Turing's test, of course) (see Figure 1.3).⁶ In this context, the test is often referred to as “the imitation game”.

Figure 1.3 The Turing test.

Tasks that we might consider to require a degree of intelligence include speech and vision, scientific and medical decision making, and strategic gameplay. We will consider some examples of these tasks in the rest of this section.

AI has been a pursuit of scientists at least since the 1950s. During that decade, there were three important meetings: a “Session on Learning Machines” in Los Angeles in 1955, a “Summer Research Project on Artificial Intelligence” at Dartmouth College in 1956, and a symposium on the “Mechanization of Thought Processes” at the National Physical Laboratory in the United Kingdom.⁶ These early meetings tackled many of the problems that still occupy AI researchers today, including tasks such as pattern recognition,⁷ understanding of language,⁸ and playing chess.⁹ Other, more mechanistic, issues were also discussed, including the imitation of the human brain and central nervous system as the basis for learning machines,⁷ and the use of iterative learning to improve computer performance in learning tasks,¹⁰ as well as the realisation that powerful machines would be required to address many of these challenging topics.

Let us consider again the definition of AI given earlier. What is it about humans that makes them intelligent? And does answering this help us understand what kinds of things machines need to be able to do to also be considered intelligent?

Humans sense the world around them through vision. They explore the world by moving around it. They can speak to one another and recognise incoming speech. They can also recognise patterns in objects they encounter and events they experience. These goals are all currently being explored in AI, in the fields of computer vision, robotics, speech recognition, natural language processing, and pattern recognition. Let's cover some examples from the history of AI in these different research areas.

In the late 1950s and early 1960s, a number of projects attempted to use pattern recognition algorithms to aid in the identification of target objects in aerial photographs (aerial reconnaissance). Laveen N. Kanal, Neil C. Randall, and Thomas Harley at the Philco Corporation, for example, attempted to screen aerial photographs for military tanks.⁶,¹¹ A small section of the film was processed to enhance any edges, and the result was presented to the target detection system as a 32×32 array of 1s and 0s. This array was segmented into 24 overlapping 8×8 “feature blocks”, each of which was then evaluated using a statistical test to establish if the block contained a part of ...

Table of contents

Cover
Title
Copyright
Preface
Contents
Chapter 1 Computers as Scientists
Chapter 2 How Do Machines Learn?
Chapter 3 MedChemInformatics: An Introduction to Machine Learning for Drug Discovery
Chapter 4 Machine Learning for Nonadiabatic Molecular Dynamics
Chapter 5 Machine Learning in Science – A Role for Mechanical Sympathy?
Chapter 6 A Prediction of Future States: AI-powered Chemical Innovation for Defense Applications
Chapter 7 Machine Learning for Chemical Synthesis
Chapter 8 Constraining Chemical Networks in Astrochemistry
Chapter 9 Machine Learning at the (Nano)materials-biology Interface
Chapter 10 Machine Learning Techniques Applied to a Complex Polymerization Process
Chapter 11 Machine Learning and Scoring Functions (SFs) for Molecular Drug Discovery: Prediction and Characterisation of Druggable Drugs and Targets
Chapter 12 Artificial Intelligence Applied to the Prediction of Organic Materials
Chapter 13 A New Era of Inorganic Materials Discovery Powered by Data Science
Chapter 14 Machine Learning Applications in Chemical Engineering
Chapter 15 Representation Learning in Chemistry
Chapter 16 Demystifying Artificial Neural Networks as Generators of New Chemical Knowledge: Antimalarial Drug Discovery as a Case Study
Chapter 17 Machine Learning for Core-loss Spectrum
Chapter 18 Autonomous Science: Big Data Tools for Small Data Problems in Chemistry
Chapter 19 Machine Learning for Heterogeneous Catalysis: Global Neural Network Potential from Construction to Applications
Chapter 20 A Few Guiding Principles for Practical Applications of Machine Learning to Chemistry and Materials
Subject Index