A surrogate is a substitute for the real thing. In statistics, draws from predictive equations derived from a fitted model can act as a surrogate for the data-generating mechanism. If the fit is good (model flexible yet well-regularized, data rich enough, and fitting scheme reliable), then such a surrogate can be quite valuable. Gathering data is expensive, and sometimes getting exactly the data you want is impossible or unethical. A surrogate could represent a much cheaper way to explore relationships, and entertain "what ifs?". How do surrogates differ from ordinary statistical modeling? One superficial difference may be that surrogates favor faithful yet pragmatic reproduction of dynamics over other things statistical models are used for: interpretation, establishing causality, or identification. As you might imagine, that characterization oversimplifies.
The terminology came out of the physics, applied math and engineering literatures, where the use of mathematical models leveraging numerical solvers has been commonplace for some time. As such models became more complex, requiring more resources to simulate or solve numerically, practitioners increasingly relied on meta-models built from limited simulation campaigns. Often they recruited help from statisticians, or at least used setups resembling ones from statistics. Data collected via expensive computer evaluations tuned flexible functional forms that could be used in lieu of further simulation. Sometimes the goal was to save money or computational resources; sometimes to cope with an inability to perform future runs (expired licenses, off-line or over-impacted supercomputers). Trained meta-models became known as surrogates or emulators, with those terms often used interchangeably. (A surrogate is designed to emulate the numerics coded in the solver.) The enterprise of designing, running and fitting such meta-models became known as a computer experiment.
So a computer experiment is like an ordinary statistical experiment, except the data are generated by computer codes rather than physical or field observations, or surveys. Surrogate modeling is statistical modeling of computer experiments. Computer simulations are generally cheaper than physical observation, so the former could be entertained as an alternative or precursor to the latter. Although computer simulation can be just as expensive as field experimentation, computer modeling is regarded as easier because the experimental apparatus is better understood, and more aspects may be controlled. For example, many numerical solvers are deterministic, whereas field observations are noisy or have measurement error. For a long time, noise was the main occupant in the gulf between modeling and design considerations for surrogates, on the one hand, and more general statistical methodology on the other. But hold that thought for a moment.
Increasingly that gulf is narrowing, not so much because the nature of experimentation is changing (it is), but thanks to advances in machine learning. The canonical surrogate model, a fitted Gaussian process (GP) regression, which was borrowed for computer experiments from the geostatistics' kriging literature of the 1960s, enjoys wide applicability in contexts where prediction is king. Machine learners exposed GPs as powerful predictors for all sorts of tasks, from regression to classification, active learning/sequential design, reinforcement learning and optimization, latent variable modeling, and so on. They also developed powerful libraries, lowering the bar to application by non-expert practitioners, especially in the information technology world. Facebook uses surrogates to tailor its web portal and apps to optimize engagement; Uber uses surrogates trained on traffic simulations to route pooled ride-shares in real time, reducing travel and wait time.
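To make the GP-as-surrogate idea concrete, here is a minimal sketch, not drawn from the text, of fitting a GP regression to a handful of runs of a stand-in "simulator" and then predicting, with uncertainty, at unseen inputs instead of re-running the solver. The toy simulator, kernel choice, and settings are illustrative assumptions; scikit-learn is used only as one convenient library.

```python
# Hedged sketch: a GP regression surrogate for an expensive computer model.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def simulator(x):
    # stand-in for an expensive deterministic solver (assumption for illustration)
    return np.sin(2 * np.pi * x).ravel()

# a small "simulation campaign": a handful of carefully chosen runs
X_train = np.linspace(0, 1, 8).reshape(-1, 1)
y_train = simulator(X_train)

# fit the surrogate; the RBF kernel encodes smoothness of the response,
# and a tiny alpha acknowledges the (nearly) deterministic simulator
gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.2), alpha=1e-8)
gp.fit(X_train, y_train)

# emulate at new inputs: predictive mean and standard deviation
X_new = np.linspace(0, 1, 100).reshape(-1, 1)
mean, std = gp.predict(X_new, return_std=True)
```

The predictive standard deviation is what distinguishes a GP surrogate from a plain interpolator: it quantifies where further simulator runs would be most informative.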
Round about the same time, computer simulation as a means of scientific inquiry began to blossom. Mathematical biologists, economists and others had reached the limit of equilibrium-based mathematical modeling with cute closed-form solutions. They embraced simulation as a means of filling in the gap, just as physicists and engineers had decades earlier. Yet their simulations were subtly different. Instead of deterministic solvers based on finite elements, Navier–Stokes or Euler methods, they were building stochastic simulations and agent-based models to explore predator-prey (Lotka–Volterra) dynamics, the spread of disease, and the management of inventory or patients in health insurance markets. Suddenly, and thanks to an explosion in computing capacity, software tools, and better primary school training in STEM subjects (all decades in the making), simulation was enjoying a renaissance. We're just beginning to figure out how best to model these experiments, but one thing is for sure: the distinction between surrogate and statistical model is all but gone.
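As a toy illustration of the kind of stochastic simulation meant here, the sketch below (not from the text; parameter values and the event structure are made-up assumptions) generates one random predator-prey trajectory via Gillespie's algorithm. Unlike a deterministic ODE solve, each seed yields a different path, which is exactly what makes modeling such experiments a different game.

```python
# Hedged sketch: a stochastic Lotka-Volterra simulation (Gillespie-style).
import numpy as np

def lotka_volterra_ssa(prey=50, pred=25, a=1.0, b=0.02, c=1.0,
                       t_max=20.0, max_events=100_000, seed=0):
    """One stochastic predator-prey trajectory; rates a, b, c are illustrative."""
    rng = np.random.default_rng(seed)
    t, traj = 0.0, [(0.0, prey, pred)]
    for _ in range(max_events):
        # event rates: prey birth, predation (prey -> predator), predator death
        rates = np.array([a * prey, b * prey * pred, c * pred])
        total = rates.sum()
        if total == 0.0 or t >= t_max:
            break
        t += rng.exponential(1.0 / total)        # waiting time to next event
        event = rng.choice(3, p=rates / total)   # which event fires
        if event == 0:
            prey += 1                            # prey birth
        elif event == 1:
            prey -= 1                            # prey eaten ...
            pred += 1                            # ... predator reproduces
        else:
            pred -= 1                            # predator death
        traj.append((t, prey, pred))
    return traj

run1 = lotka_volterra_ssa(seed=1)   # each seed gives a different noisy realization
run2 = lotka_volterra_ssa(seed=2)
```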
If there's (real) field data, say on a historical epidemic, further experimentation may be almost entirely limited to the mathematical and computer modeling side. You can't seed a real community with Ebola and watch what happens. Epidemic simulations, and surrogates built from a limited number of expensive runs where virtual agents interact and transmit infection, can be calibrated to a limited ...