In this chapter, we review the fundamental concepts of the Bayesian approach to statistical inference. Bayesian statistics was first introduced over 250 years ago, but only became popular once it could address practical problems. For a long time, Fisher’s theory based on the likelihood function as the fundamental engine of inference, together with the frequentist approach of Neyman and Pearson, ruled the statistical world. Until three decades ago, the Bayesian approach was regarded more as a curiosity than as a tool for solving practical problems. This changed when Markov chain Monte Carlo techniques were introduced.
The chapter starts by reviewing the concepts of the classical, also called frequentist, approach. Central to the Bayesian approach is Bayes’ theorem. The theorem originates from a simple factorization of a joint probability into the product of a conditional and a marginal probability. The ingenious idea of Thomas Bayes was to apply this principle to the parameters of a statistical model and to assume that the uncertainty about their “true” value can be described by a probability model. We illustrate how the posterior distribution arises and how it can be computed from prior and data information. The characteristics of the posterior distribution are illustrated for binary and Gaussian responses, and the most common posterior summary measures are discussed. Independent and dependent sampling techniques, including Markov chain Monte Carlo, for approximating the posterior distribution and its summary measures are discussed and illustrated. A brief and necessarily incomplete review of Bayesian software is then given. Most Bayesian analyses are based on parametric assumptions. Especially in the last decade, nonparametric Bayesian methods have emerged, but their theoretical level prevents us from treating them in depth here. Bayesian tools for model selection and model checking are also reviewed. Additional topics, together with suggestions for further reading, are treated in the final section.
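In symbols (using generic notation of our choosing, with \(\theta\) for the model parameters and \(y\) for the data), the factorization behind Bayes’ theorem and the resulting posterior can be sketched as
\[
  p(\theta, y) \;=\; p(y \mid \theta)\, p(\theta)
  \qquad\Longrightarrow\qquad
  p(\theta \mid y) \;=\; \frac{p(y \mid \theta)\, p(\theta)}{p(y)},
\]
with the marginal likelihood \( p(y) = \int p(y \mid \theta)\, p(\theta)\, d\theta \) acting as the normalizing constant. Here \( p(\theta) \) expresses the prior uncertainty about the parameters and \( p(y \mid \theta) \) is the likelihood contributed by the data.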
1.1 Introduction
Medical knowledge has expanded tremendously during the last century. This has given a boost to pharmaceutical and drug research. Following the thalidomide disaster (Kim and Scialli, 2011) in the late 1950s, the involvement of statistics and of statisticians has increased exponentially. Initially acting more as “policemen”, protecting medical researchers against over-interpreting positive results, statisticians have gradually become involved and pro-active in all stages of medical research, and more specifically also in drug research. The impact of statistics and statisticians on medical research truly cannot be overstated, especially over the last five decades. To a large extent, this is due to the ingenious and hard work of many statisticians, such as Armitage, Cochran, Fisher, Neyman and Pearson, to name a few.
Medical knowledge grows by setting up successive experiments to test theoretical conjectures about the mechanisms of action and the resulting effectiveness of healthcare interventions. Each result, whether a failure or a success, gives insight into the medical processes. This is the successful paradigm that pharmaceutical research has followed for many years. For instance, before drugs enter the market they undergo numerous tests, from pre-clinical studies through Phase I, Phase II, and Phase III studies. Even when approved and registered by regulatory authorities such as the US Food and Drug Administration (FDA) and the European Medicines Agency (EMA), large-scale studies are set up to evaluate the safety of the drugs. Nevertheless, despite this careful process of learning, the current way of accumulating knowledge has been heavily criticized, since it turns out that many (medical) scientific results cannot be reproduced (Baker, 2011).
The classical statistical approach, following the independent and somewhat adversarial developments of Fisher, on the one hand, and Neyman and Pearson, on the other, has brought much rigor into empirical medical research. However, classical tools such as the P-value are often misunderstood, overused and misused. In addition, while scientific knowledge is built up from successes and failures in the past, i.e. from learning from the past, the classical statistical tools do not allow us to explicitly incorporate past knowledge. The Bayesian approach, for a long time ignored and even opposed by many statisticians, allows us to incorporate historical information into current statistical analyses in a flexible way. Despite this important feature, until about the 1990s Bayesian analysis was largely considered a curiosity, because computational limitations made it impossible to tackle practical problems with Bayesian tools. This changed with the introduction of Markov chain Monte Carlo sampling techniques. Since then, the Bayesian approach has grown tremendously in popularity, certainly among statisticians and increasingly among practitioners.
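To make the incorporation of historical information concrete, the following minimal sketch (in Python; the historical counts, the current counts, and the choice of a conjugate Beta prior are all our illustrative assumptions, not taken from this chapter) shows one simple way a past study can be folded into a current binomial analysis:

# Minimal sketch: incorporating historical information via a conjugate
# Beta prior for a binomial success probability (all numbers hypothetical).
from scipy import stats

# Historical study (hypothetical): 18 successes out of 50 patients.
# Starting from a flat Beta(1, 1) prior, conjugate updating gives a
# Beta(18 + 1, 32 + 1) prior for the current analysis.
a_prior, b_prior = 18 + 1, 32 + 1

# Current study (hypothetical): 12 successes out of 30 patients.
successes, failures = 12, 18

# Conjugacy: the posterior is again a Beta distribution, obtained by
# adding the observed counts to the prior parameters.
a_post = a_prior + successes
b_post = b_prior + failures
posterior = stats.beta(a_post, b_post)

# Common posterior summary measures.
print("posterior mean:", posterior.mean())
print("95% credible interval:", posterior.interval(0.95))

In this sketch the prior parameters act as pseudo-counts of historical patients added to the current data, which is precisely the sense in which past knowledge enters the analysis.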