
- 264 pages
- English
- ePUB (mobile friendly)
Transformation and Weighting in Regression
About this book
This monograph provides a careful review of the major statistical techniques used to analyze regression data with nonconstant variability and skewness. The authors have developed statistical techniques, ranging from formal fitting methods to less formal graphical techniques, that can be applied to many problems across a range of disciplines, including pharmacokinetics, econometrics, biochemical assays, and fisheries research.
While the main focus of the book is on data transformation and weighting, it also draws upon ideas from diverse fields such as influence diagnostics, robustness, bootstrapping, nonparametric data smoothing, quasi-likelihood methods, errors-in-variables, and random coefficients. The authors discuss the computation of estimates and give numerous examples using real data. The book also includes an extensive treatment of estimating variance functions in regression.
Transformation and Weighting in Regression by Raymond J. Carroll and David Ruppert is available in PDF and ePUB format, catalogued under Mathematics, Probability & Statistics.
CHAPTER 1
Introduction
1.1 Preliminaries
When modeling data it is often assumed that, in the absence of randomness or error, one can predict a response y from a predictor x through the deterministic relationship
(1.1)  y = f(x, β)
where β is a regression parameter. Equation (1.1) is often a theoretical (biological or physical) model, but it may also be an empirical model that seems to work well in practice, e.g., a linear regression model. In either case, once β has been determined, the system is completely specified.
Observed data almost never fit a model exactly, so that β cannot be determined exactly. Such a situation occurs when the relationship (1.1) holds exactly, but we are able to observe the response y or the predictor x only with error. This is the useful perspective behind the ‘errors-in-variables’ literature, although it is not the standard explanation of why observed data do not fit models exactly. Most statistical analyses assume that we can measure the predictor x accurately, but that, given a value of x, the relationship with the observed response y is not deterministic. Partly this might be due to measurement error in y, but there are also other factors to consider. The lack of determinism could be a physical fact, e.g., the physical process generating an observed y from x might not be deterministic. Also, the model might not be exact, either because of slight misspecification or because certain other predictors have been omitted.
Understanding the variability of the response is important. Efficient estimation of the unknown parameter β requires knowledge of the structure of the errors. Also, understanding the distribution of y may be important in itself, particularly in prediction and calibration problems.
A good example of these issues is the relationship between the number of fish in a spawning population (‘spawners’ denoted by S) and the number of new fish eventually recruited into the fishery (‘recruits’ denoted by R). A standard theoretical deterministic model is due to Ricker (1954), and can be written as
(1.2)  R = β1 S exp(−β2 S)
for β2 ⩾ 0. This model is discussed at length in Chapter 4. In practice, the spawner–recruit relationship is not deterministic, and in fact there is considerable error in the equation. Partly, this is due to difficulties in measuring R and S, but perhaps even more important is the existence of unobserved environmental factors, e.g., ocean currents, pollution, predator depletion, and availability of food. Estimating the parameter β is important, but just as important for managing the fishery is understanding the distribution of the recruitment given the size of the spawning population. Only after this distribution is understood can one begin to assess the impact of different management strategies on the long-term productivity of the fishery.
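The shape of the Ricker curve is easy to examine numerically. Below is a minimal Python sketch; the parameter values β1 = 3.0 and β2 = 0.002 are illustrative assumptions, not estimates from any real fishery:

```python
import math

def ricker(S, beta1, beta2):
    """Ricker spawner-recruit curve: R = beta1 * S * exp(-beta2 * S)."""
    return beta1 * S * math.exp(-beta2 * S)

# Illustrative parameters (assumed, not fitted to data).
beta1, beta2 = 3.0, 0.002

# Recruitment first rises with the spawning stock, peaks at S = 1/beta2,
# and then falls again as density-dependent effects dominate.
print(ricker(100, beta1, beta2) < ricker(500, beta1, beta2))   # True
print(ricker(2000, beta1, beta2) < ricker(500, beta1, beta2))  # True
```

The dome shape is what distinguishes the Ricker model from a simple proportional recruitment rule: beyond the peak, adding more spawners actually reduces expected recruitment.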
In Ruppert et al. (1985) and Reish et al. (1985), a stochastic model for the Atlantic menhaden fishery is used to analyze the risk inherent in management strategies. The risk is largely due to the stochastic nature of recruitment, though estimation error in the parameters is also a factor. Certain harvesting policies, such as constant-catch policies where the same size catch is taken every year, are optimal for a deterministic fishery, but perform surprisingly poorly in stochastic situations. The main point is that the size and nature of management risk can only be assessed if a realistic model of recruitment variability (conditional on the size of the spawning population) is used.
1.2 The classical regression model
The usual regression model makes four basic assumptions and the analyses are often based on a fifth. The first four assumptions are that the model is correctly specified on average, and that the errors are independent, have the same distribution, and have constant variability, i.e.
(1.3)  E(y_i) = f(x_i, β)
(1.4)  Var(y_i) = σ²
(1.5)  ε_i = y_i − f(x_i, β), i = 1, …, n, are identically distributed
(1.6)  ε_1, …, ε_n are independent
Of course, (1.5) implies (1.4). Sometimes only (1.4) is assumed, so we list it separately. Most often, the parameter β is estimated by least squares, the basic motivation for which is that it is efficient if the errors are normally distributed. This leads to the fifth assumption, i.e.
(1.7)  ε_i ~ Normal(0, σ²)
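For a straight-line mean, the least-squares estimate has a well-known closed form. The sketch below is a minimal pure-Python illustration on synthetic, noise-free data (not an example from the book):

```python
def ols_line(x, y):
    """Closed-form least squares for the line y = b0 + b1*x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx           # slope estimate
    b0 = ybar - b1 * xbar    # intercept estimate
    return b0, b1

# Noise-free data on the line y = 2 + 0.5*x: the fit recovers it exactly.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0 + 0.5 * xi for xi in x]
b0, b1 = ols_line(x, y)
print(b0, b1)  # 2.0 0.5
```

Under assumptions (1.3)–(1.7) this estimator is efficient; the chapters that follow are concerned with what to do when (1.4), (1.5), or (1.7) fail.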
To see how these assumptions might work in practice, we consider the spawner–recruit relationship for the Skeena River sockeye salmon. The data are described in Chapter 4 and are plotted in Figure 4.1. In Figure 4.3, we plot the residuals against the predicted values. An examination of these two figures suggests that at least two of the major classical assumptions are strongly violated. The first assumption (1.3) seems reasonable, as the Ricker curve appears adequate. Whether the independence assumption (1.6) holds is not clear here, but it can be taken as a working hypothesis, especially in light of the small size of the data set. A check of the serial correlations of the residuals suggests that (1.6) is not seriously violated. As can be seen from either figure, the variability of the data increases as a function of the size of the spawning population, thus violating assumptions (1.4) and (1.5). The normality assumption (1.7) seems adequate, although there may be some right-skewness.
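A crude numerical version of the residuals-versus-fitted check is to compare the residual spread in the lower and upper halves of the fitted values. The sketch below uses deliberately synthetic data whose error magnitude is proportional to the mean (the Skeena data themselves are not reproduced here):

```python
# Synthetic fitted values and deterministic +/- "residuals" whose
# size is 10% of the fitted value, mimicking nonconstant variance.
fitted = [10, 20, 30, 40, 50, 60, 70, 80]
resid = [0.1 * f * (1 if i % 2 == 0 else -1) for i, f in enumerate(fitted)]

# Average squared residual in the low-fitted and high-fitted halves.
half = len(fitted) // 2
spread_low = sum(r * r for r in resid[:half]) / half
spread_high = sum(r * r for r in resid[half:]) / half
print(spread_high > spread_low)  # True: variability grows with the mean
```

A pattern like this, with residual spread increasing with the fitted value, is exactly what Figure 4.3 reveals for the salmon data, and it is evidence against (1.4) and (1.5).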
The major purpose of this book is to discuss methods for analyzing regression data when the classical assumptions of constant variance and/or normality are violated.
Having observed that the variances are not constant, we might take as a working hypothesis that the variances are proportional to the square of the mean. This is a model with a constant coefficient of variation (standard deviation divided by mean). This model involves no additional parameters, but requires an a priori assumption about the form of the variance function. A more flexible approach is to use a variance model with unknown parameters, for example to assume that the variance is proportional to some (unknown) power of the mean. To obtain a better estimate of the regression parameter β and to understand the distribution of the size of the recruit population given a spawning population size, two basic strategies can be employed.
First, we might continue to assume normally distributed obs...
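The constant-coefficient-of-variation hypothesis described above corresponds to weighted least squares with weights proportional to 1/mean². The sketch below is a hypothetical illustration for a straight-line mean, starting from an unweighted fit and reweighting once by the squared fitted values; it uses synthetic noise-free data and is not the general machinery developed later in the book:

```python
def wls_line(x, y, w):
    """Weighted least squares for y = b0 + b1*x with weights w."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - xbar) * (yi - ybar)
              for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1

# Constant-CV model: Var(y) proportional to mean^2, so weight each
# observation by 1 / fitted^2.  Start from equal weights (ordinary
# least squares), then reweight once using the fitted means.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0 + 2.0 * xi for xi in x]          # noise-free, for illustration
b0, b1 = wls_line(x, y, [1.0] * len(x))   # unweighted starting fit
fit = [b0 + b1 * xi for xi in x]
b0, b1 = wls_line(x, y, [1.0 / f ** 2 for f in fit])  # one reweighting
print(b0, b1)  # 3.0 2.0 (up to rounding; weighting leaves an exact fit unchanged)
```

With real, noisy data the weighted and unweighted estimates differ, and the weighted estimate is the more efficient one when the constant-CV variance model is correct; iterating the reweighting step is one form of the generalized least-squares algorithms treated in Chapter 2.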
Table of contents
- Cover
- Title Page
- Copyright Page
- Dedication
- Table of Contents
- Preface
- 1 Introduction
- 2 Generalized least squares and the analysis of heteroscedasticity
- 3 Estimation and inference for variance functions
- 4 The transform-both-sides methodology
- 5 Combining transformations and weighting
- 6 Influence and robustness
- 7 Technical complements
- 8 Some open problems
- References
- Author Index
- Subject Index