Mathematics

Least Squares Linear Regression

Least Squares Linear Regression is a statistical method used to find the best-fitting line through a set of data points. It minimizes the sum of the squares of the vertical distances between the data points and the line. This technique is commonly used for modeling and predicting relationships between variables in various fields such as economics, engineering, and social sciences.

Written by Perlego with AI-assistance

8 Key excerpts on "Least Squares Linear Regression"

  • Analysing Data in Statistics (Concepts and Applications)
    Chapter 2: Linear Regression and Least Squares Linear Regression. [Figure: example of linear regression with one independent variable.] In statistics, linear regression is an approach to modeling the relationship between a scalar variable y and one or more variables denoted X. In linear regression, the data are modeled using linear functions of X, and the unknown model parameters are estimated from the data. Such models are called “linear models.” Most commonly, linear regression refers to a model in which the conditional mean of y given the value of X is an affine function of X. Less commonly, linear regression could refer to a model in which the median, or some other quantile, of the conditional distribution of y given X is expressed as a linear function of X. Like all forms of regression analysis, linear regression focuses on the conditional probability distribution of y given X, rather than on the joint probability distribution of y and X, which is the domain of multivariate analysis.
    Linear regression was the first type of regression analysis to be studied rigorously, and to be used extensively in practical applications. This is because models which depend linearly on their unknown parameters are easier to fit than models which are non-linearly related to their parameters, and because the statistical properties of the resulting estimators are easier to determine. Linear regression has many practical uses. Most applications of linear regression fall into one of the following two broad categories:
      • If the goal is prediction, or forecasting, linear regression can be used to fit a predictive model to an observed data set of y and X values. After developing such a model, if an additional value of X is then given without its accompanying value of y, the fitted model can be used to make a prediction of the value of y.
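    A minimal Python sketch of the prediction use described above, with made-up (X, y) values rather than anything from the excerpted book: fit an affine function of X by ordinary least squares, then predict y for a new X supplied without its accompanying y.

        import numpy as np

        # Illustrative observed data set of (X, y) values (not from the excerpted book).
        X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
        y = np.array([2.1, 3.9, 6.2, 8.1, 9.8, 12.3])

        # Fit y as an affine function of X by ordinary least squares.
        slope, intercept = np.polyfit(X, y, deg=1)

        # Predict y for an additional X value given without its accompanying y.
        x_new = 7.5
        y_pred = intercept + slope * x_new
        print(f"fitted model: y = {intercept:.2f} + {slope:.2f} x; prediction at x = {x_new}: {y_pred:.2f}")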
  • A Whistle-Stop Tour of Statistics
    Linear Regression Models. 6.1 Simple Linear Regression. Regression: a frequently applied statistical technique that serves as a basis for studying and characterizing a system of interest, by formulating a reasonable mathematical model of the relationship between a response variable y and a set of p explanatory variables x1, x2, …, xp. The choice of an explicit form of the model may be based on previous knowledge of a system or on considerations such as ‘smoothness’ and continuity of y as a function of the explanatory variables (sometimes called the independent variables, although they are rarely independent; explanatory variables is the preferred term).
    Simple linear regression: a linear regression model with a single explanatory variable. The data consist of n pairs of values (y1, x1), (y2, x2), …, (yn, xn). The model for the observed values of the response variable is
        yi = β0 + β1 xi + εi,   i = 1, …, n,
    where β0 and β1 are, respectively, the intercept and slope parameters of the model and the εi are error terms assumed to have a N(0, σ²) distribution. The parameters β0 and β1 are estimated from the sample observations by least squares, i.e., by minimizing
        S = Σ_{i=1}^{n} εi² = Σ_{i=1}^{n} (yi − β0 − β1 xi)².
    Differentiating with respect to the two parameters gives
        ∂S/∂β0 = −2 Σ_{i=1}^{n} (yi − β0 − β1 xi),   ∂S/∂β1 = −2 Σ_{i=1}^{n} xi (yi − β0 − β1 xi).
    Setting ∂S/∂β0 = ∂S/∂β1 = 0 leads to the following estimators of the two model parameters:
        β̂0 = ȳ − β̂1 x̄,   β̂1 = Σ_{i=1}^{n} (xi − x̄)(yi − ȳ) / Σ_{i=1}^{n} (xi − x̄)².
    The variance σ² is estimated by s² = Σ_{i=1}^{n} (yi − ŷi)² / (n − 2). The estimated variance of the estimated slope parameter is Var(β̂1) = s² / Σ_{i=1}^{n} (xi − x̄)².
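    The closed-form estimators above translate directly into code. The sketch below uses invented sample values (any small data set with n ≥ 3 works) and simply evaluates the formulas for β̂1, β̂0, s², and Var(β̂1).

        import numpy as np

        # Illustrative (x, y) pairs; not taken from the excerpted book.
        x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
        y = np.array([2.0, 2.9, 4.2, 4.9, 6.1])
        n = len(x)

        # Least squares estimators from the formulas in the excerpt.
        beta1_hat = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
        beta0_hat = y.mean() - beta1_hat * x.mean()

        # Residual variance estimate: s^2 = sum (y_i - yhat_i)^2 / (n - 2)
        y_hat = beta0_hat + beta1_hat * x
        s2 = np.sum((y - y_hat) ** 2) / (n - 2)

        # Estimated variance of the slope estimator: s^2 / sum (x_i - xbar)^2
        var_beta1_hat = s2 / np.sum((x - x.mean()) ** 2)

        print(beta0_hat, beta1_hat, s2, var_beta1_hat)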
  • An Essential Guide to Business Statistics
    • Dawn A. Willoughby (Author)
    • 2016 (Publication Date)
    • Wiley (Publisher)
    We are specifically interested in the line that will give the best description of the linear relationship between the two variables. Simple linear regression involves finding the equation of this line which will provide the best possible prediction for the dependent variable based on the independent variable; it is known as the regression line or the line of best fit.
    Finding the Line of Best Fit. For any straight line that we choose to draw on a scatter diagram, there will be differences between each data point and the corresponding position on the straight line. These differences, also known as residuals, can be positive or negative values depending on whether the data point lies above or below the straight line. A graphical representation of this concept is shown below. [Figure: scatter diagram of y against x, with the residuals shown as vertical distances between the data points and the line.] Each residual is the difference between the actual y-value of the data point and the y-value that we would predict if we used the linear equation of this line for the prediction. These differences represent the random variation that occurs in the relationship between an independent and a dependent variable, as we described in the previous section. The total magnitude of the residuals, regardless of whether the residual is positive or negative, is a measure of the effectiveness of the line we have chosen in terms of how well it fits the data points. In finding the line of best fit, our aim is to draw the line which best fits the data points and so minimises these differences. This line will provide us with the best prediction for the dependent variable based on the values of the independent variable. To be able to identify this line, we need to calculate the gradient and y-intercept; this is achieved using the least squares method, which was developed by Adrien-Marie Legendre (1752–1833).
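    As a quick numerical illustration of this idea (with invented data, not the book's example), the sum of squared residuals can be evaluated for any candidate gradient and intercept; the least squares line is the one that minimises it.

        import numpy as np

        x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])      # independent variable (illustrative)
        y = np.array([2.2, 2.8, 4.5, 4.7, 6.1])      # dependent variable (illustrative)

        def sum_squared_residuals(gradient, intercept):
            """Total squared vertical distance between the data points and the line."""
            residuals = y - (intercept + gradient * x)
            return np.sum(residuals ** 2)

        # The least squares method gives the gradient and y-intercept of the line of best fit.
        gradient, intercept = np.polyfit(x, y, deg=1)

        print(sum_squared_residuals(gradient, intercept))        # minimal value
        print(sum_squared_residuals(gradient + 0.3, intercept))  # any other line does worse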
  • Applied Regression Analysis
    (For historical remarks, see Section 1.8.) Throughout this book we shall be most often concerned with relationships of the form Response variable = Model function + Random error. The model function will usually be "known" and of specified form and will involve the predictor variables as well as parameters to be estimated from data. The distribution of the random errors is often assumed to be a normal distribution with mean zero, and errors are usually assumed to be independent. All assumptions are usually checked after the model has been fitted, and many of these checks will be described. (Note: Many engineers and others call the parameters constants and the predictors parameters. Watch out for this possible difficulty in cross-discipline conversations!) We shall present the least squares method in the context of the simplest application, fitting the "best" straight line to given data in order to relate two variables X and Y, and will discuss how it can be extended to cases where more variables are involved.
    1.1. Straight Line Relationship Between Two Variables. In much experimental work we wish to investigate how the changes in one variable affect another variable. Sometimes two variables are linked by an exact straight line relationship. For example, if the resistance R of a simple circuit is kept constant, the current I varies directly with the voltage V applied, for, by Ohm's law, I = V/R. If we were not aware of Ohm's law, we might obtain this relationship empirically by making changes in V and observing I, while keeping R fixed, and then observing that the plot of I against V more or less gave a straight line through the origin. We say "more or less" because, although the relationship actually is exact, our measurements may be subject to slight errors and thus the plotted points would probably not fall exactly on the line but would vary randomly about it.
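    A small sketch of this empirical approach, assuming a made-up fixed resistance and slight random measurement error; fitting a straight line to the simulated (V, I) points recovers a slope close to 1/R.

        import numpy as np

        rng = np.random.default_rng(42)

        R = 5.0                                      # resistance held fixed (ohms), illustrative value
        V = np.linspace(1.0, 10.0, 20)               # applied voltages
        I = V / R + rng.normal(0.0, 0.01, V.size)    # Ohm's law plus slight measurement error

        # Fit a straight line through the noisy (V, I) points.
        slope, intercept = np.polyfit(V, I, deg=1)
        print(f"estimated 1/R = {slope:.4f} (true value {1/R:.4f}), intercept = {intercept:.4f}")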
  • Linear Regression Analysis: Theory And Computing
    The dependent variable is also called the response variable, and the independent variable is called the explanatory or predictor variable. An explanatory variable explains causal changes in the response variable. A more general presentation of a regression model may be written as y = E(y) + ε, where E(y) is the mathematical expectation of the response variable. When E(y) is a linear combination of explanatory variables x1, x2, …, xk, the regression is linear regression. If k = 1, the regression is simple linear regression. If E(y) is a nonlinear function of x1, x2, …, xk, the regression is nonlinear. The classical assumptions on the error term are E(ε) = 0 and a constant variance Var(ε) = σ². The typical experiment for simple linear regression is that we observe n pairs of data (x1, y1), (x2, y2), …, (xn, yn) from a scientific experiment, and the model in terms of the n pairs of data can be written as
        yi = β0 + β1 xi + εi   for i = 1, 2, …, n,
    with E(εi) = 0, a constant variance Var(εi) = σ², and all εi's independent. Note that the actual value of σ² is usually unknown. The values of the xi's are measured "exactly", with no measurement error involved. After the model is specified and data are collected, the next step is to find "good" estimates of β0 and β1 for the simple linear regression model that can best describe the data that came from the scientific experiment. We will derive these estimates and discuss their statistical properties in the next section.
    2.2 Least Squares Estimation. The least squares principle for the simple linear regression model is to find the estimates b0 and b1 such that the sum of the squared distances between the actual responses yi and the predicted responses ŷi = β0 + β1 xi reaches its minimum among all possible choices of the regression coefficients β0 and β1, i.e.,
        (b0, b1) = argmin over (β0, β1) of Σ_{i=1}^{n} [yi − (β0 + β1 xi)]².
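    One way to check this definition numerically is to minimise the sum of squared distances directly and compare the answer with the closed-form least squares estimates. The sketch below does that with invented data and standard NumPy/SciPy calls.

        import numpy as np
        from scipy.optimize import minimize

        # Illustrative data from a hypothetical experiment.
        x = np.array([0.5, 1.0, 1.5, 2.0, 2.5, 3.0])
        y = np.array([1.1, 1.9, 3.2, 3.8, 5.1, 5.9])

        def sse(params):
            """Sum of squared distances between actual and predicted responses."""
            b0, b1 = params
            return np.sum((y - (b0 + b1 * x)) ** 2)

        # Numerically search for (b0, b1) = argmin of the sum of squared distances.
        result = minimize(sse, x0=[0.0, 0.0])
        b0_num, b1_num = result.x

        # Closed-form least squares estimates for comparison.
        b1_cf = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
        b0_cf = y.mean() - b1_cf * x.mean()

        print(b0_num, b1_num)   # agrees with the closed form up to numerical tolerance
        print(b0_cf, b1_cf)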
  • Experimental Methods for Science and Engineering Students: An Introduction to the Analysis and Presentation of Data
    Fitting a Line to x–y Data Using the Least Squares Method: …where m and c refer to the slope and intercept of the line shown in Figure 6.2. Δyi is the difference between the observed and predicted y values and is termed the residual, given by
        Δyi = yi − ŷi.   (6.2)
    As we move the line around in an effort to find the position where the line passes closest to the majority of points, Δyi for each point changes. A criterion is required by which we can decide the best position for the line. This position – and therefore the best values for m and c – is found by applying a theory from statistics called the principle of maximum likelihood.³ This predicts that the best line will be found by minimising the sum of the squares of the residuals. Writing the sum of squares of residuals as SSR, we can say
        SSR = (Δy1)² + (Δy2)² + (Δy3)² + … + (Δyn)²,
    which can be abbreviated to
        SSR = Σ_{i=1}^{n} (Δyi)².   (6.3)
    The summation indicates that the square of the residuals must be added up for all the data from i = 1 to i = n, where n is the number of data. To make future equations more compact, we omit the limits of the summation and assume that all sums are calculated from i = 1 to i = n. Replacing Δyi in equation 6.3 by yi − ŷi and replacing ŷi by mxi + c, we can write
        SSR = Σ (yi − (mxi + c))².   (6.4)
    We require values for m and c that reduce SSR to the smallest possible value.
    [Figure 6.2: x–y graph showing the residual, Δyi.]
    ³ The principle of maximum likelihood considers the probability of obtaining the observed values of x and y during an experiment and asserts that a particular combination of m and c values describes the linear relationship which caused those values to arise. By finding the values of m and c which will make the probability of obtaining the observed set of y values a maximum, the best values of m and c are obtained. For more detail, see Taylor (1997), chapter 8.
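    Setting the partial derivatives of SSR with respect to m and c to zero is the standard next step; it gives a pair of linear "normal equations". The sketch below (invented data, a sketch rather than the book's worked example) solves them directly and evaluates SSR at the minimum.

        import numpy as np

        # Illustrative x-y data.
        x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
        y = np.array([1.2, 2.1, 2.9, 4.2, 4.8, 6.1])
        n = len(x)

        # dSSR/dm = 0 and dSSR/dc = 0 give:  m*sum(x^2) + c*sum(x) = sum(x*y)
        #                                    m*sum(x)   + c*n      = sum(y)
        A = np.array([[np.sum(x**2), np.sum(x)],
                      [np.sum(x),    n        ]])
        b = np.array([np.sum(x * y), np.sum(y)])
        m, c = np.linalg.solve(A, b)

        SSR = np.sum((y - (m * x + c)) ** 2)   # equation (6.4), evaluated at the minimum
        print(m, c, SSR)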
  • Contemporary Statistical Models for the Plant and Soil Sciences
    • Oliver Schabenberger, Francis J. Pierce (Authors)
    • 2001 (Publication Date)
    • CRC Press (Publisher)
    This variation of the sum of squares reduction test [test statistic given as equation 4.48 in the original] is due to Schrader and McKean (1977) and Schrader and Hettmansperger (1980), and p-values are approximated accordingly. The test of Schrader and Hettmansperger (1980) does not adjust the variance of the M-estimates for the fact that they are weighted unequally, as one would in weighted least squares. This is justifiable since the weights are random, whereas they are considered fixed in weighted least squares. To test the general linear hypothesis H0: Aβ = d, we prefer a test proposed by Birch and Agard (1993), which is a direct analog of the test in Gaussian linear models with unequal variances. The test statistic is a quadratic form in Aβ̂ − d built from A(X'WX)⁻¹A', where the degrees of freedom are given by the rank of A and the variance estimator is the one proposed in equation 4.49 of the original. Here, ψ is the first derivative of the M-estimation objective function, and W is a diagonal weight matrix whose entries depend on ψ. Through simulation studies it was shown that this test has very appealing properties with respect to size (significance level) and power. Furthermore, in the case that yields the least squares estimates, the variance estimator reduces to the traditional residual mean square error estimate.
    4.6.3 Robust Regression for the Prediction Efficiency Data. When fitting a quadratic polynomial to the Prediction Efficiency data by least squares and invoking appropriate model diagnostics, it was noted that the data point corresponding to the Magnesium observation was an outlier (see Table 4.13, p. 130). There is no evidence suggesting that the particular observation is due to a measurement or execution error, and we want to retain it in the data set. The case deletion diagnostics in Table 4.13 suggest that the model fit may change considerably, depending on whether the data point is included or excluded. We fit the quadratic polynomial to these data with several methods.
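    The test statistics above depend on equations not reproduced in the excerpt, but the underlying workflow, refitting a quadratic polynomial by both least squares and a robust M-estimator when an outlier is retained, can be sketched with standard statsmodels calls. The data below are invented, with one injected outlier; this is an illustration, not the book's Prediction Efficiency analysis.

        import numpy as np
        import statsmodels.api as sm

        rng = np.random.default_rng(1)
        x = np.linspace(0.0, 10.0, 30)
        y = 2.0 + 1.5 * x - 0.10 * x**2 + rng.normal(0.0, 1.0, x.size)
        y[5] += 15.0                                     # one gross outlier, retained in the data set

        X = sm.add_constant(np.column_stack([x, x**2]))  # quadratic polynomial design matrix

        ols_fit = sm.OLS(y, X).fit()                              # ordinary least squares
        rlm_fit = sm.RLM(y, X, M=sm.robust.norms.HuberT()).fit()  # robust M-estimation (Huber)

        print("OLS coefficients:   ", ols_fit.params)
        print("Robust coefficients:", rlm_fit.params)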
  • Essentials of Statistics for Business & Economics
    • David Anderson, Dennis Sweeney, Thomas Williams, Jeffrey Camm (Authors)
    • 2019 (Publication Date)
    If x and y are linearly related, we must have β1 ≠ 0. The purpose of the t test is to see whether we can conclude that β1 ≠ 0. We will use the sample data to test the following hypotheses about the parameter β1:
        H0: β1 = 0
        Ha: β1 ≠ 0
    If H0 is rejected, we will conclude that β1 ≠ 0 and that a statistically significant relationship exists between the two variables. However, if H0 cannot be rejected, we will have insufficient evidence to conclude that a significant relationship exists. The properties of the sampling distribution of b1, the least squares estimator of β1, provide the basis for the hypothesis test.
    First, let us consider what would happen if we used a different random sample for the same regression study. For example, suppose that Armand's Pizza Parlors used the sales records of a different sample of 10 restaurants. A regression analysis of this new sample might result in an estimated regression equation similar to our previous estimated regression equation ŷ = 60 + 5x. However, it is doubtful that we would obtain exactly the same equation (with an intercept of exactly 60 and a slope of exactly 5). Indeed, b0 and b1, the least squares estimators, are sample statistics with their own sampling distributions. The properties of the sampling distribution of b1 follow.
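    A compact Python sketch of this t test. The x (student population, in thousands) and y (quarterly sales, in $1000s) values below are the figures commonly reproduced for the Armand's Pizza Parlors example and yield the fitted equation ŷ = 60 + 5x; the slope's t statistic is b1 divided by its estimated standard error.

        import numpy as np
        from scipy import stats

        # Student population (1000s) and quarterly sales ($1000s) for 10 restaurants,
        # as the Armand's Pizza Parlors example is usually reproduced (treat as illustrative).
        x = np.array([2, 6, 8, 8, 12, 16, 20, 20, 22, 26])
        y = np.array([58, 105, 88, 118, 117, 137, 157, 169, 149, 202])

        res = stats.linregress(x, y)          # least squares fit: yhat = intercept + slope * x
        t_stat = res.slope / res.stderr       # t = b1 / s_b1, tests H0: beta1 = 0
        df = len(x) - 2
        p_value = 2 * stats.t.sf(abs(t_stat), df)   # two-sided p-value; matches res.pvalue

        print(f"yhat = {res.intercept:.1f} + {res.slope:.1f} x")
        print(f"t = {t_stat:.2f}, p = {p_value:.4f}")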
Index pages curate the most relevant extracts from our library of academic textbooks. They’ve been created using an in-house natural language model (NLM), each adding context and meaning to key research topics.