Technology & Engineering
Least Squares Fitting
Least Squares Fitting is a mathematical method used to find the best-fitting curve to a set of data points. It minimizes the sum of the squares of the differences between the observed and predicted values. This technique is commonly used in various fields, including engineering, statistics, and data analysis, to model relationships between variables and make predictions.
Written by Perlego with AI-assistance
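The definition above can be made concrete with a short sketch. The code below is illustrative only (the data points are invented) and uses the standard closed-form least squares solution for a straight line:

```python
# Least squares straight-line fit: choose slope m and intercept b that
# minimize the sum of squared differences between observed and predicted y.
def fit_line(xs, ys):
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Closed-form least squares solution for y = m*x + b
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    m = sxy / sxx
    b = y_mean - m * x_mean
    return m, b

# Illustrative data: y is roughly 2x + 1 with small deviations
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]
m, b = fit_line(xs, ys)

# Sum of squared residuals at the fitted line; by construction no other
# (m, b) pair can make this smaller.
sse = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
```

Any other candidate line, for instance the "true" y = 2x + 1, yields a larger sum of squared residuals on this data, which is exactly the minimization property described above.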
12 Key excerpts on "Least Squares Fitting"
- Timothy DelSole, Michael Tippett(Authors)
- 2022(Publication Date)
- Cambridge University Press(Publisher)
8 Linear Regression: Least Squares Estimation

"In most investigations where the object is to deduce the most accurate possible results from observational measurements, we are led to a system of equations of the form E = a + bx + cy + fz + &c., in which a, b, c, f, &c. are known coefficients, varying from one equation to the other, and x, y, z, &c. are unknown quantities, to be determined by the condition that each value of E is reduced either to zero, or to a very small quantity… Of all the principles that can be proposed for this purpose, I think there is none more general, more exact, or easier to apply, than that which we have used in this work; it consists of making the sum of the squares of the errors a minimum. By this method, a kind of equilibrium is established among the errors which, since it prevents the extremes from dominating, is appropriate for revealing the state of the system which most nearly approaches the truth."
Legendre, 1805 (note: "&c." is an old form of "et cetera")

Some variables can be modeled by an equation in which one variable equals a linear combination of other random variables, plus random noise. Such models are used to quantify the relation between variables, to make predictions, and to test hypotheses about the relation between variables. After identifying the variables to include in a model, the next step is to estimate the coefficients that multiply them, which are known as the regression parameters. This chapter discusses the least squares method for estimating regression parameters. The least squares method estimates the parameters by minimizing the sum of squared differences between the fitted model and the data. This chapter also describes measures of goodness of fit and an illuminating geometric interpretation of least squares fitting. The least squares method is illustrated on various routine calculations in weather and climate analysis (e.g., fitting a trend).
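The chapter's closing example, fitting a trend, can be sketched in a few lines of Python. The annual values below are invented for illustration; the final checks reflect the geometric interpretation the chapter mentions, namely that at the least squares minimum the residuals are orthogonal to the predictors (a constant and time):

```python
# Least squares trend fit to a short, invented annual series.
years = [0, 1, 2, 3, 4]                 # years since start of record
temps = [14.1, 14.3, 14.2, 14.6, 14.8]  # hypothetical annual means (deg C)

n = len(years)
t_mean = sum(years) / n
y_mean = sum(temps) / n
# Closed-form least squares slope (the trend) and intercept
trend = sum((t - t_mean) * (y - y_mean) for t, y in zip(years, temps)) \
    / sum((t - t_mean) ** 2 for t in years)
intercept = y_mean - trend * t_mean

residuals = [y - (intercept + trend * t) for t, y in zip(years, temps)]
# Normal equations: at the minimum, the residuals sum to zero and are
# uncorrelated with time.
sum_r = sum(residuals)
sum_rt = sum(r * t for r, t in zip(residuals, years))
```

Both sums come out to zero (up to floating-point error), which is the normal-equations condition for the minimizing parameters.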
- John L. Crassidis, John L. Junkins(Authors)
- 2011(Publication Date)
- Chapman and Hall/CRC(Publisher)
1 Least Squares Approximation

Theory attracts practice as the magnet attracts iron. (Gauss, Karl Friedrich)

The celebrated concept of least squares approximation is introduced in this chapter. Least squares can be used in a wide variety of applications, including: curve fitting of data, parameter identification, and system model realization. Many examples from diverse fields fall under these categories, for instance determining the damping properties of a fluid-filled damper as a function of temperature, identification of aircraft dynamic and static aerodynamic coefficients, orbit and attitude determination, position determination using triangulation, and modal identification of vibratory systems. Even modern control strategies, for instance certain adaptive controllers, use the least squares approximation to update model parameters in the control system. The broad utility implicit in the aforementioned examples strongly confirms that the least squares approximation is worthy of study.

Before we begin analytical and mathematical discussions, let us first define some common quantities used throughout this chapter and the text. For any variable or parameter in estimation, there are three quantities of interest: the true value, the measured value, and the estimated value. The true value (or "truth") is usually unknown in practice. This represents the actual value sought of the quantity being approximated by the estimator. Unadorned symbols are used to represent the true values. The measured value denotes the quantity which is directly determined from a sensor. For example, in orbit determination a radar is often used to obtain a measure of the range to a vehicle. In actuality, this is not a totally accurate statement, since the truly measured quantity given by the radar is not the range.
- Puneet Singla, John L. Junkins(Authors)
- 2008(Publication Date)
- Chapman and Hall/CRC(Publisher)
1 Least Squares Methods

Life stands before me like an eternal spring with new and brilliant clothes. (C. F. Gauss)

1.1 Introduction

In all branches of engineering, various system processes are generally characterized by mathematical models. Controller design, optimization, fault detection, and many other advanced engineering techniques are based upon mathematical models of various system processes. The accuracy of the mathematical models directly affects the accuracy of the system design and/or control performance. As a consequence, there is a great demand for the development of advanced modeling algorithms that can adequately represent the system behavior. However, different system processes have their own unique characteristics which they do not share with other structurally different systems. Obviously the mathematical structures of engineering models are very diverse; they can be simple algebraic models, may involve differential, integral or difference equations, or may be a hybrid of these. Further, many different factors, like intended use of the model, problem dimensionality, quality of the measurement data, offline or online learning, etc., can result in ad-hoc decisions leading to an inappropriate model architecture. For the simplest input-output relationship, the mapping from the state to the measurable quantities is approximated adequately by a linear algebraic equation:

Ȳ = a₁x₁ + a₂x₂ + ··· + aₙxₙ   (1.1)

where Ȳ and aᵢ denote the measured variables and xᵢ denotes the unknown parameters that characterize the system. So the problem reduces to the estimation of the true but unknown parameters (xᵢ) from certain data measurements. When the approximation implicit in Eq. (1.1) is satisfactory, we have a linear algebraic estimation problem. The problem of linear parameter estimation arises in a variety of engineering and applied science disciplines such
- Y. Zhu(Author)
- 2001(Publication Date)
- Elsevier Science(Publisher)
Chapter 4: Identification by the Least-Squares Method
The least-squares principle was invented by Karl Gauss at the end of the eighteenth century for determining the orbits of planets. Since then this method has become a major tool for parameter estimation using experimental data. Most existing parametric identification methods can be related to the least-squares method. The method is easy to comprehend and, due to the existence of a closed-form solution, it is also easy to implement. The least-squares method is also called linear regression (in statistical literature) and the equation error method (in identification literature).

Section 4.1 will introduce the principle of least-squares; in Section 4.2 the method will be applied to the estimation of finite impulse response (FIR) models and the estimation of parametric models. Before making assumptions and pursuing a theoretical analysis of the method, we first test the method on two industrial processes, a single stand rolling mill and a glass tube production process (Section 4.3). The method is successful for the first process; but it fails for the second one. Why? In Section 4.4 some theoretical analysis is carried out, and reasons are given why the least-squares method can fail. Finally in Section 4.5 we will draw conclusions about the least-squares method.

4.1 The Principle of Least-Squares
The least-squares technique is a mathematical procedure by which the unknown parameters of a mathematical model are chosen (estimated) such that the sum of the squares of some chosen error is minimized. Suppose a mathematical model is given in the form

y(t) = θ₁x₁(t) + θ₂x₂(t) + ··· + θₙxₙ(t)   (4.1.1)

where y(t) is the observed variable, {θ₁, θ₂, ···, θₙ} is a set of constant parameters, and x₁(t), x₂(t), ···, xₙ(t) are known functions that may depend on other known variables. The variable t often denotes time. Assume that N samples of measurements of y(t) and x₁(t), x₂(t), ···, xₙ(t) are made at times 1, 2, ···, N. Filling the data samples into equation (4.1.1)
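The setup above can be sketched numerically. Assuming NumPy is available, the snippet below stacks N invented samples of a linear-in-parameters model into a regressor matrix and solves the normal equations; the names (Phi, theta_hat) are our own, not the book's:

```python
import numpy as np

# Model: y(t) = theta_1*x_1(t) + theta_2*x_2(t). Stacking the N samples
# gives y = Phi @ theta, where row t of Phi is [x_1(t), x_2(t)]; the least
# squares estimate solves the normal equations (Phi^T Phi) theta = Phi^T y.
true_theta = np.array([2.0, -1.0])      # "unknown" parameters (for the demo)
Phi = np.array([[1.0, 0.0],
                [1.0, 1.0],
                [1.0, 2.0],
                [1.0, 3.0]])            # known regressors x_i(t), N = 4
# Observations: model output plus a small invented disturbance
y = Phi @ true_theta + np.array([0.05, -0.05, 0.05, -0.05])

theta_hat = np.linalg.solve(Phi.T @ Phi, Phi.T @ y)
```

The estimate lands close to the true parameters; the small offset is the least squares compromise over the disturbance terms.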
- Frederick Mosteller, Stephen E. Fienberg, Robert E.K. Rourke, Stephen E. Fienberg, Robert E.K. Rourke(Authors)
- 2013(Publication Date)
- Dover Publications(Publisher)
11 Fitting Straight Lines Using Least Squares

Learning Objectives
1. Predicting the values of new observations using linear relations
2. Reviewing the aims of fitting straight lines to data
3. Choosing a criterion for fitting straight lines
4. Fitting straight lines to data using the method of least squares to minimize the sum of the squares of the residuals

11-1 PLOTTING DATA AND PREDICTING
Were predicting the outcomes or values of future events not such a difficult job, we would all be rich. To make prediction a reasonable statistical task, we need information on approximate relationships between the variable whose value we would like to predict and other variables whose values we can observe or control, such as the relationship between the breaking strength of pottery and its temperature at firing.

Today, high-speed computers can do most of the computational work for plotting data and fitting linear relations. In this chapter we give, without proof, formulas for fitting straight lines to data. By using such formulas, many people have written computer programs to help us compute. We need to learn the logic behind the formulas and then let the computer do the work of calculating and printing out the information we request.

In Chapter 3 we explored in an informal manner how to find and summarize approximate linear relationships. We now offer formal methods that are widely used in practice for fitting and examining linear relations. We turn to an example that uses these methods for prediction purposes.

EXAMPLE 1 Winning speeds at the Indianapolis 500. The most famous American automobile race, the Indianapolis 500, is held each Memorial Day at the Indianapolis Speedway. Owing to advancing technology, each year the racers tend to go faster than in previous years. We can assess this phenomenon by examining the speed of the winning drivers over a period of years. Table 11-1
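The prediction task described in the example can be sketched as follows. The winning speeds below are invented stand-ins, not actual Indianapolis 500 results; the formulas are the standard closed-form least squares slope and intercept:

```python
# Fit a straight line to year/winning-speed pairs, then extrapolate
# one year ahead. All speed values here are hypothetical.
years = [1961, 1962, 1963, 1964, 1965]
speeds = [139.0, 140.3, 143.1, 147.4, 150.7]  # mph, invented

n = len(years)
xbar = sum(years) / n
ybar = sum(speeds) / n
# Least squares slope and intercept for speed = intercept + slope*year
slope = sum((x - xbar) * (y - ybar) for x, y in zip(years, speeds)) \
    / sum((x - xbar) ** 2 for x in years)
intercept = ybar - slope * xbar

# Predict the (hypothetical) 1966 winning speed
predicted_1966 = intercept + slope * 1966
```

This is the whole logic of the chapter's prediction use case: estimate the linear relation from past pairs, then evaluate it at a new value of the predictor.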
Uncertainty Analysis for Engineers and Scientists
A Practical Guide
- Faith A. Morrison(Author)
- 2021(Publication Date)
- Cambridge University Press(Publisher)
We focus on empirical models. The principal discussion of the chapter is how to fit empirical models to data using an ordinary least-squares algorithm. We also address the uncertainty associated with model parameters and the uncertainty associated with values obtained from model predictions.

6.2 Least Squares of Linear Models

The first step in fitting an empirical model to data is to choose the form of the model. The second step is to fit the model to the data, that is, to identify values of the parameters in the model so that the model, with those parameters, represents the data well. For example, if the model is a straight line, y = mx + b, we are looking for the values of slope m and y-intercept b that give the "best fit" of the model to the data. Numerous algorithms have been developed to fit a model to data. The basic idea behind these algorithms is to minimize the differences between what the model predicts at various points and the values of the observed data at these points. In this section, we describe one of the most common techniques for model fitting, ordinary least-squares linear regression. We first discuss the idea behind "least squares," and then we present how to assess uncertainties related to the model's parameters and predictions. As we discuss least-squares model fitting, we also introduce the software tools that are an important part of empirical modeling. The calculations for linear regression can be tedious to perform, particularly for a large number of data points, but the algorithm is easy to program into software. We pause here to describe our strategy for presenting the Excel and MATLAB software tools for ordinary least-squares linear regression.

6.2.1 Software Tools for Linear Regression

Both Excel and MATLAB are effective at carrying out error calculations, and which tool you choose is a matter of personal preference.
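A rough Python equivalent of the calculation this chapter builds up to, a straight-line fit plus standard errors for the slope and intercept, is sketched below with illustrative data. The standard-error formulas are the usual ones for ordinary least squares with n − 2 degrees of freedom:

```python
import math

# Straight-line fit y = m*x + b with parameter uncertainties.
# The data are invented for illustration.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]

n = len(xs)
xbar = sum(xs) / n
ybar = sum(ys) / n
sxx = sum((x - xbar) ** 2 for x in xs)
m = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / sxx
b = ybar - m * xbar

# Residual variance with n - 2 degrees of freedom (two fitted parameters)
sse = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))
s2 = sse / (n - 2)
se_m = math.sqrt(s2 / sxx)                        # standard error of slope
se_b = math.sqrt(s2 * (1 / n + xbar**2 / sxx))    # standard error of intercept
```

The standard errors quantify how much the fitted slope and intercept would be expected to vary under repeated noisy measurements, which is the "uncertainty associated with model parameters" the chapter refers to.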
- Norman R. Draper, Harry Smith(Authors)
- 2014(Publication Date)
- Wiley-Interscience(Publisher)
CHAPTER 1 Fitting a Straight Line by Least Squares

1.0. INTRODUCTION: THE NEED FOR STATISTICAL ANALYSIS

In today's industrial processes, there is no shortage of "information." No matter how small or how straightforward a process may be, measuring instruments abound. They tell us such things as input temperature, concentration of reactant, percent catalyst, steam temperature, consumption rate, pressure, and so on, depending on the characteristics of the process being studied. Some of these readings are available at regular intervals, every five minutes perhaps or every half hour; others are observed continuously. Still other readings are available with a little extra time and effort. Samples of the end product may be taken at intervals and, after analysis, may provide measurements of such things as purity, percent yield, glossiness, breaking strength, color, or whatever other properties of the end product are important to the manufacturer or user.

In research laboratories, experiments are being performed daily. These are usually small, carefully planned studies and result in sets of data of modest size. The objective is often a quick yet accurate analysis, enabling the experimenter to move on to "better" experimental conditions, which will produce a product with desirable characteristics. Additional data can easily be obtained if needed, however, if the decision is initially unclear.

A Ph.D. researcher may travel into an African jungle for a one-year period of intensive data-gathering on plants or animals. She will return with the raw material for her thesis and will put much effort into analyzing the data she has, searching for the messages that they contain. It will not be easy to obtain more data once her trip is completed, so she must carefully analyze every aspect of what data she has.

Regression analysis is a technique that can be used in any of these situations.
- K. Akbar Ansari Ph.D., P.E., Bonni Dichone Ph.D.(Authors)
- 2019(Publication Date)
- SDC Publications(Publisher)
One way to include this weighting is to make multiple inclusions of the associated data point in the regression analysis. For example, if the following data is given and the point (2, 20) is to be assigned a "weighting factor" of 3, this data point must simply be considered thrice in coming up with a curve-fit as shown below.

Given data:
x: 1  2  3  4
y: 10 20 30 40

Information to be used for curve-fitting:
x: 1  2  2  2  3  4
y: 10 20 20 20 30 40

6.2 The Method of Least Squares

With a well-chosen approximating function, a "least squares" fit will yield a good representation of experimental data. Suppose you want to measure the distance between two points in a field, and let us say that you do this n times. You will come up with n measurements which are likely to be somewhat different from one another. Let these be d₁, d₂, d₃, ..., dₙ. If the true distance is D, then the sum of the squares of the deviations from the true distance D is

S = (d₁ − D)² + (d₂ − D)² + ··· + (dₙ − D)².   (6.1)

This sum S will be a maximum or a minimum when dS/dD = 0, which yields

Σᵢ₌₁ⁿ dᵢ − nD = 0   (6.2)

or

D = (Σᵢ₌₁ⁿ dᵢ)/n.   (6.3)

For S to be a minimum, d²S/dD² > 0, which it is, since it comes out to be 2n. Thus, S will be a minimum when D = (Σᵢ₌₁ⁿ dᵢ)/n; that is, if n measurements are taken, the true distance will be the arithmetic mean of the n measurements if the sum of the squares of the deviations is to be a minimum.

6.3 Straight Line Regression

Let us assume that a plot of the given data suggests that we should fit it with a linear function. Let this function be

f(x) = C₁ + C₂x   (6.4)

where C₁ and C₂ are coefficients to be determined.
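Both claims in this excerpt can be checked numerically: that the arithmetic mean minimizes the sum of squared deviations S, and that repeating the point (2, 20) three times reproduces a weight-3 weighted fit. The distance measurements below are invented; the x-y table is the excerpt's own:

```python
# Two claims from the excerpt, checked numerically.
def weighted_line_fit(xs, ys, ws):
    """Weighted least squares fit of f(x) = C1 + C2*x; returns (C1, C2)."""
    sw = sum(ws)
    xbar = sum(w * x for w, x in zip(ws, xs)) / sw
    ybar = sum(w * y for w, y in zip(ws, ys)) / sw
    c2 = sum(w * (x - xbar) * (y - ybar) for w, x, y in zip(ws, xs, ys)) \
        / sum(w * (x - xbar) ** 2 for w, x in zip(ws, xs))
    return ybar - c2 * xbar, c2

# (1) For repeated measurements d_i, the D minimizing S = sum (d_i - D)^2
#     is the arithmetic mean (the content of (6.1)-(6.3)).
d = [10.1, 9.8, 10.0, 10.3]   # invented repeated distance measurements
D = sum(d) / len(d)

def S(D0):
    return sum((di - D0) ** 2 for di in d)

S_min, S_up, S_down = S(D), S(D + 0.01), S(D - 0.01)

# (2) Repeating the point (2, 20) three times in an ordinary fit matches
#     giving it weight 3 in a weighted fit.
fit_rep = weighted_line_fit([1, 2, 2, 2, 3, 4],
                            [10, 20, 20, 20, 30, 40], [1] * 6)
fit_wtd = weighted_line_fit([1, 2, 3, 4], [10, 20, 30, 40], [1, 3, 1, 1])
```

On this particular table the data lie exactly on y = 10x, so both fits return the same line; the equivalence of duplication and weighting holds for noisy data as well, since duplicating a point simply repeats its term in the sum of squares.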
- Norman R. Draper, Harry Smith(Authors)
- 2014(Publication Date)
- Wiley-Interscience(Publisher)
CHAPTER 1 Fitting a Straight Line by Least Squares

1.0. INTRODUCTION: THE NEED FOR STATISTICAL ANALYSIS
In today’s industrial processes, there is no shortage of “information.” No matter how small or how straightforward a process may be, measuring instruments abound. They tell us such things as input temperature, concentration of reactant, percent catalyst, steam temperature, consumption rate, pressure, and so on, depending on the characteristics of the process being studied. Some of these readings are available at regular intervals, every five minutes perhaps or every half hour; others are observed continuously. Still other readings are available with a little extra time and effort. Samples of the end product may be taken at intervals and, after analysis, may provide measurements of such things as purity, percent yield, glossiness, breaking strength, color, or whatever other properties of the end product are important to the manufacturer or user.

In research laboratories, experiments are being performed daily. These are usually small, carefully planned studies and result in sets of data of modest size. The objective is often a quick yet accurate analysis, enabling the experimenter to move on to “better” experimental conditions, which will produce a product with desirable characteristics. Additional data can easily be obtained if needed, however, if the decision is initially unclear.

A Ph.D. researcher may travel into an African jungle for a one-year period of intensive data-gathering on plants or animals. She will return with the raw material for her thesis and will put much effort into analyzing the data she has, searching for the messages that they contain. It will not be easy to obtain more data once her trip is completed, so she must carefully analyze every aspect of what data she has.

Regression analysis is a technique that can be used in any of these situations.
Our purpose in this book is to explain in some detail something of the technique of extracting, from data of the types just mentioned, the main features of the relationships hidden or implied in the tabulated figures. (Nevertheless, the study of regression analysis techniques will also provide certain insights into how to plan the collection of data, when the opportunity arises. See, for example, Section 3.3.)
- Robert E. White(Author)
- 2006(Publication Date)
- Chapman and Hall/CRC(Publisher)
The objective is to extend both the geometric and residual approaches to additional data points (rows in A) and to additional model parameters (columns in A) so that more complicated and accurate data modeling can be done. This section will focus on more data points with two model parameters. Applications will be given to sales prediction, radioactive decay and population models.

4.1.1 The Least Squares Problem

Consider A = [a b] with two n × 1 columns, where n may be larger than 3. The possible solution of the system

Ax = d,  that is,  xa + yb = d,

is the x and y that force the residual vector

r = d − (xa + yb) = [r₁ r₂ ··· rₙ]ᵀ

to be as close as possible to the zero vector. If the sum of the squares of the residual vector's components is a minimum, then this is called the least squares solution. The objective is to find x and y that minimize

rᵀr = r₁² + r₂² + ··· + rₙ².

The least squares function of x and y can be explicitly computed as

f(x, y) = rᵀr = (d − (xa + yb))ᵀ(d − (xa + yb))
             = dᵀd − 2dᵀa x − 2dᵀb y + 2aᵀb xy + aᵀa x² + bᵀb y²   (4.1.1)

This problem will be solved three ways: graphically, by calculus, and by the normal equations (in Section 4.2). For the above high definition television data, a = [1 2 3]ᵀ, b = [1 1 1]ᵀ and d = [2000 1950 1910]ᵀ, which gives

f(x, y) = 11450600 − 2(11630)x − 2(5860)y + 2(6)xy + 14x² + 3y².

In Subsection 2.4.4 a geometric argument required the minimum to be the solution of the following algebraic system:

∂f/∂x = −2aᵀ(d − (xa + yb)) = 0,  i.e.,  aᵀd − (aᵀa x + aᵀb y) = 0
∂f/∂y = −2bᵀ(d − (xa + yb)) = 0,  i.e.,  bᵀd − (bᵀa x + bᵀb y) = 0   (4.1.2)

This is a special case of the normal equations, and as we shall see they give a solution to the least squares problem even if the number of data points n is larger than 3.

Table 4.1.1: Computer Sales Data
Months | Computers Sold
1      | 78
2      | 85
3      | 90
4      | 95
5      | 104
6      | 113

Example 4.1.1. Suppose a new computer company has had increased sales and would like to make a prediction of future sales based on past sales.
- Kevin W. Cassel(Author)
- 2021(Publication Date)
- Cambridge University Press(Publisher)
Show that the KKT equations simplify to give the least-squares solution

u = b + Cᵀ(CCᵀ)⁻¹(d − Cb).

Hint: Follow the logic in Section 10.2.2 for the underdetermined case.

10.12 Throughout the text, we have considered the system of linear algebraic equations Aû = b, where A and b are known, and û is sought. Let us instead imagine that we have a system with an unknown behavior as represented by the matrix A. Instead, we have a series of input data uᵢ, i = 1, ..., N and the corresponding output data bᵢ, i = 1, ..., N for the system, and we seek the system matrix A. In a least-squares context, this could be formulated as seeking the matrix A that minimizes the objective function

J(A) = Σᵢ₌₁ᴺ ‖bᵢ − Auᵢ‖² = ‖B − AU‖²,

where the known input and output vectors uᵢ and bᵢ are the columns of matrices U and B, respectively. Show that the least-squares solution is given by A = BU⁺.

11 Data Analysis: Curve Fitting and Interpolation

Can one think that because we are engineers, beauty does not preoccupy us or that we do not try to build beautiful, as well as solid and long-lasting, structures? Aren't the genuine functions of strength always in keeping with unwritten conditions of harmony? (Gustave Eiffel)

Science and engineering are all about data that quantify the relationships between various physical quantities of interest. Such data may be obtained experimentally (empirically) or numerically. The data can then be used to develop, refine, and confirm mathematical models of physical phenomena. In many cases, the goal is to develop a simple functional representation of a complex or noisy data set; this is the goal of curve fitting. In other instances, the objective is to provide a means to infer data for values of the variables that have not been directly measured; this is accomplished using interpolation.
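The result A = BU⁺ from Exercise 10.12 can be verified numerically. Assuming NumPy is available, the sketch below invents a "true" system matrix, generates input/output columns, and recovers the matrix via the pseudoinverse:

```python
import numpy as np

# System identification via least squares: given input columns u_i
# (matrix U) and output columns b_i (matrix B), the A minimizing
# ||B - A U||^2 is A = B U^+, with U^+ the Moore-Penrose pseudoinverse.
# The "true" system below is invented so the recovery can be checked.
A_true = np.array([[1.0, 2.0],
                   [0.0, -1.0]])
U = np.array([[1.0, 0.0, 1.0, 2.0],     # four input vectors as columns
              [0.0, 1.0, 1.0, -1.0]])
B = A_true @ U                           # corresponding noise-free outputs

A_est = B @ np.linalg.pinv(U)
```

Because U has full row rank and the outputs are noise-free, the recovery is exact up to floating-point error; with noisy outputs, A = BU⁺ is still the minimizer of the squared Frobenius-norm objective.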
Experimental Methods for Science and Engineering Students
An Introduction to the Analysis and Presentation of Data
- Les Kirkup(Author)
- 2019(Publication Date)
- Cambridge University Press(Publisher)
6 Fitting a Line to x–y Data Using the Method of Least Squares

6.1 Overview: How Can We Find the Best Line through x–y Data?

Linearly related x–y data emerge so frequently from science and engineering experiments that the analysis of such data deserves special attention. In general, we seek the quantitative relationship which best describes the dependence of y upon x. To do this, we need a method by which we can determine the equation of the line that best fits the x–y data. In Chapter 3 we saw that an equation representing the relationship between x and y quantities¹ can be found by first plotting the data followed by drawing the best straight line through the points on an x–y graph (or at least as close as possible to them) with a ruler. The slope, m, and intercept, c, of this line are calculated and the equation of the line is written y = mx + c. Although positioning a line by eye through x–y data is a good way of obtaining reasonable estimates for m and c, there are several issues with this method:

• No two people draw the same ‘best’ line through a given data set.
• If the uncertainty in each point on the graph is different, should we take this into account when drawing a line through the points? If the answer is yes, how do we do that?
• Drawing the best line is difficult if the data exhibit large scatter.
• Finding the uncertainties in m and c directly from the graph (as described in Section 3.3.6) is cumbersome and tends to overestimate their values.

Figure 6.1 shows an example of a graph for which it is difficult to determine the best line through the points. Both lines in Figure 6.1 appear to fit the data quite well, but which is the better? To answer this we need a tool that avoids the guesswork involved when finding the best line through a set of points by eye. The tool we will use is usually described as

¹ Where the x–y data look to be (at least approximately) linearly related.
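The question posed here, which of two plausible lines is better, is exactly what the sum-of-squared-residuals criterion answers. The data and candidate lines below are invented for illustration:

```python
# Compare two hand-drawn candidate lines against the least squares line
# using the sum of squared residuals (smaller is better). Data invented.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [0.2, 1.1, 2.3, 2.9, 4.1]

def ssr(m, c):
    """Sum of squared residuals for the line y = m*x + c."""
    return sum((y - (m * x + c)) ** 2 for x, y in zip(xs, ys))

# Two plausible "by eye" candidates
ssr_line1 = ssr(1.0, 0.1)
ssr_line2 = ssr(0.95, 0.3)

# Least squares line via the closed-form slope and intercept
n = len(xs)
xbar, ybar = sum(xs) / n, sum(ys) / n
m = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) \
    / sum((x - xbar) ** 2 for x in xs)
c = ybar - m * xbar
ssr_best = ssr(m, c)
```

Ranking the three sums of squares settles "which is the better?" without any guesswork, and the least squares line is guaranteed to come out on top under this criterion.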
Index pages curate the most relevant extracts from our library of academic textbooks. They’ve been created using an in-house natural language model (NLM), each adding context and meaning to key research topics.