Mathematics
Product Moment Correlation Coefficient
The Product Moment Correlation Coefficient is a measure of the strength and direction of the linear relationship between two variables. It ranges from -1 to 1, where 1 indicates a perfect positive linear relationship, -1 indicates a perfect negative linear relationship, and 0 indicates no linear relationship. It is commonly denoted by the symbol "r" and is used in statistics to analyze the association between variables.
Written by Perlego with AI-assistance
11 Key excerpts on "Product Moment Correlation Coefficient"
- (Author)
- 2014(Publication Date)
- Orange Apple(Publisher)
Other correlation coefficients have been developed to be more robust than the Pearson correlation, or more sensitive to nonlinear relationships.

[Figure: several sets of (x, y) points, with the Pearson correlation coefficient of x and y for each set. The correlation reflects the noisiness and direction of a linear relationship (top row), but not the slope of that relationship (middle row), nor many aspects of nonlinear relationships (bottom row). N.B.: the figure in the center has a slope of 0, but in that case the correlation coefficient is undefined because the variance of Y is zero.]

Pearson product-moment correlation coefficient

In statistics, the Pearson product-moment correlation coefficient (sometimes referred to as the PMCC, and typically denoted by r) is a measure of the correlation (linear dependence) between two variables X and Y, giving a value between +1 and −1 inclusive. It is widely used in the sciences as a measure of the strength of linear dependence between two variables. It was developed by Karl Pearson from a similar but slightly different idea introduced by Francis Galton in the 1880s. The correlation coefficient is sometimes called Pearson's r.

Definition

Pearson's correlation coefficient between two variables is defined as the covariance of the two variables divided by the product of their standard deviations:

ρ = cov(X, Y) / (σX σY)

The above formula defines the population correlation coefficient, commonly represented by the Greek letter ρ (rho). Substituting estimates of the covariances and variances based on a sample gives the sample correlation coefficient, commonly denoted r:

r = Σ(Xi − X̄)(Yi − Ȳ) / √[Σ(Xi − X̄)² · Σ(Yi − Ȳ)²]

An equivalent expression gives the correlation coefficient as the mean of the products of the standard scores.
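The definition just quoted, covariance divided by the product of the standard deviations, can be sketched in a few lines of Python (the function name and test data are my own, not from the excerpt):

```python
import math

def pearson_r(xs, ys):
    """Sample correlation: covariance divided by the product of the
    standard deviations, all computed with an (n - 1) denominator."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / (n - 1))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / (n - 1))
    return cov / (sx * sy)

print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))  # 1.0 for a perfect linear relation
```

Note that the (n − 1) factors cancel between numerator and denominator, so using n throughout would give the same r; this is why population and sample versions of the formula agree in form.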
Using Statistics in Small-Scale Language Education Research
Focus on Non-Parametric Data
- Jean L. Turner(Author)
- 2014(Publication Date)
- Routledge(Publisher)
Section IV Analyzing Patterns Within a Variable and Between Two Variables

10 The Parametric Pearson's Product Moment Correlation Coefficient Statistic

In Chapter Ten, we take a look at one of the statistics designed for investigating the correlational relationship between variables, Pearson's Product Moment Correlation Coefficient, also known as Pearson's r. Pearson's r is a parametric statistic used to calculate the strength of the correlational relationship between two normally distributed variables, an independent variable and a dependent variable. When it's used in the context of statistical logic, Pearson's r allows a researcher to make a probability statement about the degree of correspondence between two sets of data. Unlike the statistics addressed in Chapters Six through Nine, which allow a researcher to explore differences among groups, correlation formulas are used to determine the extent to which two sets of data vary together. A correlation is a numerical expression of the strength of the relationship between the two variables.

Researchers in language education often want to know whether there's a significant relationship between variables; they can gain a deeper understanding of the learners in their classes and their learning environment by knowing how the variables are related to one another. There's a practical application, too, of knowing the strength of the relationship between two variables—when two variables are strongly related, we can estimate or predict a person's behavior on the second variable, given his or her performance on the first. It's important, though, to remember that even a strong relationship between two variables can't be interpreted as evidence of causality. To illustrate, I'd like to tell you about a little study I did quite a while ago, just out of curiosity. I was teaching oral skills courses for international undergraduate students at a university and it seemed to me that the type of shoe a student usually wore was related to the level of his or her oral skills.
I collected some data and found there was a statistically significant relationship between the two variables, a strong one— the participants who wore a particular kind of sport shoe definitely tended to have a higher degree of oral language proficiency than did people who wore other types of shoes. So, yes, on the basis of that small study, I can say that there was a statistically significant relationship between the type of shoe an international undergraduate student wears and the level of his or her oral skills, but there's no causality there—buying a different type of shoe isn't going to help anyone become more fluent in English!
Sensory Evaluation of Food
Statistical Methods and Procedures
- Michael O'Mahony(Author)
- 2017(Publication Date)
- Routledge(Publisher)
Zero correlation coefficients can also be obtained in other ways; consider the examples of nonrandom points shown in Figures 15.7 and 15.8. It is difficult to know where to draw a line in these figures. Note: The correlation coefficient is a coefficient of linear correlation. It is high only when the points fall on a straight line. If the relationship between X and Y is curvilinear, as in Figure 15.8, the correlation will be zero. More complex coefficients are required to describe such a relationship. Figure 15.8 Curvilinear relationship yields zero correlation.

15.3 How to Compute Pearson's Product-Moment Correlation Coefficient

To measure the correlation between a set of Y and X values, Pearson's product-moment correlation coefficient, developed by Karl Pearson, is used. We will simply give the formula for the correlation coefficient; we will not derive it formally. It is given by the formula

r = Σ(X − X̄)(Y − Ȳ) / (N Sx Sy)

where X̄ and Ȳ are the means of the X and Y values being correlated, Sx and Sy are their standard deviations, and N is the number of X scores (or the number of Y scores, but not the number of X scores + the number of Y scores), generally the number of subjects tested. Note:

Sx = √[Σ(X − X̄)² / (N − 1)] = √{[ΣX² − (ΣX)²/N] / (N − 1)}

This formula is inconvenient to use. However, it can be rearranged into a more convenient, albeit longer, form:

r = [N ΣXY − ΣX ΣY] / √{[N ΣX² − (ΣX)²][N ΣY² − (ΣY)²]}

Although this formula is not derived here, we can see that it is of an intuitively sensible form. Looking at the numerator, we can see that the larger the value of ΣXY, the larger the value of the numerator and hence r. Should the two sets of scores (X and Y) be positively correlated, large X and large Y values will be associated, giving large XY values, and ΣXY will be correspondingly large. This can be clarified by an example.

- Jacob Cohen(Author)
- 2013(Publication Date)
- Academic Press(Publisher)
CHAPTER 3 The Significance of a Product Moment r_s

3.1 INTRODUCTION AND USE

Behavioral scientists generally, and particularly psychologists with substantive interests in individual differences in personality, attitude, and ability, frequently take recourse to correlational analysis as an investigative tool in both pure and applied studies. By far the most frequently used statistical method of expression of the relationship between two variables is the Pearson product-moment correlation coefficient, r. r is an index of linear relationship, the slope of the best-fitting straight line for a bivariate (X, Y) distribution where the X and Y variables have each been standardized to the same variability. Its limits are −1.00 to +1.00. The purpose of this handbook precludes the use of space for a detailed consideration of the interpretations and assumptions of r. For this, the reader is referred to a general textbook, such as Cohen & Cohen (1975), Hays (1973), or Blalock (1972). When used as a purely descriptive measure of degree of linear relationship between two variables, no assumptions need be made with regard to the shape of the marginal population distributions of X and Y, nor of the distribution of Y for any given value of X (or vice versa), nor of equal variability of Y for different values of X (homoscedasticity). However, when significance tests come to be employed, assumptions of normality and homoscedasticity are formally invoked. Despite this, it should be noted that, as in the case of the t test with means, moderate assumption failure here, particularly with large n, will not seriously affect the validity of significance tests, nor of the power estimates associated with them. In this chapter we consider inference from a single correlation coefficient, r_s, obtained from a sample of n pairs (X, Y) of observations.
- 2013(Publication Date)
- Routledge(Publisher)
Note that this equation is slightly different from that in earlier editions. The n/(n − 1) term is necessary because the sd used here is the sample estimate of the population sd rather than the sample sd, which uses n in the denominator. Although it is clear that this index, ranging from 0 (for a perfect positive linear relationship) through 2 (for no linear relationship) to 4 (for a perfect negative one), does reflect the relationship between the variables in an intuitively meaningful way, it is useful to transform the scale linearly to make its interpretation even more clear. Let us reorient the index so that it runs from −1 for a perfect negative relationship to +1 for a perfect positive relationship. If we divide the sum of the squared discrepancies by 2(n − 1) and subtract the result from 1, we have

r = 1 − Σ(z_X − z_Y)² / [2(n − 1)]    (2.2.4)

which for the data of Table 2.2.2 gives

r = 1 − (9.614 / 28) = .657

r is the Product Moment Correlation Coefficient, invented by Karl Pearson in 1895.6 This coefficient is the standard measure of the linear relationship between two variables and has the following properties:

6 The term product moment refers to the fact that the correlation is a function of the product of the first moments of X and Y, respectively. See the next sections.

- It is a pure number and independent of the units of measurement.
- Its value varies between zero, when the variables have no linear relationship, and +1.00 or −1.00, when each variable is perfectly estimated by the other. The absolute value thus gives the degree of relationship.
- Its sign indicates the direction of the relationship. A positive sign indicates a tendency for high values of one variable to occur with high values of the other, and low values to occur with low. A negative sign indicates a tendency for high values of one variable to be associated with low values of the other. Reversing the direction of measurement of one of the variables will produce a coefficient of the same absolute value but of opposite sign. Coefficients of equal value but opposite sign (e.g., +.50 and −.50) thus indicate equally strong linear relationships, but in opposite directions.
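Both the z-score form of r given above, r = 1 − Σ(z_X − z_Y)²/(2(n − 1)), and the sign-reversal property in the last bullet can be checked with a short Python sketch (the variable names and data are mine, invented for illustration):

```python
import statistics

def zscores(v):
    m = statistics.mean(v)
    s = statistics.stdev(v)          # sample sd, (n - 1) denominator
    return [(x - m) / s for x in v]

def r_from_z(xs, ys):
    """r = 1 - sum((z_x - z_y)^2) / (2 * (n - 1))"""
    zx, zy = zscores(xs), zscores(ys)
    n = len(xs)
    return 1 - sum((a - b) ** 2 for a, b in zip(zx, zy)) / (2 * (n - 1))

x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
print(r_from_z(x, y))                 # ~ 0.8, agrees with statistics.correlation
print(r_from_z(x, [-v for v in y]))   # ~ -0.8: same magnitude, opposite sign
```

Expanding the squared difference shows why this works: Σ(z_X − z_Y)² = Σz_X² + Σz_Y² − 2Σz_X z_Y = 2(n − 1)(1 − r), so the formula returns exactly r.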
Practical Statistics for Students
An Introductory Text
- Louis Cohen, Michael Holliday(Authors)
- 1996(Publication Date)
- SAGE Publications Ltd(Publisher)
CHAPTER 13 DESIGN 2

One group, one observation per subject on each of two or more variables (Group 1, Subjects A, B, C, D, … with observations on Variable 1 (X) and Variable 2 (Y)).

13.1 Using the Pearson Product Moment Correlation Coefficient

A sample of 10 patients with anorexia nervosa is drawn at random and their anxiety and depression scores are obtained using the Crown-Crisp Experimental Index (CCEI), with the results shown in Table 47. We wish to find out whether there is a relationship between anxiety and depression. The Pearson Product Moment Correlation Coefficient is a suitable measure of relationship when samples are randomly selected from normally distributed populations. The assumptions underlying the Product Moment Correlation Coefficient, when it is used for inferential purposes, are (i) homoscedasticity, that is to say, the variances in the Y values are comparable to the variances in the X values, and (ii) the data are normally distributed. The null hypothesis (H0) in our example is that there is no relationship between anxiety and depression.

Table 47 Patient anxiety and depression scores

Patient           A    B    C    D    E    F     G    H    I    J
Anxiety score     8.2  1.9  6.3  9.1  5.4  10.3  4.8  6.5  8.3  1.5
Depression score  6.4  5.8  4.9  1.2  3.9  1.9   5.0  4.2  7.1  5.3

The Pearson Product Moment Correlation Coefficient is given by the formula:

r = [n ΣXY − ΣX ΣY] / √{[n ΣX² − (ΣX)²][n ΣY² − (ΣY)²]}

where r = product moment correlation, n = number of pairs of scores, X = scores on variable X, Y = scores on variable Y, Σ = 'sum of'.

PROCEDURE FOR COMPUTING THE PEARSON Product Moment Correlation Coefficient

1 Total the scores on anxiety (ΣX) and the scores on depression (ΣY).
2 Square each patient's scores on anxiety (X²) and depression (Y²).
3 Sum the X² values, giving ΣX².
4 Sum the Y² values, giving ΣY².
5 Multiply each patient's score for anxiety (X) by her score on depression (Y) to give her XY value.
6 Sum the XY values (ΣXY).

Substituting our computed data from Table 48,
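The hand computation the excerpt walks through can be cross-checked by applying the same raw-score formula to the Table 47 data in Python. The rounded result is my own calculation; the excerpt itself is cut off before quoting the book's value.

```python
import math

anxiety    = [8.2, 1.9, 6.3, 9.1, 5.4, 10.3, 4.8, 6.5, 8.3, 1.5]
depression = [6.4, 5.8, 4.9, 1.2, 3.9, 1.9, 5.0, 4.2, 7.1, 5.3]

n = len(anxiety)
sum_x, sum_y = sum(anxiety), sum(depression)              # step 1
sum_x2 = sum(x * x for x in anxiety)                      # steps 2-3
sum_y2 = sum(y * y for y in depression)                   # step 4
sum_xy = sum(x * y for x, y in zip(anxiety, depression))  # steps 5-6

r = (n * sum_xy - sum_x * sum_y) / math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
print(round(r, 3))  # -0.421: a moderate negative correlation in this sample
```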
Data Analysis for the Social Sciences
Integrating Theory and Practice
- Douglas Bors(Author)
- 2018(Publication Date)
- SAGE Publications Ltd(Publisher)
If we calculate cov from the data in Panel 2 of Figure 7.20, we find that cov = 4320.

Figure 7.22

From the two covariances alone, we might be inclined to conclude that there is a stronger association in Panel 2 than there is in Panel 1. Examination of the two scatterplots produced from the two panels, however, reveals otherwise. Figures 7.21 and 7.22 reveal that it is only in terms of the units of measurement on the x-axis that the two scatterplots differ. The association between errors and latency depicted in the two scatterplots is identical. In the first panel latency was recorded in seconds; in the second panel latency was recorded in milliseconds. Just as we used indices such as φ, Cramér's V, and Γ to standardize the association between two categorical variables, we need an index for standardizing the association between two measurement variables.

7.6 The Pearson Product Moment Correlation Coefficient
In this section we will examine:
- the Pearson Product Moment Correlation Coefficient or r,
- how r is related to z-scores,
- how r can be transformed into an index of the reduction of error in prediction,
- how to test the significance of r, and
- nonparametric alternatives to r.
Where φ and Γ are standardized measures of the association between two categorical (nominal and ordinal) variables, the Pearson Product Moment Correlation Coefficient (r) (often simply referred to as the correlation) is the most common index of a linear association between two measurement variables. The word 'correlation' comes from two words in Latin, cor (com), meaning 'together', and relatio, which means relation. Like the other measures of association we have discussed, r can take any value between 0 and 1. And like the ordinal Γ, r can be either negative or positive. One way to begin to understand r …
Statistics Using Stata
An Integrative Approach
- Sharon Lawner Weinberg, Sarah Knapp Abramowitz(Authors)
- 2016(Publication Date)
- Cambridge University Press(Publisher)
[Figure 5.5 panel captions: (B) Elementary school students. (C) Adolescent boys. (D) College freshmen. (E) Male college students. (F) College students. (G) Children in grades K–3. (H) Elementary school students. (I) Dimensions of a tree.]

of the linear relationship between them may be characterized by what is called the Pearson Product Moment Correlation Coefficient.

☞ Remark. The Pearson Correlation Coefficient is named after Karl Pearson, who in 1896 published the first mathematically rigorous treatment of this index. It was Sir Francis Galton, however, a cousin of Charles Darwin, who in the 1870s first conceptualized the notion of correlation and regression (to be covered in the next chapter) through his work in genetics; and, in particular, by examining two-dimensional scatterplots of the relationship between the sizes of sweet pea plants of mother-daughter pairs (Stanton, 2001).

While indices exist to represent the strength of nonlinear relationships, these are beyond the scope of our discussion, which, in this section, is confined to linear relationships only. For a more advanced treatment of correlation, including nonlinear relationships, the interested reader is referred to Cohen, Cohen, West and Aiken (2003).

[Figure 5.5 (continued): scatterplots with fitted values — (f) students' expected grades in a course vs. the same students' evaluation of the overall value of the course; (g) IQ scores vs. reading achievement; (h) arithmetic reasoning vs. arithmetic fundamentals; (i) diameter of a tree vs. circumference of the same tree.]

THE PEARSON Product Moment Correlation Coefficient

The strength of a linear relationship is characterized by the extent to which a straight line fits the data points.
- Lorena Madrigal(Author)
- 2012(Publication Date)
- Cambridge University Press(Publisher)
I am sure you have heard the saying "Correlation does not mean causation." This is so true and so frequently forgotten! The natural and social world is full of spurious correlations, correlations which arise only because of chance and which have no meaning or importance in the natural and social world.

9.1 The Pearson product-moment correlation

The Pearson correlation is a commonly applied parametric test which quantifies the relation between two numeric variables, and tests the null hypothesis that such relation is not statistically significant. The correlation between the variables is quantified with a coefficient whose statistical symbol is r, and whose parametric symbol is ρ ("rho"). The coefficient ranges in value from −1 to +1. If r is negative, then as Y1 increases, Y2 decreases (Figure 9.1). If r is positive, then as Y1 increases, Y2 increases as well (Figure 9.2). If r is not statistically significantly different from 0, then there is no significant relation between Y1 and Y2 (Figure 9.3).

[Figure 9.1 A scatter plot of two variables which have a significantly negative correlation. Figure 9.2 A scatter plot of two variables which have a significantly positive correlation.]

Thus, in correlation analysis the null hypothesis is that the parametric correlation between the two variables is 0; the usual two-tailed test null hypothesis is H0: ρ = 0. A one-tailed test is possible as well, although it should be used only when there are compelling reasons for it. The reader is by now familiar with the fact that many statistical techniques assume a sample data set to be normally distributed. Indeed, for analysis of variance, it was stressed that every sample be tested for normality of distribution.
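For the test of H0: ρ = 0 described above, the usual statistic (standard, though not written out in this excerpt) is t = r·√(n − 2) / √(1 − r²), referred to Student's t on n − 2 degrees of freedom. A sketch with illustrative numbers of my own:

```python
import math

def t_statistic(r, n):
    """t statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

# Illustrative values: r = 0.657 observed on n = 30 pairs.
t = t_statistic(0.657, 30)
print(round(t, 2))  # 4.61, well beyond the two-tailed .05 critical value (~2.05 on 28 df)
```

The same t can be compared against a one-tailed critical value instead, but, as the excerpt notes, only with compelling prior reasons for predicting the direction.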
Staffing Organizations
Contemporary Practice and Theory
- Robert E. Ployhart, Benjamin Schneider, Neal Schmitt(Authors)
- 2005(Publication Date)
- CRC Press(Publisher)
the mean of those criterion scores is not a very good estimate. The same would hold true in estimating the mean of the predictor scores for all those scoring, say, five on the criterion; predictions from Y to X are more accurate when the distribution of possible predictor scores has a small standard deviation. To summarize all this information about the degree of relationship between the predictor and criterion, we use a statistic called the correlation coefficient. This statistic and its usefulness are described in the next several sections of the chapter.

THE PEARSON PRODUCT-MOMENT CORRELATION COEFFICIENT (r)

Like the mean and the standard deviation, the correlation coefficient (represented by r) is a convention. Staffing researchers have used r as their index of relationship because of (a) its parsimony—the result is only one number, and (b) it meets the same criterion as the mean and standard deviation—it is an index of relationship that is a mean.1 The technicalities of the arguments underlying the use of r need not concern us here. What is important, though, because of the extensive use of correlation coefficients in staffing work, is to begin to grasp the concepts of (a) the range of r, (b) the wonder of r, (c) the limitations of r, (d) the uses of r in reliability and validity analyses, and (e) alternatives to r.

The Range of r

The correlation coefficient may range in size from +1.00 through zero to −1.00. In staffing work, we rarely observe correlations above 0.90, and then generally only in relation to reliability (i.e., consistency or reproducibility). Typical observed validity coefficients between predictors and criteria range between 0.10 and 0.40. A correlation of 1.00, positive or negative, indicates that information about an individual regarding X also yields perfect information regarding Y; it provides an "if …

- Alan R. Jones(Author)
- 2018(Publication Date)
- Routledge(Publisher)
It provides a measure of how well the relationship between two variables can be represented by a straight line. At that point, they are uncorrelated, and behave as random 'hentities' (sorry, just couldn't resist it). If the chick was still an unlaid egg, then there would be perfect correlation (just the single diagonal straight line). This is our Correlation Chicken (not to be confused with Coronation Chicken, which is a type of food); however, the Correlation Chicken is merely served up here as 'food for thought'. If you remember nothing else about Partial Correlation, remember to visualise our Correlation Chicken! This visual concept of the correlated chickens could be replaced by a person walking a dog on an extendable lead. The two are tethered together by the lead. The degree of correlation is limited by the maximum length of the lead, which can be controlled by the locking mechanism on the lead.

5.2.5 Coefficient of Determination

When we ask Microsoft Excel to draw a linear trendline through some data, we have the option of displaying the equation it calculates, and also the R-squared value, which it denotes on the graph as R². The R² is called the 'Coefficient of Determination'. The question it begs, though, is 'What exactly is it determining?' In the case of two variables (one dependent and one independent), the Coefficient of Determination is simply the square of Pearson's Linear Correlation Coefficient, and ranges between 0 and 1. (Yes, I can see what you are thinking: 'This must be some new definition of the term "simple"'!) If we revisit our earlier example from Table 5.3 and Figures 5.8 and 5.9, we can visualise what R² is measuring.
Index pages curate the most relevant extracts from our library of academic textbooks. They’ve been created using an in-house natural language model (NLM), each adding context and meaning to key research topics.










