
Pearson Correlation Coefficient

The Pearson correlation coefficient (also known as the “product-moment correlation coefficient”) is a measure of the linear association between two variables X and Y. It has a value between -1 and 1.

When to use

  • You want to know whether two numerical variables are linearly correlated.
  • Your data is continuous (see Types of Data for more information).
  • Your data is linked (paired) – each value of your independent variable has a matching value for the dependent variable (like X-Y coordinates).

Features

The Pearson correlation coefficient is a measure of the linear association between two variables. It has a value between -1 and 1 where:

  • -1 indicates a perfect negative linear correlation between the two variables.
  • 0 indicates no linear correlation between the two variables.
  • 1 indicates a perfect positive linear correlation between the two variables.
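
As a minimal sketch of where these three reference values come from, the Python snippet below computes the coefficient from its usual definition: the sum of the products of the deviations from each mean, divided by the square root of the product of the two sums of squared deviations. The function name pearson_r and the small data sets are illustrative assumptions and are not taken from the online calculator.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two paired samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]          # deviations of X from its mean
    dy = [y - mean_y for y in ys]          # deviations of Y from its mean
    sum_products = sum(a * b for a, b in zip(dx, dy))
    ss_x = sum(a * a for a in dx)          # sum of squared X deviations
    ss_y = sum(b * b for b in dy)          # sum of squared Y deviations
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when one variable is constant")
    return sum_products / math.sqrt(ss_x * ss_y)

# Illustrative data only (not from the investigation in this guide).
xs = [1, 2, 3, 4, 5]
r_rising = pearson_r(xs, [2, 4, 6, 8, 10])   # points on a rising straight line
r_falling = pearson_r(xs, [10, 8, 6, 4, 2])  # points on a falling straight line
r_none = pearson_r(xs, [3, 1, 2, 1, 3])      # no overall linear trend

print(r_rising, r_falling, r_none)
```

The rising line gives a coefficient of 1, the falling line gives -1, and the third data set, whose Y-values go down and then back up by the same amount, gives 0.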

A Pearson correlation coefficient tells us the type of linear relationship (positive, negative, none) between two variables as well as the strength of that relationship. As a rough guide, judged on the size of the coefficient regardless of its sign: weak = 0.0 to 0.29, moderate = 0.30 to 0.49, strong = 0.50 or more.
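
The rule of thumb above could be sketched in Python as follows. The function name describe_correlation is hypothetical, and two choices are assumptions rather than something the guide states: the strength bands are applied to the absolute value of the coefficient, and a value between 0.29 and 0.30 is counted as weak.

```python
def describe_correlation(r):
    """Return (direction, strength) for a Pearson coefficient r."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a Pearson coefficient must lie between -1 and 1")

    # Direction comes from the sign of r.
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "none"

    # Strength comes from the size of r, ignoring its sign (assumption).
    size = abs(r)
    if size < 0.30:
        strength = "weak"
    elif size < 0.50:
        strength = "moderate"
    else:
        strength = "strong"
    return direction, strength

print(describe_correlation(0.10))
print(describe_correlation(-0.35))
print(describe_correlation(0.80))
```

These cut-offs are a convention, not a law: different subjects and textbooks draw the lines in different places.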

Keep these points in mind when interpreting a Pearson correlation coefficient:

  • Correlation does not imply causation. A strong correlation between two variables does not mean that one variable causes the other.
  • Correlations are sensitive to outliers. One extreme outlier can dramatically change a Pearson correlation coefficient.
  • Only linear relationships. A Pearson correlation coefficient does not capture nonlinear relationships between two variables. So, if a Pearson correlation coefficient indicates that two variables are uncorrelated, they could still have some type of nonlinear relationship.
  • Order of variables does not affect the value. The Pearson correlation coefficient is symmetric, so it cannot tell the difference between dependent variables and independent variables. For example, if you are trying to find the correlation between pH level and the rate at which seashells dissolve, you might find a high correlation of 0.8. However, you would get exactly the same result with the variables switched around. The coefficient alone is therefore just as consistent with the claim that seashell dissolution causes a high pH, which obviously makes no sense.
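
Three of the points in this list can be checked numerically. The sketch below is a hedged illustration on made-up data: the helper pearson_r and every number in it are assumptions chosen to make each effect easy to see, not values from a real experiment.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two paired samples."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_products = sum(a * b for a, b in zip(dx, dy))
    return sum_products / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

xs = [1, 2, 3, 4, 5]

# 1. Outliers: five points on a falling line, then one extreme extra point.
falling = [5, 4, 3, 2, 1]
r_clean = pearson_r(xs, falling)
r_with_outlier = pearson_r(xs + [20], falling + [20])

# 2. Nonlinear relationships: Y depends exactly on X, but along a curve.
curve = [(x - 3) ** 2 for x in xs]
r_curve = pearson_r(xs, curve)

# 3. Swapping the variables: the coefficient is the same either way round.
a = [1, 2, 3, 4, 5]
b = [2, 1, 4, 3, 6]
r_ab = pearson_r(a, b)
r_ba = pearson_r(b, a)

print(r_clean, r_with_outlier)
print(r_curve)
print(r_ab, r_ba)
```

In this made-up data a single extra point turns a perfect negative correlation into a strong positive one, the curved relationship gives a coefficient of 0 even though Y is completely determined by X, and the two orderings of a and b give the same coefficient.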

An online calculator can be found HERE.

Your Turn

You are investigating the effect of acid rain on the growth of plants. You plant seeds and water them each day using water with a pH of 3.0, 4.0 or 5.0. You do 5 trials at each pH level.

After 10 days you measure the height of the plants. Your data is as shown to the right.

You now want to determine whether these two variables (pH level and plant growth) are correlated. Since there are no obvious outliers in your data, the Pearson correlation coefficient is appropriate.

You use the online calculator HERE.

 

To enter your data into the calculator you need to know which variable supplies the X-values and which supplies the Y-values: plant height or pH level.

You have tested whether the pH level of water (which you varied) affects the growth of plants (which you observed). This means that the X-values must be the pH level and the Y-values must be the height of the plants.

If you put the data the other way around you would be implying that the height of the plants affected the pH of the water, which is nonsensical.
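
If you prefer to work in Python rather than the online calculator, the pairing could be sketched as below. Each plant contributes one (pH level, height) pair, so every pH value is repeated once per trial. The heights here are HYPOTHETICAL numbers invented purely to show the layout. They are not the data from this investigation and will not reproduce its result. The names heights_by_ph and pearson_r are likewise assumptions.

```python
import math

# HYPOTHETICAL plant heights in cm, invented for illustration only.
heights_by_ph = {
    3.0: [2.1, 2.4, 1.9, 2.2, 2.0],
    4.0: [4.0, 4.3, 3.8, 4.1, 4.4],
    5.0: [6.2, 5.9, 6.1, 6.4, 5.8],
}

# Build the paired lists: X = pH level (varied), Y = plant height (observed).
x_values = []
y_values = []
for ph, heights in heights_by_ph.items():
    for height in heights:
        x_values.append(ph)
        y_values.append(height)

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two paired samples."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_products = sum(a * b for a, b in zip(dx, dy))
    return sum_products / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))

r = pearson_r(x_values, y_values)
print(len(x_values), len(y_values))  # both lists must have the same length
print(round(r, 4))
```

With 3 pH levels and 5 trials each, both lists contain 15 entries, one per plant.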

Now run the test.

The first thing you see is "The value of R is: 0.9531".

You should also see the graph to the right. It clearly shows that as the X-values (pH level) increase, the Y-values (height of plants) also increase. Such a graph could be a valuable addition to your report (although you would need to add a title and values along both axes).

 

The correlation coefficient of 0.9531 and the graph above both show that there is what type of correlation between pH level and height of plants?

The highest possible value for a positive correlation is 1 (or -1 for a negative correlation). Since the calculated value here is very close to 1, we can conclude that there is a strong positive correlation.

