Spearman's Correlation Coefficient

This lesson covers: 

  1. What correlation is
  2. Spearman's rank correlation for assessing relationships between two variables

What is correlation?

Correlation refers to the relationship between two variables.


The correlation can be:

  1. Positive - This is when both variables increase or decrease together.
  2. Negative - This is when one variable increases as the other decreases.
  3. Non-existent - This is when there is no clear relationship, indicated by a correlation coefficient close to 0.


A correlation coefficient of 1 indicates a perfect, positive linear correlation, where all points lie on a straight line.

Assessing correlation using Spearman's rank correlation

Spearman’s rank correlation coefficient is used to measure the strength and direction of association between two continuous variables that are not normally distributed.


The process for calculating Spearman's rank correlation is:

  1. Convert the raw data values of each variable into ranks from the smallest value to the largest value.
  2. If two values are the same for one variable, give them an average rank (e.g. if the two smallest values are 6 mm, they would both have the rank 1.5, and the next smallest value would have the rank 3).
  3. Determine the difference in ranks (d) for each pair of values.
  4. Square these rank differences (d²).
  5. Sum up all the squared rank differences (Σd²).
  6. Calculate Spearman's rho (ρ).
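The ranking in steps 1 and 2 can be sketched in Python. This is a minimal illustration, not a prescribed method: the function name `average_ranks` and the sample values are assumptions made for the example.

```python
def average_ranks(values):
    """Rank values from smallest (rank 1) to largest.

    Tied values share the average of the ranks they would occupy.
    """
    # Indices of the values, ordered from smallest value to largest
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend 'end' over the run of values equal to the one at 'pos'
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Positions are 0-based, ranks are 1-based
        mean_rank = ((pos + 1) + (end + 1)) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


# Hypothetical lengths in mm: the two smallest values are both 6
print(average_ranks([6, 9, 6, 12]))  # [1.5, 3.0, 1.5, 4.0]
```

The two tied values of 6 both receive the rank 1.5, and the next smallest value receives the rank 3, as described in step 2.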


Use the formula (this will be provided in an exam):

$$\rho = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$$


Where:

  • ρ = Spearman’s rank correlation coefficient
  • Σd² = sum of the squared differences of the ranks
  • n = number of pairs of data
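As a rough sketch, the formula can be written as a small Python function. The name `spearman_rho` and the example inputs are illustrative assumptions, not part of the lesson.

```python
def spearman_rho(sum_d_squared, n):
    """Spearman's rho from the sum of squared rank differences and the number of pairs."""
    return 1 - (6 * sum_d_squared) / (n * (n**2 - 1))


# Identical rankings: every d is 0, so the sum of d squared is 0
print(spearman_rho(0, 10))  # 1.0

# Five pairs ranked in exactly opposite order: d = -4, -2, 0, 2, 4
print(spearman_rho(16 + 4 + 0 + 4 + 16, 5))  # -1.0
```

These two extreme cases show why ρ always lies between -1 and +1.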


You will most likely be given critical value tables that use the sample size, n, for this test, rather than needing to calculate degrees of freedom.


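Then, compare ρ (ignoring its sign) to the critical value at the 5% significance level. If ρ is greater than or equal to the critical value, the correlation is significant.

The sign and size of ρ indicate the direction and strength of the correlation:

  • A ρ value near +1 indicates a strong positive correlation.
  • A ρ value near -1 indicates a strong negative correlation.
  • A ρ value around 0 indicates little or no correlation.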


It can be helpful to carry out the calculations for steps 1 to 4 in a table, for example:

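| Sample | Stem length (mm) | Leaf width (mm) | Rank: stem length | Rank: leaf width | Rank difference (d) | d² |
|--------|------------------|-----------------|-------------------|------------------|---------------------|-------|
| 1      | 79               | 54              | 10                | 6.5              | 3.5                 | 12.25 |
| 2      | 73               | 80              | 9                 | 9                | 0                   | 0     |
| 3      | 20               | 31              | 1                 | 1                | 0                   | 0     |
| 4      | 23               | 66              | 2.5               | 8                | -5.5                | 30.25 |
| 5      | 67               | 54              | 8                 | 6.5              | 1.5                 | 2.25  |
| 6      | 64               | 42              | 7                 | 3                | 4                   | 16    |
| 7      | 39               | 51              | 5                 | 4                | 1                   | 1     |
| 8      | 29               | 53              | 4                 | 5                | -1                  | 1     |
| 9      | 59               | 36              | 6                 | 2                | 4                   | 16    |
| 10     | 23               | 88              | 2.5               | 10               | -7.5                | 56.25 |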
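Steps 5 and 6 for the stem length and leaf width data can be checked with a short Python sketch. The rank lists are copied from the rank columns of the table. The variable names are illustrative assumptions.

```python
# Ranks taken from the table for samples 1 to 10
rank_stem = [10, 9, 1, 2.5, 8, 7, 5, 4, 6, 2.5]
rank_leaf = [6.5, 9, 1, 8, 6.5, 3, 4, 5, 2, 10]

# Steps 3 to 5: rank difference, squared difference, and their sum
sum_d_squared = sum((s - l) ** 2 for s, l in zip(rank_stem, rank_leaf))

# Step 6: substitute into the formula
n = len(rank_stem)
rho = 1 - (6 * sum_d_squared) / (n * (n**2 - 1))

print(sum_d_squared)   # 135.0
print(round(rho, 3))   # 0.182
```

A value this close to 0 suggests only a weak positive relationship between the two sets of ranks.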

Worked example - Correlation of study time and test scores using Spearman's rank

The table below shows the study time and test scores of 10 students. The critical value at the 5% level for n = 10 is 0.648.

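| Sample | Study time (hours) | Test score (%) |
|--------|--------------------|----------------|
| 1      | 5.5                | 78             |
| 2      | 3.2                | 65             |
| 3      | 8.1                | 82             |
| 4      | 2.4                | 59             |
| 5      | 6.7                | 80             |
| 6      | 7.3                | 88             |
| 7      | 1.8                | 55             |
| 8      | 4.6                | 73             |
| 9      | 9.0                | 94             |
| 10     | 10.5               | 90             |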

Using Spearman's rank correlation coefficient, assess whether there is a significant correlation between study time and test scores.


Step 1: Equation

$$\rho = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$$


Step 2: Rank the data to calculate d²

Rank the data for both variables from smallest to largest, using the average rank when two values are identical. Then calculate the rank difference (d) and d² for each pair.

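| Sample | Study time (hours) | Test score (%) | Rank: study time | Rank: test score | Rank difference (d) | d² |
|--------|--------------------|----------------|------------------|------------------|---------------------|----|
| 1      | 5.5                | 78             | 5                | 5                | 0                   | 0  |
| 2      | 3.2                | 65             | 3                | 3                | 0                   | 0  |
| 3      | 8.1                | 82             | 8                | 7                | 1                   | 1  |
| 4      | 2.4                | 59             | 2                | 2                | 0                   | 0  |
| 5      | 6.7                | 80             | 6                | 6                | 0                   | 0  |
| 6      | 7.3                | 88             | 7                | 8                | -1                  | 1  |
| 7      | 1.8                | 55             | 1                | 1                | 0                   | 0  |
| 8      | 4.6                | 73             | 4                | 4                | 0                   | 0  |
| 9      | 9.0                | 94             | 9                | 10               | -1                  | 1  |
| 10     | 10.5               | 90             | 10               | 9                | 1                   | 1  |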

Step 3: Substitution and correct evaluation

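$$\rho = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$$

sum of the squared differences of ranks: $\sum d^2 = 4$

$n = 10$

$$\rho = 1 - \frac{6 \times 4}{10(10^2 - 1)}$$

$$\rho = 1 - \frac{24}{990}$$

$$\rho = 1 - 0.0242$$

$$\rho = 0.976$$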
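The arithmetic in steps 2 and 3 can be reproduced with a short Python sketch using the study time and test score data from the worked example. The helper `simple_ranks` is a hypothetical name and assumes there are no tied values, which holds for this data set.

```python
study_time = [5.5, 3.2, 8.1, 2.4, 6.7, 7.3, 1.8, 4.6, 9.0, 10.5]
test_score = [78, 65, 82, 59, 80, 88, 55, 73, 94, 90]


def simple_ranks(values):
    """Rank from smallest (1) to largest. Assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


rank_time = simple_ranks(study_time)
rank_score = simple_ranks(test_score)

# Sum of squared rank differences, then substitute into the formula
sum_d_squared = sum((t - s) ** 2 for t, s in zip(rank_time, rank_score))
n = len(study_time)
rho = 1 - (6 * sum_d_squared) / (n * (n**2 - 1))

print(rank_time)      # [5, 3, 8, 2, 6, 7, 1, 4, 9, 10]
print(rank_score)     # [5, 3, 7, 2, 6, 8, 1, 4, 10, 9]
print(sum_d_squared)  # 4
print(round(rho, 3))  # 0.976
```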


Step 4: Determine significance and interpret result

The significance of the correlation coefficient is determined by comparing it to a critical value at a chosen significance level.


The critical value at the 5% level for n = 10 is 0.648, so the correlation coefficient of 0.976 is much larger than the critical value.


This means the correlation is significant at the 5% level, indicating a very strong positive correlation between study time and test scores that is unlikely to be due to chance.
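
The comparison in Step 4 can also be expressed as a small Python sketch. The function name `assess_correlation` is an assumption for illustration. It treats a coefficient equal to the critical value as significant, which is a common convention but should be checked against the table you are given.

```python
def assess_correlation(rho, critical_value):
    """Return (is_significant, direction) for a Spearman's rho value."""
    # Compare the size of rho, ignoring its sign, to the critical value
    is_significant = abs(rho) >= critical_value
    if rho > 0:
        direction = "positive"
    elif rho < 0:
        direction = "negative"
    else:
        direction = "none"
    return is_significant, direction


# Values from the worked example: rho = 0.976, critical value = 0.648
result = assess_correlation(0.976, 0.648)
print(result)  # (True, 'positive')
```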