Chi-squared Test

This lesson covers: 

  1. How chi-squared tests can be used to analyse genetic crosses
  2. The key steps in the chi-squared test
  3. How to calculate the chi-squared statistic
  4. How to compare chi-squared values to critical values

The chi-squared test for genetic crosses

The chi-squared (χ²) test is a statistical test that scientists use to determine whether the difference between observed experimental results and expected theoretical results is significant. In genetics, it can assess whether the outcomes of a genetic cross differ significantly from the outcomes predicted by a specific inheritance pattern.


The χ² test can only be used when certain criteria are met:

  1. Large sample size
  2. Discrete data categories (like yes or no, heads or tails, red or blue)
  3. Using raw counts (not percentages or rates)
  4. A comparison of experimental and theoretical results


A large sample size is needed because more observations reduce the relative effect of chance on the difference between expected and observed results.
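To see why raw counts are required rather than percentages, here is a minimal Python sketch. It assumes the χ² formula given later in this lesson; the function name `chi_squared` and the larger sample are hypothetical, chosen only for illustration. Both samples have exactly the same percentages and are compared against a 3:1 expected ratio.

```python
def chi_squared(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Same proportions (69.375% : 30.625%) at two sample sizes,
# each compared against a 3:1 expected ratio.
small = chi_squared([111, 49], [120, 40])       # 160 offspring
large = chi_squared([1110, 490], [1200, 400])   # 1600 offspring (hypothetical)

print(round(small, 3))  # 2.7
print(round(large, 3))  # 27.0
```

The percentages are identical, yet the statistic is ten times larger for the larger sample. Converting counts to percentages would hide the sample size and give a misleading result.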

Overview of the χ² test

To carry out the χ² test, there are a few steps that you need to follow.


Steps in the χ² test:

  1. Propose an alternative hypothesis, which suggests that there is a significant difference between the observed and expected results, and that the difference is due to a factor other than chance.
  2. Propose a null hypothesis, which assumes that there is no significant difference between observed and expected results, and that any difference is due to chance alone.
  3. Predict the expected phenotypic ratios among the offspring.
  4. Conduct crosses and record the observed ratios.
  5. Calculate the χ² statistic.
  6. Compare the χ² value to the critical value at a chosen probability level, typically a 5% significance level (p = 0.05).


If the χ² value is equal to or higher than the critical value at the chosen probability level, the differences are unlikely to be due to chance alone, so the null hypothesis is rejected.


We accept the null hypothesis when the χ² value is lower than the critical value at the chosen probability level. This suggests that the differences between the observed and expected frequencies are due to chance.

Calculating the χ² statistic

To calculate the χ² statistic, use the formula:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$


Where O is the observed number and E is the expected number for each phenotype.


To calculate the χ² statistic:

  1. Calculate the expected values based on the expected phenotypic ratio.
  2. Record the observed values for each phenotype.
  3. For each phenotype, subtract the expected number from the observed number.
  4. Square these differences (to make the values positive).
  5. Divide each squared difference by the expected number.
  6. Repeat steps 1-5 for each phenotype and add these values together.
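The steps above can be sketched in Python as follows. This is a minimal illustration rather than a definitive implementation: the function name `chi_squared_statistic` and the sample counts are hypothetical and do not come from the lesson.

```python
def chi_squared_statistic(observed, expected):
    """Return the chi-squared statistic for paired observed/expected counts."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    total = 0.0
    for o, e in zip(observed, expected):
        difference = o - e           # step 3: observed minus expected
        squared = difference ** 2    # step 4: square the difference
        total += squared / e         # steps 5 and 6: divide by expected, add up
    return total

# Hypothetical counts for three phenotypes (100 observations in total)
print(chi_squared_statistic([30, 50, 20], [25, 50, 25]))  # 2.0
```

Squaring in step 4 means a shortfall and an excess of the same size contribute equally, so deviations in opposite directions cannot cancel out.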

Example of calculating a χ² statistic

Consider a monohybrid cross examining wing length in pure-breeding fruit flies. The phenotypes are normal wings (dominant) and vestigial wings (recessive), so a 3:1 ratio of normal to vestigial wings is expected in the F2 generation.


The information you may be provided with is as follows:

  • A homozygous dominant parent (NN) is crossed with a homozygous recessive parent (nn).
  • The F1 offspring are then crossed with each other, producing 160 offspring in the F2 generation.
  • In the F2 generation, there were 111 offspring observed with normal wings, and 49 observed with vestigial wings.


Inputting your calculations into a table may help organise the information.

| Phenotype | Expected ratio | Calculation of expected number | Expected number (E) | Observed number (O) | $O - E$ | $(O - E)^2$ | $\frac{(O - E)^2}{E}$ |
|---|---|---|---|---|---|---|---|
| Normal wings | 3 | $160 \times \frac{3}{4}$ | 120 | 111 | −9 | 81 | 0.675 |
| Vestigial wings | 1 | $160 \times \frac{1}{4}$ | 40 | 49 | 9 | 81 | 2.025 |

We then add together the two values for $\frac{(O - E)^2}{E}$:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

$$\chi^2 = 0.675 + 2.025$$

$$\chi^2 = 2.7$$
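The fruit fly calculation can be checked with a short Python sketch. The helper name `expected_counts` is an assumption made for illustration; `Fraction` from the standard library is used only to keep the arithmetic exact.

```python
from fractions import Fraction

def expected_counts(ratio, total):
    """Split `total` offspring according to a phenotypic ratio such as (3, 1)."""
    parts = sum(ratio)
    return [total * Fraction(r, parts) for r in ratio]

observed = [111, 49]                                # normal, vestigial
expected = expected_counts((3, 1), sum(observed))   # 120 and 40

# One (O, E, (O - E)^2 / E) row per phenotype, as in the table
rows = [(o, e, (o - e) ** 2 / e) for o, e in zip(observed, expected)]
for o, e, term in rows:
    print(o, e, float(term))

chi2 = float(sum(term for _, _, term in rows))
print(chi2)  # 2.7
```

Deriving the expected numbers from the observed total means the expected and observed counts always sum to the same number of offspring.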

Comparing the χ² statistic to the critical value

You then need to determine whether the difference between observed and expected values is statistically significant.


This is done as follows:

  1. Choose a suitable probability level, typically p = 0.05.
  2. Calculate the degrees of freedom (df): df = number of phenotypes − 1.
  3. Use a χ² table to find the critical value corresponding to the chosen probability level and degrees of freedom.
  4. If the χ² statistic is greater than or equal to the critical value, the difference is significant, and the null hypothesis is rejected.
  5. If the χ² statistic is less than the critical value, the difference is likely due to chance, and the null hypothesis can be accepted.
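The comparison steps can be expressed as a small lookup. This is a sketch under stated assumptions: the names `CRITICAL`, `degrees_of_freedom` and `is_significant` are hypothetical, and the critical values are copied from the table used in this lesson.

```python
# Critical values from the table in this lesson: CRITICAL[df][p]
CRITICAL = {
    1: {0.50: 0.46, 0.10: 2.71, 0.05: 3.84, 0.01: 6.63, 0.001: 10.83},
    2: {0.50: 1.39, 0.10: 4.60, 0.05: 5.99, 0.01: 9.21, 0.001: 13.82},
    3: {0.50: 2.37, 0.10: 6.25, 0.05: 7.81, 0.01: 11.35, 0.001: 16.27},
}

def degrees_of_freedom(n_phenotypes):
    """df = number of phenotypes - 1."""
    return n_phenotypes - 1

def is_significant(chi2, n_phenotypes, p=0.05):
    """True means the null hypothesis is rejected at probability level p."""
    critical = CRITICAL[degrees_of_freedom(n_phenotypes)][p]
    return chi2 >= critical

print(is_significant(2.7, 2))  # False: accept the null hypothesis
print(is_significant(7.5, 2))  # True: reject the null hypothesis
```

The two values tested, 2.7 and 7.5, are the statistics from the fruit fly and pea plant examples in this lesson, both with two phenotypes.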

Example of comparing the χ² statistic to the critical value

Let's see this in action with the fruit fly example above.


In this case:

  • With two phenotypes, df = 2 − 1 = 1
  • At the 0.05 probability level, the critical value for 1 df is 3.84 (see the table below; the table will be provided in an exam).
  • As the χ² statistic is 2.7, which is smaller than 3.84, we must accept the null hypothesis.
  • There is no significant difference between the observed and expected values, and any difference is due to chance.

| df | p = 0.50 | p = 0.10 | p = 0.05 | p = 0.01 | p = 0.001 |
|---|---|---|---|---|---|
| 1 | 0.46 | 2.71 | 3.84 | 6.63 | 10.83 |
| 2 | 1.39 | 4.60 | 5.99 | 9.21 | 13.82 |
| 3 | 2.37 | 6.25 | 7.81 | 11.35 | 16.27 |

Worked example - Calculating the χ² statistic for a pea plant cross

A pea plant cross where the expected ratio of tall to dwarf plants is 3:1 produced 160 offspring, where 135 were tall plants and 25 were dwarf plants.


Calculate the χ² statistic for this cross, and determine whether any difference between observed and expected values is due to chance or some other factor.


Use the following formula:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$


Where O is the observed number and E is the expected number for each phenotype.


Step 1: Calculate expected numbers

expected tall plants: $160 \times \frac{3}{4} = 120$

expected dwarf plants: $160 \times \frac{1}{4} = 40$


Step 2: Subtract expected values from observed values for each phenotype

tall plants: $135 - 120 = 15$

dwarf plants: $25 - 40 = -15$


Step 3: Square the differences

tall plants: $15^2 = 225$

dwarf plants: $(-15)^2 = 225$


Step 4: Divide the squared difference by the expected number

tall plants: $\frac{225}{120} = 1.875$

dwarf plants: $\frac{225}{40} = 5.625$


Step 5: Add the values together to find the χ² statistic

$$\chi^2 = 1.875 + 5.625 = 7.5$$


Step 6: Compare to the critical value at a 5% probability level

this determines whether the observed ratio significantly deviates from the expected 3:1 ratio

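as there are 2 phenotypes, df = 2 − 1 = 1

| df | p = 0.50 | p = 0.10 | p = 0.05 | p = 0.01 | p = 0.001 |
|---|---|---|---|---|---|
| 1 | 0.46 | 2.71 | 3.84 | 6.63 | 10.83 |
| 2 | 1.39 | 4.60 | 5.99 | 9.21 | 13.82 |
| 3 | 2.37 | 6.25 | 7.81 | 11.35 | 16.27 |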

the χ² statistic for the pea plant cross is 7.5, which is higher than the critical value at p = 0.05 (3.84)

this means we can reject the null hypothesis, and the differences between the observed and expected values are due to a factor other than chance
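As an optional check, the df = 1 row of the table can also be read across to see which two probability levels the statistic falls between. The sketch below is illustrative only: the names `P_LEVELS`, `CRITICAL_DF1` and `bracket_p` are hypothetical, and the values are copied from the table above.

```python
import bisect

# Row for df = 1 from the critical value table
P_LEVELS = [0.50, 0.10, 0.05, 0.01, 0.001]
CRITICAL_DF1 = [0.46, 2.71, 3.84, 6.63, 10.83]

def bracket_p(chi2):
    """Return (lower, upper) bounds on p from the df = 1 row.
    None means that bound lies outside the table."""
    i = bisect.bisect_right(CRITICAL_DF1, chi2)  # critical values <= chi2
    upper = P_LEVELS[i - 1] if i > 0 else None
    lower = P_LEVELS[i] if i < len(P_LEVELS) else None
    return lower, upper

# Pea plant cross: observed 135 tall and 25 dwarf, expected 120 and 40
chi2 = sum((o - e) ** 2 / e for o, e in zip([135, 25], [120, 40]))
print(chi2)             # 7.5
print(bracket_p(chi2))  # (0.001, 0.01)
print(bracket_p(2.7))   # (0.1, 0.5), the fruit fly example
```

For the pea plant cross, the probability that the difference arose by chance lies between 0.001 and 0.01, well below the 0.05 level. For the fruit fly cross it lies above 0.10, which agrees with accepting the null hypothesis there.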