Chi-squared Test Revision notes | A-Level Biology AQA

Chi-squared Test

This lesson covers:

How chi-squared tests can be used to analyse genetic crosses
The key steps in the chi-squared test
How to calculate the chi-squared statistic
How to compare chi-squared values to critical values

The chi-squared test for genetic crosses

The chi-squared (χ²) test is a statistical tool scientists use to measure any differences between observed experimental results and expected theoretical outcomes. Effectively, it can assess whether the outcomes of a genetic cross are significantly different from the outcomes predicted by a specific inheritance pattern.

The χ² test can only be used when certain criteria are met:

Large sample size
Discrete data categories (like yes or no, heads or tails, red or blue)
Using raw counts (not percentages or rates)
A comparison of experimental and theoretical results

More observations reduce the relative effect of chance on the difference between expected and observed results.

Overview of the χ² test

To carry out the χ² test, there are a few steps that you need to follow.

Steps in the χ² test:

Propose an alternative hypothesis, which suggests that there is a significant difference between the observed and expected results, and that the difference is due to a factor other than chance.
Propose a null hypothesis, which assumes that there is no significant difference between observed and expected results, and that any difference is due to chance alone.
Predict the expected phenotypic ratios among the offspring.
Conduct crosses and record the observed number of offspring with each phenotype.
Calculate the χ² statistic.
Compare the χ² value to the critical value at a chosen probability level, typically a 5% significance level (p = 0.05).

If the χ² value is greater than or equal to the critical value at the chosen probability level, it suggests that the differences are not due to random chance, leading to the rejection of the null hypothesis.

We accept the null hypothesis when the χ² value is lower than the critical value at the chosen probability level. This suggests that the differences between the observed and expected frequencies are due to chance.

Calculating the χ² statistic

To calculate the χ² statistic, use the formula:

$\chi^{2} = \sum \frac{(O - E)^{2}}{E}$

Where O is the observed number and E is the expected number for each phenotype.

To calculate the χ² statistic:

Calculate the expected values based on the expected phenotypic ratio.
Record the observed values for each phenotype.
For each phenotype, subtract the expected number from the observed number.
Square these differences (to make the values positive).
Divide each squared difference by the expected number.
Repeat steps 1-5 for each phenotype and add these values together.
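The steps above can be sketched as a short Python function. This is only an illustrative sketch: the function name `chi_squared_statistic` and the counts in the usage line are made up for this example and do not come from a real cross.

```python
def chi_squared_statistic(observed, expected_ratio):
    """Return the chi-squared statistic for observed counts and an expected ratio.

    observed       -- list of raw counts, one per phenotype
    expected_ratio -- list of ratio parts in the same order, e.g. [3, 1]
    """
    total = sum(observed)
    ratio_total = sum(expected_ratio)
    chi_squared = 0.0
    for o, part in zip(observed, expected_ratio):
        expected = total * part / ratio_total        # expected number (E)
        difference = o - expected                    # O - E
        chi_squared += difference ** 2 / expected    # (O - E)^2 / E, summed
    return chi_squared


# Hypothetical counts for illustration only: 152 and 48 offspring, expected 3:1.
print(round(chi_squared_statistic([152, 48], [3, 1]), 3))
```

The function takes raw counts rather than percentages, which matches the criteria for using the test.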

Example of calculating a χ² statistic

Consider a monohybrid cross examining wing length in pure-breeding fruit flies. The phenotypes are normal wings (dominant) and vestigial wings (recessive). This means a 3:1 ratio of normal to vestigial wings can be hypothesised in the F₂ generation.

The information you may be provided with is as follows:

A homozygous dominant parent (NN) is crossed with a homozygous recessive parent (nn).
The F₁ offspring (all Nn) are then crossed with each other, producing 160 offspring in the F₂ generation.
In the F₂ generation, there were 111 offspring observed with normal wings, and 49 observed with vestigial wings.

Inputting your calculations into a table may help organise the information.

Phenotype	Expected ratio	Calculation of expected number	Expected number (E)	Observed number (O)	$O - E$	$(O - E)^{2}$	$\frac{(O - E)^{2}}{E}$
Normal wings	3	$160 \times \frac{3}{4}$	120	111	-9	81	0.675
Vestigial wings	1	$160 \times \frac{1}{4}$	40	49	9	81	2.025

We then add together the two values of $\frac{(O - E)^{2}}{E}$:

$\chi^{2} = \sum \frac{(O - E)^{2}}{E}$

$\chi^{2} = 0.675 + 2.025$

$\chi^{2} = 2.7$
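As a check on the arithmetic, the fruit fly table can be rebuilt row by row in Python. This is a minimal sketch using the numbers from the example; the variable names are chosen here for illustration.

```python
total_offspring = 160
rows = [
    # (phenotype, ratio part, observed number)
    ("Normal wings", 3, 111),
    ("Vestigial wings", 1, 49),
]
ratio_total = sum(part for _, part, _ in rows)

chi_squared = 0.0
for phenotype, part, observed in rows:
    expected = total_offspring * part / ratio_total   # E
    difference = observed - expected                  # O - E
    contribution = difference ** 2 / expected         # (O - E)^2 / E
    chi_squared += contribution
    print(f"{phenotype}: E = {expected:.0f}, O - E = {difference:+.0f}, "
          f"(O - E)^2 / E = {contribution:.3f}")

print(f"chi-squared = {chi_squared:.2f}")
```

Each printed row should agree with the matching row of the table, and the final line with the χ² value of 2.7.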

Comparing the χ² statistic to the critical value

You then need to determine whether the difference between observed and expected values is statistically significant.

This is done as follows:

Choose a suitable probability level, typically p = 0.05.
Calculate the degrees of freedom (df): df = number of phenotypes − 1.
Use a χ² table to find the critical value corresponding to the chosen probability level and degrees of freedom.
If the χ² statistic is greater than or equal to the critical value, the difference is significant, and the null hypothesis is rejected.
If the χ² statistic is less than the critical value, the difference is likely due to chance, and the null hypothesis can be accepted.
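The decision rule can be written as a small Python sketch. The critical values are copied from the p = 0.05 column of the critical value table provided in these notes, so it only covers 1 to 3 degrees of freedom. The function names are illustrative, and the value 6.2 in the usage line is a made-up statistic.

```python
# Critical values at p = 0.05 for df = 1, 2 and 3 (from the table in these notes).
CRITICAL_VALUES_P05 = {1: 3.84, 2: 5.99, 3: 7.81}


def degrees_of_freedom(number_of_phenotypes):
    """df = number of phenotypes - 1."""
    return number_of_phenotypes - 1


def null_hypothesis_rejected(chi_squared, number_of_phenotypes):
    """Return True if the statistic reaches the critical value at p = 0.05."""
    df = degrees_of_freedom(number_of_phenotypes)
    critical_value = CRITICAL_VALUES_P05[df]
    return chi_squared >= critical_value


# Hypothetical statistic of 6.2 from a cross with three phenotypes (df = 2).
print(null_hypothesis_rejected(6.2, 3))
```

With three phenotypes the comparison is made against 5.99 rather than 3.84, which is why the degrees of freedom must be worked out before reading the table.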

Example of comparing the χ² statistic to the critical value

Let's see this in action with the fruit fly example above.

In this case:

With two phenotypes, df = 2 − 1 = 1
At the 0.05 probability level, the critical value for 1 df is 3.84 (see the table below; the table will be provided in an exam).
As the χ² statistic is 2.7, which is smaller than 3.84, we must accept the null hypothesis.
There is no significant difference between the observed and expected values, and any difference is due to chance.

df	p = 0.50	p = 0.10	p = 0.05	p = 0.01	p = 0.001
1	0.46	2.71	3.84	6.63	10.83
2	1.39	4.60	5.99	9.21	13.82
3	2.37	6.25	7.81	11.35	16.27
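The whole table, not just the p = 0.05 column, can be used to see roughly where a χ² value falls. The sketch below is one possible way to do this in Python, assuming the table values above; the function name `smallest_p_reached` is invented for this illustration.

```python
P_LEVELS = [0.50, 0.10, 0.05, 0.01, 0.001]
CRITICAL_VALUES = {
    1: [0.46, 2.71, 3.84, 6.63, 10.83],
    2: [1.39, 4.60, 5.99, 9.21, 13.82],
    3: [2.37, 6.25, 7.81, 11.35, 16.27],
}


def smallest_p_reached(chi_squared, df):
    """Return the smallest tabulated p whose critical value is reached.

    Returns None if the statistic is below every critical value in the row.
    """
    reached = None
    for p, critical in zip(P_LEVELS, CRITICAL_VALUES[df]):
        if chi_squared >= critical:
            reached = p
    return reached


# Fruit fly example: chi-squared = 2.7 with 1 degree of freedom.
print(smallest_p_reached(2.7, 1))
```

For the fruit fly cross, 2.7 reaches the critical value for p = 0.50 but not the one for p = 0.10, so the probability that the difference is due to chance lies between 0.10 and 0.50. This is greater than 0.05, which agrees with the conclusion that the difference is not significant.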

Worked example - Calculating the χ² statistic for a pea plant cross

A pea plant cross where the expected ratio of tall to dwarf plants is 3:1 produced 160 offspring, where 135 were tall plants and 25 were dwarf plants.

Calculate the χ² statistic for this cross, and determine whether any difference between observed and expected values is due to chance or some other factor.

Use the following formula:

$\chi^{2} = \sum \frac{(O - E)^{2}}{E}$

Where O is the observed number and E is the expected number for each phenotype.

Step 1: Calculate expected numbers

expected tall plants: $160 \times \frac{3}{4} = 120$

expected dwarf plants: $160 \times \frac{1}{4} = 40$

Step 2: Subtract expected values from observed values for each phenotype

tall plants: 135−120=15

dwarf plants: 25−40=−15

Step 3: Square the differences

tall plants: $15^{2} = 225$

dwarf plants: $(-15)^{2} = 225$

Step 4: Divide the squared difference by the expected number

tall plants: $\frac{225}{120} = 1.875$

dwarf plants: $\frac{225}{40} = 5.625$

Step 5: Add the values together to find χ²

$\chi^{2} = 1.875 + 5.625 = 7.5$

Step 6: Compare to the critical value at a 5% probability level

this determines whether the observed ratio significantly deviates from the expected 3:1 ratio

as there are 2 phenotypes, df = 2 − 1 = 1

df	p = 0.50	p = 0.10	p = 0.05	p = 0.01	p = 0.001
1	0.46	2.71	3.84	6.63	10.83
2	1.39	4.60	5.99	9.21	13.82
3	2.37	6.25	7.81	11.35	16.27

the χ² statistic for the pea plant cross is 7.5, which is higher than the critical value at p = 0.05 (3.84)

this means we can reject the null hypothesis, and the differences between the observed and expected values are due to a factor other than chance
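The whole worked example can be reproduced in a few lines of Python. This is a sketch for checking the working only; the variable names are illustrative, and the critical value of 3.84 is the p = 0.05, df = 1 entry from the table.

```python
observed = {"tall": 135, "dwarf": 25}
expected_ratio = {"tall": 3, "dwarf": 1}
critical_value = 3.84  # p = 0.05, df = 1, taken from the critical value table

total = sum(observed.values())
ratio_total = sum(expected_ratio.values())

chi_squared = 0.0
for phenotype, o in observed.items():
    e = total * expected_ratio[phenotype] / ratio_total  # steps 1: expected number
    chi_squared += (o - e) ** 2 / e                      # steps 2 to 5

df = len(observed) - 1                                   # number of phenotypes - 1
reject_null = chi_squared >= critical_value              # step 6
print(f"chi-squared = {chi_squared:.1f}, df = {df}, reject null: {reject_null}")
```

Changing the observed counts to 111 and 49 gives the fruit fly result instead, where the null hypothesis is accepted.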