
Shapiro-Wilk Test

Some statistical tests ask you to assume that your data follows a normal distribution. The Shapiro-Wilk Test compares your data to data from a normal distribution with the same mean and standard deviation as your sample.

When to use

  • You want to test whether your data can be assumed to follow the normal distribution.
  • Your sample contains fewer than 2,000 observations.
  • Your data is continuous (see Types of Data for more information).

NOTE: The test is usually recommended for samples of fewer than 2,000 observations. For larger samples, use the Kolmogorov-Smirnov Goodness of Fit Test.

Features

If the test is NOT significant (the p-value is greater than your chosen significance level, usually 0.05), there is no evidence against normality and the data can be assumed to be normal. If the test is significant, you cannot assume that the data is normally distributed.

The Shapiro-Wilk test has certain assumptions that need to be met for accurate results:

  • The observations should be independent of one another.
  • The data should be continuous.
  • The data should not contain outliers.
  • The data should not show marked skewness or heavy "tails" (kurtosis). See Measures of Central Tendency for more information.

The hypotheses for the test are:

H0: the data follows the normal distribution.

H1: the data does not follow the normal distribution.

The easiest way to carry out this test is to use an online Shapiro-Wilk calculator.
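For readers who would rather see the idea in code, here is a minimal Python sketch of how a data set can be compared with a normal distribution. It is not the Shapiro-Wilk test itself: it computes a simpler, closely related quantity (a Shapiro-Francia style statistic, the squared correlation between the sorted data and the values expected from a normal distribution) and it gives no p-value. The function names and the use of Blom's plotting positions are assumptions made for this illustration. The data are the sunny leaf widths from the exercise below. For a real analysis, a tested library routine such as scipy.stats.shapiro would be the usual choice.

```python
from statistics import NormalDist, mean


def normal_scores(n):
    """Approximate the expected values of n sorted standard-normal
    observations, using Blom's plotting positions (an assumption of
    this sketch, not something specified in the text)."""
    std_normal = NormalDist()
    return [std_normal.inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]


def qq_correlation_squared(data):
    """Squared correlation between the sorted data and the normal scores.

    This is a Shapiro-Francia style statistic, a simplified relative of
    the Shapiro-Wilk W. It always lies between 0 and 1.
    """
    xs = sorted(data)
    ms = normal_scores(len(xs))
    x_bar, m_bar = mean(xs), mean(ms)
    sxy = sum((x - x_bar) * (m - m_bar) for x, m in zip(xs, ms))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    smm = sum((m - m_bar) ** 2 for m in ms)
    return sxy ** 2 / (sxx * smm)


# Sunny leaf widths (mm) from the exercise in this section.
sunny = [32, 24, 30, 33, 61, 26, 32, 37, 43, 31, 38, 26]
w_sunny = qq_correlation_squared(sunny)
print(f"Squared Q-Q correlation for sunny leaves: {w_sunny:.3f}")
```

Values close to 1 mean the sorted data line up well with what a normal distribution would produce; lower values point away from normality. Turning the statistic into a p-value needs the full Shapiro-Wilk procedure, which is why a calculator or a statistics library is recommended.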

Your Turn

An ecologist was investigating woodland microhabitats, contrasting the communities in a shaded position with those in full light. One of the plants was ivy (Hedera helix). Leaf widths were measured, but because the size of the leaves varied with the position on the plant, only the 4th leaf from each stem tip was measured. The results from the plants available were as follows.

Width of sunny leaves (mm): 32, 24, 30, 33, 61, 26, 32, 37, 43, 31, 38, 26

Width of shady leaves (mm): 34, 16, 45, 41, 36, 33, 37, 42, 35, 35, 36, 36

You now wish to check whether it is reasonable to assume that your data is normally distributed.

First, test your sunny leaf data using an online Shapiro-Wilk calculator.

What do you conclude? Tick all that apply.

You now know that you cannot assume your "sunny" data is normally distributed as it has a positive skew.
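The positive skew mentioned above can be checked numerically. The sketch below computes a moment-based skewness coefficient for the sunny leaf widths; the function name and the choice of the simple (population-moment) formula are assumptions of this illustration, and other software may use a slightly different small-sample correction.

```python
from statistics import mean


def sample_skewness(data):
    """Moment-based skewness g1 = m3 / m2**1.5.

    m2 and m3 are the second and third central moments, each divided
    by n (no small-sample correction is applied in this sketch).
    """
    n = len(data)
    x_bar = mean(data)
    m2 = sum((x - x_bar) ** 2 for x in data) / n
    m3 = sum((x - x_bar) ** 3 for x in data) / n
    return m3 / m2 ** 1.5


# Sunny leaf widths (mm) from the exercise above.
sunny = [32, 24, 30, 33, 61, 26, 32, 37, 43, 31, 38, 26]
skew_sunny = sample_skewness(sunny)
print(f"Skewness of sunny leaf widths: {skew_sunny:.2f}")
```

A value above zero indicates a longer tail towards large values. Here that tail comes mainly from the single 61 mm leaf.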

 

Now test your shady leaf data using the online calculator.

What do you conclude?

 

It is not reasonable to assume that your data is normally distributed.

Your calculated p-value is 0.006994, which is less than the significance level of 0.05 (5%). You therefore reject H0 and cannot reasonably assume that the data is normally distributed.

EXTENSION QUESTION

Because you cannot assume your data is normally distributed, which statistical tests could you do to compare the sunny and shady data points? Tick all that apply.

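The Unpaired t-Test assumes that the data follows a normal distribution, so you cannot use this test.

You cannot use the Wilcoxon signed-rank test, as this test is for paired (repeated) samples, and the sunny and shady leaves are independent groups.

BUT you can do a Mann-Whitney Test, which compares two independent samples without assuming a normal distribution.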
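As a rough illustration of what the Mann-Whitney test calculates, the sketch below ranks all the leaf widths together (tied values share the average of their ranks) and derives the U statistic for each group. The function names are hypothetical and chosen for this example only. The sketch stops at the U values and does not produce a p-value; for that you would compare the smaller U with a table of critical values or use a library routine such as scipy.stats.mannwhitneyu.

```python
def midranks(values):
    """Rank values from 1 to n; tied values get the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_u(a, b):
    """Return (U_a, U_b) for two independent samples a and b."""
    ranks = midranks(list(a) + list(b))
    rank_sum_a = sum(ranks[:len(a)])
    u_a = rank_sum_a - len(a) * (len(a) + 1) / 2
    u_b = len(a) * len(b) - u_a
    return u_a, u_b


# Leaf widths (mm) from the exercise above.
sunny = [32, 24, 30, 33, 61, 26, 32, 37, 43, 31, 38, 26]
shady = [34, 16, 45, 41, 36, 33, 37, 42, 35, 35, 36, 36]

u_sunny, u_shady = mann_whitney_u(sunny, shady)
print(f"U (sunny) = {u_sunny}, U (shady) = {u_shady}")
```

The two U values always add up to the product of the two sample sizes (here 12 × 12 = 144), which is a useful check on the arithmetic. The test statistic is usually taken to be the smaller of the two.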

 

