Goodness of fit samples - Cinema Tickets
What do the samples tell us?
In this activity we will examine several samples from two different populations to explore the chi-squared goodness of fit test and how it measures how plausible it is that a given sample came from a given population.
Activity
This is a very simple activity that your teacher will guide you through. The basic premise is that you will be taking lots of samples from a given population to explore how samples can differ and what happens each time when you use a goodness of fit test to establish how likely it is that your sample comes from an expected population. In this particular case study, the idea is that a cinema believes its ticket sales are distributed in a certain way. See the diagram below. You will take samples to see if this is how it worked out.
You will need the cards in this spreadsheet. You may also need the large printable collection table on page 4 of the spreadsheet and/or the digital one on page 1. You may also want to record your results on this worksheet (use 'download as' for different formats), either digitally or with pen and paper.
Step 1 - Print and cut out the cards for population 1 and place them in a bowl from which a sample can easily be taken.
Step 2 - Take a sample
Step 3 - Count how many cards there are for each day in the sample
Step 4 - Carry out a goodness of fit test to see how likely it is that the sample came from the expected population
Step 5 - Repeat for another 3 or 4 samples
Step 6 - Repeat all the above for Population 2
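As a rough illustration of Step 4, here is a minimal Python sketch of a goodness of fit calculation, assuming the categories are the seven days of the week (so 6 degrees of freedom). The expected proportions and the observed counts below are made up for illustration and are not the values from the diagram or the cards, and the function names are only illustrative. In class, the spreadsheet or a GDC does this job.

```python
import math

# HYPOTHETICAL expected proportions of ticket sales per day.
# These are NOT the activity's real values; replace them with the diagram's.
EXPECTED_PROPORTIONS = {
    "Mon": 0.10, "Tue": 0.10, "Wed": 0.10, "Thu": 0.10,
    "Fri": 0.20, "Sat": 0.25, "Sun": 0.15,
}

# HYPOTHETICAL observed counts from one sample of 50 cards.
OBSERVED_SAMPLE = {
    "Mon": 4, "Tue": 6, "Wed": 5, "Thu": 3,
    "Fri": 11, "Sat": 14, "Sun": 7,
}


def chi_squared_statistic(observed, proportions):
    """Sum of (observed - expected)^2 / expected over all categories."""
    n = sum(observed.values())
    total = 0.0
    for day, p in proportions.items():
        expected = n * p
        total += (observed[day] - expected) ** 2 / expected
    return total


def chi_squared_p_value(statistic, df):
    """Upper-tail probability of a chi-squared distribution with df degrees
    of freedom, via the series for the regularised lower incomplete gamma
    function. Adequate for classroom-sized statistics."""
    if statistic <= 0:
        return 1.0
    a, x = df / 2, statistic / 2
    term = 1.0 / a
    total = term
    n = 1
    while term > total * 1e-15:
        term *= x / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-x + a * math.log(x) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


if __name__ == "__main__":
    stat = chi_squared_statistic(OBSERVED_SAMPLE, EXPECTED_PROPORTIONS)
    df = len(EXPECTED_PROPORTIONS) - 1  # categories minus one
    print(f"chi-squared = {stat:.3f}, df = {df}, "
          f"p-value = {chi_squared_p_value(stat, df):.3f}")
```

A large p-value means the sample is consistent with the expected distribution; a p-value below the chosen significance level (often 5%) counts as evidence against it.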
A ToK moment - This particular topic is a total goldmine for ToK and something I hope students can write about in their essays or exhibitions. Very specifically, we are talking about trying to draw valuable conclusions about entire populations from a sample. The methodology is hypothesis testing, a neat combination of a rigorous mathematical technique, often used in the human and natural sciences, that leads into inductive reasoning. It is a great journey to discuss. Probabilities can be accurately calculated for very complex situations. You only need to look closely at the chi-squared distribution to see that. The path from these calculations towards truth and certainty is much cloudier, of course, and we have to make some important decisions about the reliability of any conclusions we draw from hypothesis testing. There is also the possibility of an explicit focus on sampling and the reliability and significance of different methods. We could also zero in on the current debate about the binary nature of significance levels, where two results that are very close might fall either side of an arbitrary boundary, which means we draw different conclusions from them. It is all really rich material.
A global view
Teacher Notes
This is a pretty simple and self-explanatory activity. There is a bit of set-up time in terms of printing and cutting out the cards. I think this is worth it. I know we can get technology to generate random samples for us, but I think there is no substitute for a very tangible and concrete experience here, where students see both a population and a sample. For what it's worth, I printed these, laminated them and cut them out in about 30 minutes. Now my colleagues and I can use them year on year. I think it's worth it!
Step 1 - Prepare the populations and put them in a bowl or something from which a sample can easily be taken.
Step 2 - Take a sample from population 1 - this could be done in a variety of ways, depending on how much you want to focus on sampling techniques here. You might have done this on a separate occasion and therefore choose to focus straight in on the samples themselves. You might take more time and explore sampling techniques. The quickest might be to have your students in groups of two or three, mix up the cards in a bowl and then deal out a handful to each group for counting.
Step 3 - Collect the information from the sample. In the first instance I like a large-scale table where everyone puts their cards in the right cell and then we tally. I think this is another good concrete experience that helps to reinforce the notion of observed frequencies. Alternatively, or eventually, we can do some counting between us as a class and summarise on paper tables or using the provided spreadsheet.
Step 4 - Complete a goodness of fit test on the data, using either the spreadsheet or GDCs
Step 5 - Repeat 2 or 3 more times. The idea is to see first hand that samples will be different from each other. This is the nature of sampling. In each case, though, we use the test to examine the probability that this sample came from a population that was distributed as expected.
Population 1 is distributed as expected
Population 2 is slightly different
Step 6 - Now repeat with the second population - you should get some different results!
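For anyone who wants a digital companion to the cards, the repeated sampling in Steps 2 to 6 could be sketched in Python roughly as follows. Everything numeric here is an assumption for illustration: the card counts for both populations, the expected proportions, the sample size of 50 and the function names are hypothetical, not taken from the spreadsheet. The only fixed fact used is the 5% critical value for 6 degrees of freedom, 12.592.

```python
import random

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

# HYPOTHETICAL expected proportions (replace with the activity's real ones).
EXPECTED = {"Mon": 0.10, "Tue": 0.10, "Wed": 0.10, "Thu": 0.10,
            "Fri": 0.20, "Sat": 0.25, "Sun": 0.15}

# HYPOTHETICAL card counts: population 1 matches EXPECTED exactly,
# population 2 is slightly different.
POPULATION_1 = {"Mon": 20, "Tue": 20, "Wed": 20, "Thu": 20,
                "Fri": 40, "Sat": 50, "Sun": 30}
POPULATION_2 = {"Mon": 20, "Tue": 30, "Wed": 20, "Thu": 20,
                "Fri": 40, "Sat": 40, "Sun": 30}

CRITICAL_5_PERCENT_DF6 = 12.592  # chi-squared critical value, 6 df, 5% level


def build_deck(card_counts):
    """Turn {day: number of cards} into a list with one entry per card."""
    return [day for day, count in card_counts.items() for _ in range(count)]


def draw_sample(deck, size, rng):
    """Draw cards without replacement, like taking a handful from the bowl."""
    return rng.sample(deck, size)


def count_days(sample):
    """Observed frequency of each day in the sample."""
    return {day: sample.count(day) for day in DAYS}


def chi_squared_statistic(observed, proportions):
    n = sum(observed.values())
    return sum((observed[d] - n * p) ** 2 / (n * p)
               for d, p in proportions.items())


def run_trials(card_counts, sample_size, trials, rng):
    """Take several samples and return the test statistic for each one."""
    deck = build_deck(card_counts)
    return [chi_squared_statistic(
                count_days(draw_sample(deck, sample_size, rng)), EXPECTED)
            for _ in range(trials)]


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so a run can be repeated
    for name, population in [("Population 1", POPULATION_1),
                             ("Population 2", POPULATION_2)]:
        for stat in run_trials(population, sample_size=50, trials=4, rng=rng):
            verdict = "reject" if stat > CRITICAL_5_PERCENT_DF6 else "do not reject"
            print(f"{name}: chi-squared = {stat:.2f} -> {verdict} at 5%")
```

Changing the seed or the number of trials shows the same thing the cards do: the statistic varies from sample to sample, even when the population is unchanged.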