Friday, January 23, 2015

M&M ANOVA

Khan Academy does a nice job of explaining ANOVA at this link.  This is in fact where I learned it.  Below I have a nice application of ANOVA using M&Ms that I would like to share.

There are numerous tables below which can be glided over without any loss of understanding.  Indeed, if you just read the prose in between the tables you will be far better off.

In the Fall of 2014, I assigned a series of activities involving M&Ms to my Elementary Statistics class.  These activities begin here. There were six groups of students involved and each group took a sample of ten bags of M&Ms.  These samples are listed below.

Group 1
Color/Bag 1 2 3 4 5 6 7 8 9 10
Red 2 5 2 4 3 7 3 1 7 5
Orange 7 3 6 5 6 7 5 7 6 9
Yellow 2 3 0 1 2 0 2 4 3 4
Green 3 0 6 6 3 2 1 3 4 2
Blue 3 5 4 2 4 7 4 5 4 5
Brown 1 1 1 0 1 3 1 1 1 3

Group 2
Color/Bag 1 2 3 4 5 6 7 8 9 10
Red 2 2 3 2 2 3 1 6 3 3
Orange 3 4 0 4 4 4 3 2 2 3
Yellow 2 5 5 1 1 1 0 1 2 4
Green 3 1 5 4 4 3 4 2 4 2
Blue 3 2 2 4 3 3 6 1 4 4
Brown 3 2 1 1 2 2 2 4 1 1
Group 3
Color/bag 1 2 3 4 5 6 7 8 9 10
Red 1 1 1 0 2 3 0 4 0 0
Orange 2 0 1 2 5 6 5 3 1 0
Yellow 1 1 2 0 3 5 0 7 1 3
Green 1 3 1 2 2 1 5 4 1 1
Blue 2 2 4 2 4 5 5 2 2 1
Brown 1 0 0 1 2 0 3 1 2 1
Group 4
Color/Bag 1 2 3 4 5 6 7 8 9 10
Red 4 5 1 3 1 0 2 2 3 2
Orange 1 1 7 3 3 6 3 5 4 6
Yellow 1 4 6 0 3 2 3 0 3 4
Green 2 3 1 8 4 4 5 6 1 3
Blue 5 1 1 5 4 2 3 3 3 2
Brown 2 3 4 1 3 3 2 1 2 0
Group 5
Color/Bag 1 2 3 4 5 6 7 8 9 10
Red 2 3 1 5 5 2 0 1 5 5
Orange 2 0 1 3 2 3 3 1 5 3
Yellow 6 2 5 2 3 4 7 7 1 5
Green 2 5 5 7 1 2 4 6 5 1
Blue 1 5 6 2 4 5 4 1 2 3
Brown 3 6 0 1 3 2 3 2 1 1
Group 6
Color/Bag 1 2 3 4 5 6 7 8 9 10
Red 1 2 2 4 3 3 1 2 1 7
Orange 3 3 1 1 3 1 3 2 1 2
Yellow 3 5 3 3 2 2 3 6 4 4
Green 3 1 7 3 3 6 2 2 4 2
Blue 4 4 1 4 3 3 3 4 4 3
Brown 2 1 4 1 4 2 4 2 5 1
This is a nice collection of real data and my thought was to make the most of it.  As a sample size of ten is small, I wanted to pool the data, but before this can be legitimately done, it must be justified.  One might argue that since all of the samples were taken from M&M's that might be justification enough, but I had lingering doubts.  What if the proportion of each M&M color is not consistent from batch to batch? What if M&Ms are put out in a variety of Fun Sizes?  What if my students had just royally goofed?  In order to be careful, I decided that after having taught elementary statistics for twenty years it was time to learn ANOVA.

I first wanted to get a good confidence interval for the average number of M&Ms per bag. Using the data for each group, I found the total for each bag.  I put those totals into the following table:

Bag/Group 1 2 3 4 5 6
1 18 16 8 15 16 16
2 17 16 7 17 21 16
3 19 16 9 20 18 18
4 18 16 7 20 20 16
5 19 16 18 18 18 18
6 26 16 20 17 18 17
7 16 16 18 18 21 16
8 21 16 21 17 18 18
9 25 16 7 16 19 19
10 28 17 6 17 18 19
I then calculated the mean for each of the groups individually and the grand mean of the total pooled data.  Using this, I calculated the sum of the squares for differences within each of the groups and the sum of the squares for differences between the groups.  Those calculations are in the table below:

DATA
Bag/Group 1 2 3 4 5 6
1 18 16 8 15 16 16
2 17 16 7 17 21 16
3 19 16 9 20 18 18
4 18 16 7 20 20 16
5 19 16 18 18 18 18
6 26 16 20 17 18 17
7 16 16 18 18 21 16
8 21 16 21 17 18 18
9 25 16 7 16 19 19
10 28 17 6 17 18 19
Grand Mean: 17.07
Group Means (groups 1 to 6): 20.7 16.1 12.1 17.5 18.7 17.3

SSW (squared difference between each bag total and its group mean)
1 2 3 4 5 6
7.29 0.01 16.81 6.25 7.29 1.69
13.69 0.01 26.01 0.25 5.29 1.69
2.89 0.01 9.61 6.25 0.49 0.49
7.29 0.01 26.01 6.25 1.69 1.69
2.89 0.01 34.81 0.25 0.49 0.49
28.09 0.01 62.41 0.25 0.49 0.09
22.09 0.01 34.81 0.25 5.29 1.69
0.09 0.01 79.21 0.25 0.49 0.49
18.49 0.01 26.01 2.25 0.09 2.89
53.29 0.81 37.21 0.25 0.49 2.89
Sums
156.10 0.90 352.90 22.50 22.10 14.10

SSB (squared difference between each group mean and the grand mean, repeated for each of the 10 bags)
1 2 3 4 5 6
13.20 0.93 24.67 0.19 2.67 0.05
13.20 0.93 24.67 0.19 2.67 0.05
13.20 0.93 24.67 0.19 2.67 0.05
13.20 0.93 24.67 0.19 2.67 0.05
13.20 0.93 24.67 0.19 2.67 0.05
13.20 0.93 24.67 0.19 2.67 0.05
13.20 0.93 24.67 0.19 2.67 0.05
13.20 0.93 24.67 0.19 2.67 0.05
13.20 0.93 24.67 0.19 2.67 0.05
13.20 0.93 24.67 0.19 2.67 0.05
Sums
132.01 9.34 246.68 1.88 26.68 0.54
The SSW sums to 568.6 and the SSB sums to 417.1.  The numerator has m-1=5 degrees of freedom as we are comparing m=6 groups. The denominator has m*(n-1)=6*(10-1)=54 degrees of freedom as each of those groups took a sample of size n=10.  This gives an F test statistic of F=(417.1/5)/(568.6/54)=7.92.  The critical number for those degrees of freedom with a significance level of alpha=0.10 is 1.957.  As 7.92 is greater than 1.957, we must conclude that these samples are not all drawn from the same population.
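For readers who would rather check the arithmetic in code than in a spreadsheet, here is a minimal sketch in plain Python. It assumes the bag totals are typed in exactly as they appear in the DATA table; the variable names are my own, and the critical number is simply the one quoted above rather than something the code looks up.

```python
# Bag totals for each group, copied column by column from the DATA table.
groups = {
    1: [18, 17, 19, 18, 19, 26, 16, 21, 25, 28],
    2: [16, 16, 16, 16, 16, 16, 16, 16, 16, 17],
    3: [8, 7, 9, 7, 18, 20, 18, 21, 7, 6],
    4: [15, 17, 20, 20, 18, 17, 18, 17, 16, 17],
    5: [16, 21, 18, 20, 18, 18, 21, 18, 19, 18],
    6: [16, 16, 18, 16, 18, 17, 16, 18, 19, 19],
}

all_bags = [x for bags in groups.values() for x in bags]
grand_mean = sum(all_bags) / len(all_bags)
group_means = {g: sum(bags) / len(bags) for g, bags in groups.items()}

# Sum of squares within: each bag against its own group mean.
ssw = sum((x - group_means[g]) ** 2 for g, bags in groups.items() for x in bags)

# Sum of squares between: each group mean against the grand mean,
# counted once per bag in that group.
ssb = sum(len(bags) * (group_means[g] - grand_mean) ** 2
          for g, bags in groups.items())

m = len(groups)            # number of groups
n = 10                     # bags per group
df_between = m - 1         # 5
df_within = m * (n - 1)    # 54

f_stat = (ssb / df_between) / (ssw / df_within)
critical_f = 1.957         # quoted in the text for alpha = 0.10, not computed here

print(f"SSW = {ssw:.1f}, SSB = {ssb:.1f}, F = {f_stat:.2f}")
print("Reject the null hypothesis" if f_stat > critical_f else "Cannot reject")
```

Dropping a group from the dictionary (group 3, say) and setting m accordingly is an easy way to rerun the comparison on the remaining groups, though the critical number would then have to be looked up again for the new degrees of freedom.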

This came as something of a surprise to me.  As an educator with over 30 years of experience, I immediately suspected student error.  Looking at the SSW and SSB tables above, I noted that the numbers from group 3 were considerably larger than the rest.  I was curious as to whether and how they had erred.  Discerning this was easy because I had had the students document their process.  In looking at the documentation from group 3, I found the following photograph:




The student had been told to use Fun Size M&Ms.  It was assumed that they would be plain and that the bags would not be mixed.  We are well tutored in how one spells ass-u-me.

I would be remiss at this point, however, if I did not say that I had pushed this further. Eliminating group 3 does not fix the problem.  The remaining groups are not all sampling the same population, and an examination of the documentation of the other groups does not reveal a similar glaring error in methods.   Of all six groups, only 4 and 6 seem to be sampling the same population.

I will be having my class do a similar experiment this semester--with better instructions from the teacher--and after this I will conduct this study again.

Wednesday, January 21, 2015

M&M Binomial

This uses data gathered from the M&M Activity.

Let's begin this with a thought experiment.  Imagine you have the job of filling bags with M&Ms.  One might imagine that there is a huge bin that has been filled with M&Ms and that you are just parcelling them out into the bags.

There is a mountain of M&Ms and a certain proportion of these are red, orange, yellow, green, blue, and brown.  The number is so large that the act of choosing, say, a red M&M on one trial does not appreciably reduce the probability of getting one on the next trial.  All of this argues that, if one is interested in particular colors, each trial is a Bernoulli Trial with a fixed probability of success.

Each M&M is about the same weight and you are aiming to fill your bag to at least a certain weight but not more than one M&M more than that.  This results in the bags having a small variation in number of M&Ms per bag.  All of this means that we have a, more or less, fixed number of M&Ms per bag.

We combine these two observations, and what we have looks astonishingly like a Binomial Experiment.

Using a sample of 10 bags of Plain M&Ms, I investigated whether they did in fact follow the binomial distribution for red M&Ms.  This sample contained a total of 182 M&Ms of which 28 were red.  This yields a proportion p=0.154 of red M&Ms. The ten bags contained from 17 to 19 each. I therefore chose to set n=19.

Given those, I used my data to create a frequency table.  In the table below, the first column is the possible number of successes for a binomial experiment with 19 trials, that is to say the numbers from 0 to 19 inclusive; keeping with standard notation, that column is labeled x.  The second column is the number of bags that had that many red M&Ms.  As we are setting up to do a chi square test to see if the binomial model fits, we have labeled the frequency column with an O.

x O
0 1
1 2
2 2
3 0
4 4
5 0
6 1
7 0
8 0
9 0
10 0
11 0
12 0
13 0
14 0
15 0
16 0
17 0
18 0
19 0

We ask the question of how well this fits with the expectations of the binomial distribution B(19,0.154).  I do the sample calculation for x=3 below:
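As a sketch of that sample calculation, under the assumption (inferred from the E column that follows) that the expected count is the binomial probability multiplied by the 10 bags in the sample:

```latex
E_3 = 10 \cdot \binom{19}{3} (0.154)^{3} (0.846)^{16}
    \approx 10 \cdot 969 \cdot 0.00365 \cdot 0.0689
    \approx 10 \cdot 0.2437
    \approx 2.44
```

The symbol E_3 is my own label for the expected number of bags containing exactly 3 red M&Ms.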

Putting this calculation into a spread sheet, I obtain:

x O E
0 1 0.42
1 2 1.45
2 2 2.36
3 0 2.44
4 4 1.77
5 0 0.97
6 1 0.41
7 0 0.14
8 0 0.04
9 0 0.01
10 0 0.00
11 0 0.00
12 0 0.00
13 0 0.00
14 0 0.00
15 0 0.00
16 0 0.00
17 0 0.00
18 0 0.00
19 0 0.00

We can then compare the values of the O and the E columns by using the (O-E)^2/E measure.  The results are in the table below:

x O E (O-E)^2/E
0 1 0.42 0.81
1 2 1.45 0.21
2 2 2.36 0.06
3 0 2.44 2.44
4 4 1.77 2.80
5 0 0.97 0.97
6 1 0.41 0.85
7 0 0.14 0.14
8 0 0.04 0.04
9 0 0.01 0.01
10 0 0.00 0.00
11 0 0.00 0.00
12 0 0.00 0.00
13 0 0.00 0.00
14 0 0.00 0.00
15 0 0.00 0.00
16 0 0.00 0.00
17 0 0.00 0.00
18 0 0.00 0.00
19 0 0.00 0.00
Note that the sum of the (O-E)^2/E column is 8.32.  The critical number for the chi square test for this is 30.14.  This is the table look up with 19 degrees of freedom and a significance level of 0.05.

As 8.32 is not larger than 30.14, we cannot reject the null hypothesis of the chi square test.  The null hypothesis is that the model fits.  We've not proven that it is binomial, but we can say that our data is not inconsistent with the binomial distribution B(19, 0.154).
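The E column and the chi square sum can also be reproduced outside a spreadsheet. The following is a minimal sketch in Python, not the spreadsheet itself. It assumes the unrounded proportion 28/182 is used for p (the rounded value 0.154 gives nearly the same figures), and the critical number is the table value quoted above, not something the code computes.

```python
from math import comb

n_trials = 19     # M&Ms per bag, as chosen in the text
bags = 10         # number of bags in the sample
p = 28 / 182      # proportion of red M&Ms (about 0.154)

# Observed number of bags with x red M&Ms, for x = 0..19 (the O column).
observed = [1, 2, 2, 0, 4, 0, 1] + [0] * 13

# Expected number of bags with x red M&Ms under B(19, p) (the E column).
expected = [
    bags * comb(n_trials, x) * p ** x * (1 - p) ** (n_trials - x)
    for x in range(n_trials + 1)
]

# Chi square statistic: the sum of (O-E)^2/E over all rows.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

critical = 30.14  # table value for 19 degrees of freedom at 0.05

for x, (o, e) in enumerate(zip(observed, expected)):
    print(f"{x:2d} {o} {e:.2f} {(o - e) ** 2 / e:.2f}")
print(f"chi square = {chi_square:.2f}")
print("Reject the model" if chi_square > critical else "Cannot reject the model")
```

To use this with your own bags, replace the observed list, the proportion p, and the number of bags.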

Do this test with the data you collected from the M&M Activity.

M&M Hypothesis Test

This activity will make use of the data you gathered in M&M Activity.

As has been observed, there are six colors typically found in each bag of M&Ms. In the previous blog post, we tested whether the proportions of color within each bag remained consistent with differences due to no more than random variation.

Using data I had gathered, I hypothesize that each bag of M&Ms contains 18 total and that each of the colors has the following average:
Red 3
Orange 3
Yellow 2.5
Green 4
Blue 4
Brown 1.5

In order to perform this test, we use a table of the following type.

There are blanks to be filled in depending on the particular putative mean and the set level of significance.  I chose to test the green at a 0.05 level of significance. (The level of significance is chosen ahead of time so as not to affect the decision.)  This gave me the following table:


This is a small sample of size n=10 with 9 degrees of freedom, so I used the T-table* to give me a critical number of 2.262.  The number of greens from my sample was: 5, 5, 4, 3, 3, 3, 3, 4, 3, 1.  This has an average of 3.4 and a standard deviation of 1.2.  The calculation and test were completed as follows:
In this case, I could not prove that the average is not 4.  
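The test on the greens can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the usual one-sample t statistic, (sample mean - hypothesized mean) / (s / sqrt(n)), takes the critical number 2.262 from the text instead of computing it, and the variable names are my own.

```python
import math
import statistics

greens = [5, 5, 4, 3, 3, 3, 3, 4, 3, 1]   # greens per bag, from the text
hypothesized_mean = 4                      # the putative average for green
critical_t = 2.262                         # T-table value, 9 df, 0.05 level

n = len(greens)
sample_mean = statistics.mean(greens)
sample_sd = statistics.stdev(greens)       # sample standard deviation (n - 1)

# One-sample t statistic.
t_stat = (sample_mean - hypothesized_mean) / (sample_sd / math.sqrt(n))

# Two-sided test: reject only if t is farther from 0 than the critical number.
reject = abs(t_stat) > critical_t

print(f"mean = {sample_mean:.1f}, sd = {sample_sd:.1f}, t = {t_stat:.2f}")
print("Reject the hypothesized mean" if reject else "Cannot reject the hypothesized mean")
```

The t statistic comes out at about -1.62, which is inside the interval from -2.262 to 2.262, so the hypothesized mean of 4 is not rejected.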

Follow these procedures yourself and test at a 0.10 level of significance. You will have to look up a new T-value, but as your sample size is n=10, it will still have 9 degrees of freedom. Also test whether the number of M&Ms per bag is 18.

*(In order to use the T-statistic, we have to assume that the data is normally distributed. This can be justified, but it is beyond the scope of the current activity.)

Friday, January 16, 2015

M&M Chi-square

The following activity will make use of data gathered in the M&M Activity.

Every regular bag of plain M&Ms contains six colors: red, orange, yellow, green, blue, and brown.  In handling numerous samples of M&Ms, the question arises as to whether there is always about the same proportion of any given color, i.e., is there always about the same proportion of green?

This activity will test that using the chi-square.  We will not be able to prove that the proportion is the same; instead we will be able to determine one of two things:

  1. The proportion is NOT the same.
  2. The amount per bag is consistent with being the same proportion.
I will conduct my test using green M&Ms as they are reputed to have mystical properties, the description of which is beyond the scope of our investigation.

I drew a new sample of ten bags of M&Ms which were not connected to the previous activity. The data was as below:


The first column is simply the bag number and does not figure into calculations.  The second column is the total number of M&Ms per bag. The third column is the number of green M&Ms in a given bag.

We will be conducting a chi-square test.  In the chi-square, the null hypothesis is that all of the proportions are the same.  So it cannot be proven that the proportions are the same by the use of this test.  However, it can be proven that they are not the same.  We will compute a chi-square test statistic in the course of this, and if that is too large we will be forced to decide the proportions are not the same.  Otherwise, we will say the data is consistent with having the same proportion per bag.

To do the calculations, I put my data into an Excel spreadsheet with columns as labeled below:
Bag Number N O
1 17 3
2 17 2
3 19 5
4 18 6
5 19 6
6 18 1
7 18 2
8 18 6
9 19 2
10 19 6
182 39
The first column contains the bag number, the second column, headed by N, contains the total number of M&Ms per bag.  The third column, headed by O, contains the number of green M&Ms per bag. The last entry in the N and the O columns is the total of that column.

The null hypothesis of the chi-square test allows us to assume all the proportions are the same in each sample and, therefore, to pool data.  So pHat, the proportion of M&Ms that are green, is 39/182, i.e. pHat is approximately 0.21.  Since the first bag contains 17 M&Ms we would expect it to contain 17*pHat=17*(39/182)=3.64 green M&Ms.  (The unrounded value of pHat is used here; 17*0.21 would give 3.57.)  We call this the expected number and denote it by E.  We do this calculation for every bag and obtain the following table:

Bag Number N O E
1 17 3 3.64
2 17 2 3.64
3 19 5 4.07
4 18 6 3.86
5 19 6 4.07
6 18 1 3.86
7 18 2 3.86
8 18 6 3.86
9 19 2 4.07
10 19 6 4.07
182 39

Note that the expectations in the E column differ from the observed reality in the O column.  The question is whether this is too much to be accounted for by mere random variation.  This requires a measure.  The one we use is (O-E)^2/E.  We take the difference of O and E, square it, and then divide that by the expectation.  The division by E allows us to deal with variation that, while perhaps large in absolute terms, is small relative to the expected value.  The completed table appears as below:
Bag Number N O E (O-E)^2/E
1 17 3 3.64 0.11
2 17 2 3.64 0.74
3 19 5 4.07 0.21
4 18 6 3.86 1.19
5 19 6 4.07 0.91
6 18 1 3.86 2.12
7 18 2 3.86 0.89
8 18 6 3.86 1.19
9 19 2 4.07 1.05
10 19 6 4.07 0.91
182 39 9.34


The final entry in the (O-E)^2/E column is the total amount of error for observations versus expectations.  In this case that amount is 9.34.  The question is whether this is too much to be plausible.  To decide this, we need to look up the critical number on the chi-square table.  As there were 10 samples, this will require 9 degrees of freedom. For DF=9 and a significance level of 0.05, this number is 16.92.  As 9.34 is not larger than 16.92, we cannot reject the null hypothesis.  We must therefore conclude that the data is consistent with there being approximately the same proportion of green M&Ms in each bag.
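If you would like to check your spreadsheet against something, the whole calculation can be sketched in Python. This is a minimal sketch using the N and O columns from the tables above; the variable names are my own, and the critical number 16.92 is taken from the chi-square table, not computed.

```python
# N column: total M&Ms per bag.  O column: green M&Ms per bag.
totals = [17, 17, 19, 18, 19, 18, 18, 18, 19, 19]
greens = [3, 2, 5, 6, 6, 1, 2, 6, 2, 6]

# Pooled proportion of green, allowed under the null hypothesis.
p_hat = sum(greens) / sum(totals)          # 39/182, about 0.21

# E column: expected greens in each bag.
expected = [n * p_hat for n in totals]

# (O-E)^2/E column and its total, the chi-square test statistic.
contributions = [(o - e) ** 2 / e for o, e in zip(greens, expected)]
chi_square = sum(contributions)

critical = 16.92                           # chi-square table, DF = 9

for bag, (n, o, e, c) in enumerate(zip(totals, greens, expected, contributions), start=1):
    print(f"{bag:2d} {n} {o} {e:.2f} {c:.2f}")
print(f"chi-square = {chi_square:.2f}")
print("Proportions differ" if chi_square > critical else "Consistent with the same proportion")
```

Swapping in the counts for the color assigned to your group only requires changing the two lists at the top.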

Your assignment is

  1. to investigate this hypothesis with the particular color of M&M assigned to your group.  
  2. document this as well as possible.
A video on how to do this on Excel can be found here. 

Tuesday, September 30, 2014

Not what you do, but how you do it

By Bobby Neal Winters
Before you read any further, you should know that I do not support the Common Core Mathematics.  My reasons for not supporting it have nothing to do with the specifics of the curriculum.  Instead, my lack of support grows from general principles: I believe that teachers should be professionals and allowed to use their professional judgement regarding how to teach.
Students in different parts of the country have different expectations, different levels of preparations, different goals and aspirations. Professional teachers, along with the school district and parents, should be allowed to use teaching techniques appropriate for those particular circumstances.
This having been said, I find myself disturbed by the rhetoric being used to attack the Common Core.  A particular method of attack is being used which I believe has harmful consequences to the level of the debate.
Consider the following problem in subtraction: 342-97. One method to solve this coming from the Common Core is to perform the following calculation:
97+ 3 = 100
100+200=300
300+42=342
Since 3+200+42= 245, it follows that 342-97=245.  There are quicker ways to do this.  There is the standard method that most of us learned that yields that answer more quickly.  I don’t think anyone would argue that.  That having been said, in my opinion, the value of this technique lies elsewhere.
For example, if you get something that costs $15.23 and give the cashier a $20 bill, the cashier will give you 2 cents to make $15.25, then 3 quarters to make $16; then $4 to make $20. It is the same process.
What does this process do?  It allows us to subtract by doing addition.  Someone who knew only the algorithms for addition could pick up this technique and be able to subtract.  They would be doing it more slowly than someone who knows the standard algorithm, but they would be learning about what the concept of subtraction means as it relates to addition.  They are also learning how to implement a rather simple algorithm that makes good use of the decimal system of numbers.
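To make the point that this really is a simple algorithm, here is one way it might be written down in Python. This is only my sketch of the three steps in the 342-97 example (up to the next hundred, then whole hundreds, then whatever is left); the function name is hypothetical, and it is not taken from any Common Core material.

```python
def subtract_by_counting_up(minuend, subtrahend):
    """Return the amounts added while counting up from subtrahend to minuend."""
    added = []
    current = subtrahend

    # Step 1: count up to the next hundred, if that does not overshoot.
    to_next_hundred = (-current) % 100
    if to_next_hundred and current + to_next_hundred <= minuend:
        added.append(to_next_hundred)
        current += to_next_hundred

    # Step 2: add as many whole hundreds as fit.
    hundreds = (minuend - current) // 100 * 100
    if hundreds:
        added.append(hundreds)
        current += hundreds

    # Step 3: add whatever is left to land on the minuend.
    remainder = minuend - current
    if remainder:
        added.append(remainder)

    return added


steps = subtract_by_counting_up(342, 97)
print(steps, "so 342 - 97 =", sum(steps))
```

Only addition is used to build the answer; the difference is the sum of the amounts added along the way.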
Let’s look at another method that comes in for criticism.  Consider the addition problem 8 + 5.  Think of it as 8+2+3=10+3=13.  Yes, if you know your addition facts, you can just jump to the 8+5=13.  But look at what this technique does.  It shows us that we can manipulate our numbers.  The 5 isn’t just 5; it is 2+3 and that 2 can be very handy in getting the 8 up to 10.
If we go back to our first problem, we can say 342-97= 342-100+3= 242+3=245, to get a completely different way of doing the problem.
These techniques breed familiarity with numbers and provide a gateway for growth.
As a math teacher of many years, I do work with numbers, but not necessarily in the way one might expect.  When I give an exam, I put 100 points worth of problems on it.  I have a class of 40 or more students most of the time and over the years I’ve developed a system for grading these exams.  I grade one page at a time and at the bottom lefthand corner of each page I write the number missed on the page.  
When I am finished with all of the tests I will go through them and calculate the total points.  If a student has missed 14, 8, and 12 points on the first, second, and third pages, respectively, I will proceed with the following calculation:
100-10=90, 90-4=86, 86-8=78, 78-10=68, 68-2=66.  Sometimes, if I am looking ahead, I would combine the last few steps by recognizing that 8+12=20 and calculate that 86-20=66, and often, in a calculation like 86-8, I will hiccup on the usual subtraction fact and, in my head, do 86-8=86-6-2=78.
I do this all in my head without writing a single step down and I do it for 40 to 50 papers at a run.  It rarely takes me more than a few minutes.  The two techniques illustrated above lead to this sort of mental manipulation of numbers.  
Yes, the standard subtraction algorithm is an incredibly useful technique. It should be taught and mastered.  However, these other techniques, which I’ve seen used consistently as examples of why the Common Core is criminally stupid, are useful techniques when taught correctly by teachers who are professionals and when the techniques are given the correct emphasis.
With math, with teaching, and with rhetoric, it’s not only important what you do, but how you do it.   Of these three, the math is the easy part.
In math, we know a variety of techniques and the secret is which one to use at what time.  It is all between us and whatever problem we are working on.
Teaching is more difficult.  We have our techniques, but when we use them, we by necessity have to include the students.  Each student has different abilities, different preparation, and has a different level of support at home.  Rolling down a one-size-fits-all solution and not allowing teachers to use their best judgment is going to cause trouble.
But then we get to the rhetoric.  
One source I read criticized the particular techniques because they made students cry. Crying is not a bad thing. I cried in long division to the point my mother just did my homework for me.  I cried even worse in trigonometry when she couldn’t help me at all.  Learning is often a struggle.  This appeal to the emotions is effective rhetoric, but it seems to assume we should never stretch ourselves in learning.
I can understand why these two particular techniques were attacked.  They appear to make difficult what can be done easily another way. Indeed, I was taken in myself at first until I sat down with pencil and paper and worked through the first technique I cited.  It looked stupid to me, someone who’s been teaching math for over 30 years, until I worked it through once.
It is easier to attack a technical package like this by pulling out the complicated looking bits and making sport of them out of context to a non-technical audience.  It is effective; it sows a lot of confusion; it’s harmful to the level of debate.
It’s not just the result that is important. It’s how you do it.  This is not a healthy way to argue against the Common Core.