If there is a difference, what is it?
Last month, I described a simple problem to determine which gear material resulted in longer wear. We reviewed the extremely powerful technique called Analysis of Variance (ANOVA) and found a statistically significant and important effect on the average wear due to gear material. I also promised to tell you about an infrequently used next step that would make you a lot of money. That is what we are going to talk about this month.
(In the event that you missed last month’s column, due to being hit by a meteor or some other equally valid excuse, click here for the research description and data.)
We showed with our initial ANOVA that there was a significant difference in wear for the five different materials, and using the statistic ω², we showed that the differences were large enough to be considered important as compared to the total variation. But the ANOVA only tells us that there’s a difference somewhere; it doesn’t tell us where the differences are.
Sometimes that may be enough. Let’s say that all the different materials we’re testing cost the same for raw material, processing, and use. If higher is better, then we’re assured that the material with the highest average is significantly different from at least one of the other materials, so we would simply choose it.
But what if the total costs aren’t the same? Although material 4 is the highest on average, if it costs twice as much as material 2, and material 4 and material 2 are statistically indistinguishable, maybe we can use material 2 instead and capture the same benefit while reducing costs.
Let’s extend our example and say that the remediation cost of a unit is $1. This remediation cost is where the specification and the Taguchi loss function (TLF) intersect (click here to read my article discussing the TLF), and it covers rework, scrap, customer complaints, returns, and warranty costs. Our lower specification limit is 38, which we have trouble meeting consistently right now (our C_{pk} = 0.629). We don’t have an upper specification, so no cost is incurred in that direction, but it would be ideal to achieve an average of 52 if possible, because that would put us above all our competitors. We make 200,000 units per month. Material 5 is the current material, and materials 1 through 4 are the ones we’re considering changing over to. They’re provided by four different vendors, and some have different purchase prices:
Material      Purchase price
Material 1    $0.0375/unit
Material 2    $0.0625/unit
Material 3    $0.0625/unit
Material 4    $0.0729/unit
Material 5    $0.0417/unit
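To put those unit prices in monthly terms, here is a minimal sketch that multiplies each one by the 200,000 units per month mentioned above. The function and constant names are mine, purely for illustration, and the sketch covers purchase price only; it says nothing yet about remediation cost or wear.

```python
# Purchase prices per unit, copied from the price list above.
PRICE_PER_UNIT = {1: 0.0375, 2: 0.0625, 3: 0.0625, 4: 0.0729, 5: 0.0417}
UNITS_PER_MONTH = 200_000   # monthly volume stated in the text
CURRENT_MATERIAL = 5        # the material in use today


def monthly_purchase_cost(material: int) -> float:
    """Purchase cost per month for one material, in dollars."""
    return round(PRICE_PER_UNIT[material] * UNITS_PER_MONTH, 2)


def monthly_delta_vs_current(material: int) -> float:
    """Extra (positive) or saved (negative) purchase dollars per month."""
    return round(
        monthly_purchase_cost(material) - monthly_purchase_cost(CURRENT_MATERIAL), 2
    )


if __name__ == "__main__":
    for m in sorted(PRICE_PER_UNIT):
        print(
            f"Material {m}: ${monthly_purchase_cost(m):>9,.2f}/month "
            f"({monthly_delta_vs_current(m):+,.2f} vs. current)"
        )
```

On these figures, material 4 adds about $6,240 per month in purchase cost over material 5, while material 1 saves about $840. Whether either number matters depends on what the wear data say.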
To make a decision about which one or ones are most economical, we need to have a way to determine if certain settings can be shown to be statistically different. Fortunately, there are a number of ways to do this type of post hoc analysis. Which one you choose depends on whether the variances for the different settings are equal, and on your tolerance for Type I error. (Remember that Type I error is when you conclude that there is a difference when in fact there is not.)
All of the post hoc tests are based on the t-test, but with controls for the Type I error inflation I mentioned last month.
You might be wondering why in the world we didn’t just do these rather than an ANOVA. Well, as is typical in statistics (and in real life), there ain’t no such thing as a free lunch. In controlling for Type I error we lose power, which is the ability to detect a change if it’s there. So we do the ANOVA first, which is a high-power test to see if there’s a change, and then come back with a post hoc test to determine where the differences might be. (This means that sometimes you can have a situation where your ANOVA indicates a significant difference, but none of the post hocs do. In this case, you can either gather more data or just conclude that at this point we only know for sure that the extremes are different.)
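To see why uncontrolled t-tests inflate the Type I error, here is a rough sketch of the familywise error rate for every pairwise comparison among five materials. It assumes an α of 0.05 per test and treats the tests as independent, which pairwise comparisons on shared groups are not, so read the result as a ballpark figure rather than an exact one.

```python
from math import comb


def familywise_error_rate(groups: int, alpha: float = 0.05) -> float:
    """Chance of at least one false alarm across all pairwise tests.

    Assumes each test is run at `alpha` and that the tests are independent
    (a simplifying assumption, not something the column's data guarantee).
    """
    pairs = comb(groups, 2)              # number of distinct pairs of groups
    return 1.0 - (1.0 - alpha) ** pairs


if __name__ == "__main__":
    print(comb(5, 2))                    # pairs among five materials
    print(round(familywise_error_rate(5), 3))
```

With five materials there are 10 pairs, and the chance of at least one false difference comes out at roughly 40 percent instead of 5 percent. That is the inflation the post hoc procedures are built to rein in.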
The key to performing these tests correctly is to do all the tests you need to do, but not one more. To figure out what we want to test, let’s take a look at the data grouped by material.
Figure 1  Box and Whisker Plot of Experimental Results 
The ANOVA indicated significant differences, so we know for sure that material 5 (our current material) is at least different from material 4. But because material 4 is also the most expensive of the options, we don’t want to jump right on that unless we can show a good cost basis for it.
Now I have to give away what is going to happen next month, because to choose the right post hoc test I need to know if the materials all have the same variability. We’re going to find that the dispersions for the materials aren’t all the same. Now forget I said that.
We’ll use the Games-Howell procedure to control Type I inflation with unequal dispersion. This tests each pair of settings to determine if they’re different. The table below sums up what we find. For example, materials 1 and 2 are judged to be equal, with a p-value of 0.103.
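For the curious, here is a minimal sketch of what a Games-Howell comparison computes for one pair of groups, as I understand the procedure: a Welch-style t statistic (no pooled variance), Welch-Satterthwaite degrees of freedom, and a rescaling by √2 so the result can be compared against the studentized range distribution. The readings below are hypothetical and are not the data from last month’s column. Turning q into a p-value needs the studentized range distribution with the number of groups and these degrees of freedom (recent SciPy versions offer `scipy.stats.studentized_range`, if you have it), which this sketch leaves out.

```python
from math import sqrt
from statistics import mean, variance


def games_howell_statistic(a, b):
    """Return (q, df) for one pair of groups.

    q  : Welch t statistic scaled by sqrt(2), for comparison with a
         studentized range critical value
    df : Welch-Satterthwaite degrees of freedom
    """
    va_n = variance(a) / len(a)   # squared standard error of group a's mean
    vb_n = variance(b) / len(b)   # squared standard error of group b's mean
    t = abs(mean(a) - mean(b)) / sqrt(va_n + vb_n)
    df = (va_n + vb_n) ** 2 / (
        va_n ** 2 / (len(a) - 1) + vb_n ** 2 / (len(b) - 1)
    )
    return t * sqrt(2), df


# Hypothetical wear readings for illustration only, NOT the column's data.
group_a = [40.0, 42.0, 41.0, 43.0]
group_b = [44.0, 47.0, 45.0, 46.0]
q, df = games_howell_statistic(group_a, group_b)

if __name__ == "__main__":
    print(f"q = {q:.3f}, df = {df:.1f}")
```

Because each pair gets its own standard error and its own degrees of freedom, a group with wild variability only penalizes the comparisons it takes part in.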
Table 1  Output from PHAST™
Group        Mean      Material 1   Material 2   Material 3   Material 4
Material 1   41.625
Material 2   45        Equal
Material 3   44.5      Unequal      Equal
Material 4   52.25     Unequal      Unequal      Unequal
Material 5   40.125    Equal        Unequal      Unequal      Unequal
Another way to sum this up is to show a table of homogeneous subsets:
Material 
S1 
S2 
S3 
S4 
5 
40.125 

1 
41.625 
41.625 

3 
44.5 
44.5 

2 
45 
45 

4 
52.25 
Or, using parameter notation: μ_{5} = μ_{1}; μ_{1} = μ_{2}; μ_{2} = μ_{3} > μ_{5}; and μ_{5}, μ_{1}, μ_{3}, μ_{2} < μ_{4}. So material 4 is definitively the highest in wear resistance, and materials 2 and 3 are also higher on average than our current gear material. Material 1, although cheaper, doesn’t show any difference from our current gear.
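If you want to query these verdicts rather than squint at them, here is a small sketch that encodes the means and the Equal/Unequal results from Table 1 and lists which materials are both statistically different from a reference material and higher on average. The names `differs` and `better_than` are mine, made up for illustration.

```python
# Group means from Table 1.
MEANS = {1: 41.625, 2: 45.0, 3: 44.5, 4: 52.25, 5: 40.125}

# Pairs that Table 1 marks "Unequal"; every other pair is "Equal".
UNEQUAL = {
    frozenset(pair)
    for pair in [(1, 3), (1, 4), (2, 4), (3, 4), (2, 5), (3, 5), (4, 5)]
}


def differs(a: int, b: int) -> bool:
    """True if Table 1 marks the two materials as statistically unequal."""
    return frozenset((a, b)) in UNEQUAL


def better_than(reference: int) -> list:
    """Materials that differ from `reference` and have a higher mean."""
    return sorted(
        m
        for m in MEANS
        if m != reference and differs(m, reference) and MEANS[m] > MEANS[reference]
    )


if __name__ == "__main__":
    print(better_than(5))   # candidates that beat the current material
    print(differs(1, 5))    # does the cheap material differ from the current one?
```

Asking for the materials better than material 5 returns 2, 3, and 4, and material 1 comes back as no different from the current gear, which matches the summary above.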
But this still doesn’t answer which is the most economical material to select. To answer that, we have to know if there are significant differences in variation amongst the materials. It would be terrible to choose a material that has higher strength on average, but has so much variability that we end up failing a large percentage. Remember, the TLF shows us that we lose money due to variation even within the specification.
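To make the point about variability concrete, here is a rough sketch that estimates the fraction of units falling below the lower specification limit of 38. It rests on assumptions of mine that are not in the data: wear is normally distributed, the C_{pk} is driven entirely by the lower limit (so C_{pk} = (mean − 38) / 3σ), and every unit below the limit costs the $1 remediation charge. It ignores the TLF losses inside the specification, so it understates the real cost. The standard deviations in the usage lines are hypothetical.

```python
from math import erfc, sqrt

LSL = 38.0                 # lower specification limit from the text
UNITS_PER_MONTH = 200_000  # monthly volume from the text
REMEDIATION_COST = 1.00    # dollars per failing unit, from the text


def normal_cdf(z: float) -> float:
    """Standard normal cumulative probability (normality is an assumption)."""
    return 0.5 * erfc(-z / sqrt(2))


def fraction_below_lsl_from_cpk(cpk: float) -> float:
    """Fraction under the LSL when Cpk = (mean - LSL) / (3 * sd)."""
    return normal_cdf(-3.0 * cpk)


def fraction_below_lsl(mean: float, sd: float) -> float:
    """Fraction under the LSL for a given mean and standard deviation."""
    return normal_cdf((LSL - mean) / sd)


def monthly_remediation_cost(fraction: float) -> float:
    """Dollars per month spent on units that miss the specification."""
    return fraction * UNITS_PER_MONTH * REMEDIATION_COST


if __name__ == "__main__":
    current = fraction_below_lsl_from_cpk(0.629)
    print(f"Current process: {current:.1%} below spec, "
          f"${monthly_remediation_cost(current):,.0f}/month")
    # Same hypothetical mean of 45, two hypothetical standard deviations.
    for sd in (2.0, 5.0):
        print(f"mean 45, sd {sd}: {fraction_below_lsl(45.0, sd):.2%} below spec")
```

Under those assumptions, a C_{pk} of 0.629 works out to roughly 3 percent of units below the limit, or something on the order of 6,000 units a month. The two hypothetical cases share an average of 45, yet the one with the larger spread fails about 8 percent of units while the tighter one fails well under 0.1 percent.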
Will we end up choosing the high-wear material that is more expensive, or will it be more economical to choose one of the materials that are higher than our current material, but not quite as high as material 4? Or how about material 1? Our procurement department likes that one because it’s cheaper than anything else. Maybe it’s the most economical way to go. And will material 5 finally admit to material 1 that it’s pregnant with material 3’s baby?
Tune in next month for the next issue of “As the Post Hoc Turns.” I bet you can hardly wait!
PHAST™ is a free program to help you perform post hocs correctly. You can download it from ROI’s website.
But I could be wrong.