Homoscedasticity
 Created: 20 March 2008
 Written by Steven Ouellette
For the last couple of columns, we have been analyzing a simple experiment where we look for the best choice for a new gear material. We have used analysis of variance (ANOVA) to detect a difference in the mean wear for the four different materials. In the last column, we performed a post-hoc test to determine which means were actually different from one another. However, to find the best solution, we need to know true costs, and to get those I need to know the variability associated with the different materials. At the end of the article, I’ll give you a chance to show off: the first one to email me the correct cost analysis will get bragging rights and their name (or reasonably tasteful nom de plume if you prefer) in print.
If you somehow missed the fun because you were watching the results from your party primary trickle in, click here to go to the first article setting up the problem and the means analysis, and here to go to the second article discussing posthoc analysis and giving the cost data.
There are actually a number of reasons why I need to test experimental results for differences in variances as well as the means, but in my experience very few Black Belts do this. (Fewer still do, or even learn about, post-hoc tests.) The first is to make sure that I meet the assumptions of ANOVA. In practical terms, however, ANOVA is robust to differences in variance if the sample sizes are equal, so often we can design around this. Second, as I hinted last column, we need to know if the variances are equal, as this is required to choose the right post-hoc test. But most important, in real-life business applications it’s absolutely crucial to know the effect, if any, on variability. I have seen experimental results that showed a large improvement in some critical characteristic on average, but that greatly increased the variability. Had that setting been chosen, the average would indeed have been improved. On the other hand (assuming that your company doesn’t sell average customer experience but individual products or service instances), it also could have increased the number of customers getting a nonconforming product or service. In addition, should they buy from you again, they would have a tough time predicting what they’re going to get due to the higher variation.
Customers create their image of our quality from the variation they experience around the product or service targets that are important to them.
Here is what I mean by that. If I buy a television from you, I have a number of critical performance targets in my head. If I get these targets and feel I paid a fair price for them, my impression of your quality is high (even if you actually have rotten quality most of the time). In this case, I might be a returning customer, which in business terms means future revenue without spending further marketing dollars. If you didn’t meet my targets at a fair price, my impression of your quality is low (even if on average you hit those targets). Let’s say that I got a great TV the first time and I buy from you again. If that second TV doesn’t meet my targets, my impression of your quality is low, even though, and perhaps especially because, my first TV was great.
This is part of what Taguchi’s loss function is telling us: that there’s incremental cost to process, product, and service variation even within specification.
So, if I have convinced you that we need to understand the variation of our gears, how do we go about doing that?
There are a number of ways of analyzing homoscedasticity. If you can work that word into casual conversation, it will get you both the shining-eyed admiration of your peers and a promotion. It just means that the variances are all equal.
What I recommend for real-life data is something known as Levene’s test. It’s easy to understand and perform, and is somewhat robust to departures from normality. (There actually is a better choice for non-normal data that I want you to read about once the authors publish it.) All we do is calculate the absolute value of the deviation of each observation from the average for its setting, then run the same ANOVA on those absolute deviations from the average, known as ADAs. Many software packages will output this test along with your ANOVA if you ask for it.
MVPstats can automatically report the p-value for Levene’s test, but even better for instruction purposes, it can generate the ADAs themselves. Once I do that, I then run the ANOVA on the ADAs for the different materials and get this output:
We’re actually measuring dispersion here and using it to tell us about the variance, so our null hypothesis was that the dispersions are equal. Since p is below our Type I error rate (α = 0.05), we reject that hypothesis and conclude that there are in fact significant differences in the dispersion for the different materials. And in fact, those differences account for a good 22 percent of the variability in the ADAs.
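The procedure described above (absolute deviations from each setting's average, then a one-way ANOVA on them) can be sketched in a few lines of Python. This is only an illustration: the wear numbers and group names are made up and are not the gear data, and the helper names `absolute_deviations` and `one_way_anova` are hypothetical. The sketch reports the F statistic and the share of ADA variability explained by the groups (the kind of figure quoted as 22 percent above). It does not compute the p-value, which would come from an F distribution with (groups − 1, observations − groups) degrees of freedom.

```python
from statistics import mean

def absolute_deviations(groups):
    """Replace each observation by its absolute deviation from its own group average (the ADAs)."""
    return {name: [abs(x - mean(xs)) for x in xs] for name, xs in groups.items()}

def one_way_anova(groups):
    """Return (F statistic, fraction of variability explained) for a dict of lists."""
    all_values = [x for xs in groups.values() for x in xs]
    grand_mean = mean(all_values)
    # Variability between group averages and variability within groups.
    ss_between = sum(len(xs) * (mean(xs) - grand_mean) ** 2 for xs in groups.values())
    ss_within = sum((x - mean(xs)) ** 2 for xs in groups.values() for x in xs)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    r_squared = ss_between / (ss_between + ss_within)
    return f_stat, r_squared

# Hypothetical wear measurements, NOT the data from the gear experiment.
wear = {
    "A": [10.0, 12.0, 11.0, 13.0],
    "B": [10.0, 18.0, 6.0, 14.0],
    "C": [11.0, 11.5, 12.0, 11.5],
}

adas = absolute_deviations(wear)
f_stat, r_squared = one_way_anova(adas)  # ANOVA on the ADAs is Levene's test
print(f"Levene F = {f_stat:.2f}, share of ADA variability explained = {r_squared:.0%}")
```

If SciPy is available, `scipy.stats.levene(..., center='mean')` should give the same statistic along with a p-value. Its default, `center='median'`, is the Brown-Forsythe variant rather than the mean-based test described here.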
Uh-oh. This complicates things. If all the variances had been equal, then any improvement in the average wear was an unalloyed good (pun intended). But what if my highest wear material is also the highest variability one? I might be able to increase my average wear while at the same time getting more parts that fall below my lower specification limit. Not good. This would be as good as career-limiting, or as not playing golf.
So how do I figure out which one or ones are different from the others?
What’s that?
You said post-hoc analysis?
And here I thought you had gone to sleep. You’re absolutely correct.
Due to the nature of these data, we decide on the type of post-hoc test based on whether the sample sizes are equal or not. Because they’re equal, let’s go ahead and perform the Tukey post-hoc test in PHAST-TM. (Remember, you can download and use PHAST-TM for free.)
Group Name  Mean   Material 1     Material 2      Material 3     Material 4
MATERIAL 1  1.531
MATERIAL 2  2.5    EQUAL (0.243)
MATERIAL 3  1      EQUAL (0.779)  UNEQUAL (0.02)
MATERIAL 4  1.563  EQUAL (1)      EQUAL (0.273)   EQUAL (0.74)
MATERIAL 5  0.875  EQUAL (0.619)  UNEQUAL (0.01)  EQUAL (0.999)  EQUAL (0.575)
The numbers in parentheses are the p-values of the test between the group in the row and the group in the column.
Or, as before, we can present the results as a table of homogeneous subsets:
MATERIAL  S1     S2
5         0.875
3         1
1         1.531  1.531
4         1.563  1.563
2                2.5
(The numbers in these tables are the average of the ADAs, not the variances.)
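To see how the pairwise results collapse into the homogeneous subsets, here is a small sketch that takes the mean ADAs and p-values reported in the tables above and rebuilds the subsets. It is not how PHAST-TM works internally. It assumes a simple rule (a subset is a run of materials, ordered by mean ADA, in which no pair differs at α = 0.05), and the function names are hypothetical.

```python
ALPHA = 0.05

# Mean ADA per material, copied from the tables above.
mean_ada = {1: 1.531, 2: 2.5, 3: 1.0, 4: 1.563, 5: 0.875}

# Pairwise Tukey p-values, copied from the post-hoc table above.
p_values = {
    (1, 2): 0.243, (1, 3): 0.779, (2, 3): 0.02,
    (1, 4): 1.0, (2, 4): 0.273, (3, 4): 0.74,
    (1, 5): 0.619, (2, 5): 0.01, (3, 5): 0.999, (4, 5): 0.575,
}

def p_value(a, b):
    """Look up the p-value for a pair of materials regardless of order."""
    return p_values[(min(a, b), max(a, b))]

def homogeneous_subsets(means, alpha=ALPHA):
    """Group materials into maximal runs (ordered by mean) with no significant pair."""
    ordered = sorted(means, key=means.get)
    subsets = []
    for start in range(len(ordered)):
        run = [ordered[start]]
        for candidate in ordered[start + 1:]:
            # Extend the run only while the candidate is "equal" to every member.
            if all(p_value(candidate, member) > alpha for member in run):
                run.append(candidate)
            else:
                break
        # Keep the run only if it is not already contained in an earlier subset.
        if not any(set(run) <= set(existing) for existing in subsets):
            subsets.append(run)
    return subsets

subsets = homogeneous_subsets(mean_ada)
print(subsets)  # materials 5, 3, 1, 4 form S1; materials 1, 4, 2 form S2
```

Materials 1 and 4 land in both subsets, which is why they cannot be distinguished from either the low-dispersion or the high-dispersion group.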
Using parameter notation and extending the conclusions for the dispersion to the variance, we conclude that σ^{2}_{5} = σ^{2}_{3} = σ^{2}_{1} = σ^{2}_{4} ; σ^{2}_{1} = σ^{2}_{4} = σ^{2}_{2} ; and for sure that σ^{2}_{5} = σ^{2}_{3} < σ^{2}_{2}.
So where do I go from here?
Well, I have to take the information we have so far and use that to estimate the true costs for each material. Remember that most of those materials have a different price from the vendor, and now I know that not only do they have different averages, they also have different variances. How will that affect my decision? I’ll give you two hints. First, you should come up with a point estimate for the average and the variance for each material. There are different ways to do this, but I’ll show you how I pool next month. Then you need to incorporate the losses for being off target for each of the materials, as predicted by the Taguchi loss function. Once I have that, all I have to do is compare the total costs and losses to my current product (Material 5) and choose the best one and estimate the savings.
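As a rough sketch of the mechanics only (not the answer to the gear problem), the comparison can be set up as below. It assumes the common quadratic "nominal-is-best" form of the Taguchi loss, L(y) = k(y − T)², whose average over a process with mean μ and variance σ² is k(σ² + (μ − T)²). The target, the loss coefficient k, the prices, and the material names are all hypothetical placeholders and are not taken from the cost data in the earlier article.

```python
def expected_taguchi_loss(mean, variance, target, k):
    """Average quadratic loss per unit: k * (variance + (mean - target)**2)."""
    return k * (variance + (mean - target) ** 2)

def total_cost_per_unit(price, mean, variance, target, k):
    """Purchase price plus the expected loss from being off target and variable."""
    return price + expected_taguchi_loss(mean, variance, target, k)

# Hypothetical inputs for illustration only, NOT the article's gear data.
TARGET = 10.0   # assumed ideal value of the characteristic
K = 2.0         # assumed loss coefficient (cost per squared unit off target)
candidates = {
    "X": {"price": 5.0, "mean": 9.0, "variance": 1.0},
    "Y": {"price": 6.0, "mean": 10.0, "variance": 0.25},
}

costs = {
    name: total_cost_per_unit(target=TARGET, k=K, **estimates)
    for name, estimates in candidates.items()
}
best = min(costs, key=costs.get)
print(costs, "-> lowest total cost:", best)
```

In this made-up case the more expensive material comes out ahead because it is on target and less variable, which is exactly the kind of trade-off the variance analysis is meant to expose.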
So where do you go from here?
Hey, I have already done the heavy lifting for you and given you all the information you need, so it ought to be easy for you to figure out which is the ideal material and why. Is it Material 1? After all, it’s cheaper than any of the others. Maybe the losses due to being off target are small enough that the price break is a net benefit. Or maybe it is Material 4: it’s closest to our ideal target, but on the other hand it’s the most expensive and has some level of variation associated with it. How about Material 3? It seems to have better wear resistance than our current product, low variability, and is in the middle on cost.
So do the analysis and send in what you think the optimum is and how you decided, along with an estimated annual cost savings, if any. To do that, click on the feedback link at the bottom of the article. Don’t fear commitment! Either you are right, and you get kudos, or you are not, and you get to learn something to blow away your coworkers. The first one to send in the correct answer (which one to choose and the annual cost savings) will get credit in my next article, which in turn will probably result in a lead role in a big-budget Hollywood movie and the type of empty, but wealthy, life we all pretend to loathe.
But I could be wrong.
Special thanks again to MVPstats and PHAST-TM for making the analysis easy.