The chart isn't strange; it's just misunderstood.

Over the past couple of articles, we have explored how an incomplete understanding of how SPC limits are calculated can lead to control charts that look strange.  But using some of the things I mentioned, hopefully you can see that these “strange” control charts actually reveal quite interesting information about what is going on in the process (and what to do about it).  In the last article I left you with a weird-looking control chart to see if you could figure out what was going on.  Instead of throwing out the chart and concluding that “SPC doesn’t work here,” let’s take a look at it and see what we could have learned about the process.

Here is the X-bar and R chart similar to the one I left you with last time: 
Figure 1 - A Strange X-bar and R Chart

I am getting an out-of-control signal, indicated by the red points on the means plot: the means are staying in the center third of the chart (within ± 1 standard error).  And yes, it is out of control, since we don’t expect to see a long run of means that close to the centerline; in a stable process only about two-thirds of the means should fall in that zone.  As we talked about last time, a neat thing about control charts for continuous data like these is that the limits on the means chart are calculated from the within-sample dispersion (the average range in this case), not from the variation of the means (or even the raw data) themselves.  So here the dispersion of the means is much less than the ranges predict.
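To make that calculation concrete, here is a minimal Python sketch of how the limits on the means chart come from the average range, along with a check for means that hug the centerline.  The constant A2 = 0.577 is the standard tabled value for samples of five.  The run length of 15 is the commonly used rule for this signal and is my assumption here, since chart software may be configured differently.  The function names and the thickness numbers are hypothetical, made up purely for illustration.

```python
from statistics import mean

# Standard tabled constant for samples of size 5: limits = grand mean +/- A2 * R-bar.
A2_N5 = 0.577


def xbar_limits(subgroups, a2=A2_N5):
    """Return (LCL, centerline, UCL) for the means chart, from the average range."""
    means = [mean(s) for s in subgroups]
    ranges = [max(s) - min(s) for s in subgroups]
    center = mean(means)
    r_bar = mean(ranges)
    return center - a2 * r_bar, center, center + a2 * r_bar


def hugging_centerline(subgroups, run_length=15, a2=A2_N5):
    """True if run_length consecutive means sit in the center third of the chart.

    run_length=15 is an assumed (commonly used) setting, not taken from the article.
    """
    lcl, center, ucl = xbar_limits(subgroups, a2)
    one_sigma = (ucl - center) / 3  # one standard error of the mean
    run = 0
    for s in subgroups:
        run = run + 1 if abs(mean(s) - center) < one_sigma else 0
        if run >= run_length:
            return True
    return False


# Hypothetical data: 20 hourly samples of 5 thickness readings each.
samples = [[9.0, 9.5, 10.5, 9.5, 9.0] for _ in range(20)]
print(xbar_limits(samples))
print(hugging_centerline(samples))
```

Notice that the variation of the means never enters `xbar_limits`; only the ranges do.  That is why a large range with very steady means produces limits that are far too wide for the points plotted between them.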

Now, you could treat those means as individuals and put them on an individuals and moving range chart: 
Figure 2 - Plot of Average Thickness as Individuals

This chart might lead you to think that the process is stable.  And in a way it is, but this may be the right answer to the wrong question.  The right question is: Why was the first chart so weird, and what does that tell us about our process?

Like the example in the last article, it has to do with the sources of variability being different between and within samples.  So I’d reason it through like this:

“Hmm, the limits on the means chart are too wide.  Since the limits come from the average range, the average range is too large compared to the actual variation I see in the means.  So there must be some additional component of variability within each sample that is not present between the samples.  What could cause that?”

Depending on the process, there are different things it could be, so I asked my client how the samples were taken.  He said that once every hour they took a sample of the product at five locations across the width of an oven.  I said, “Aha!  Your product varies in thickness across the width of the oven!”  He said he didn’t know (they had never checked that), but it was possible.  I generated control charts for each of the five locations (the two edges, the two quarter points, and the center):
Figure 3 - X and Moving range in five different oven locations

What we actually have here is five different process streams (well, maybe three, since the two quarter points and the two edges might each be coming from the same distribution): each location across the width of the oven is in control, but at a different mean thickness.  (Notice where the average for each location sits.  Even without a hypothesis test we know that at least the center and edges are significantly different, since the first chart was out of control.)  When we grouped them together as a “sample,” we were unknowingly putting two components of variation into the range: the common-cause variability plus the variation due to the locations having different average thicknesses.  So while the overall average thickness across the entire oven was stable (as we can see from the averages-as-individuals chart), the range was higher than anticipated because the center tended to be thicker than the edges.  This resulted in a large average range, which in turn inflated the limits on the X-bar and R chart.
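If you want to see this effect for yourself, here is a small Python simulation sketch.  Everything in it is assumed for illustration: the thickness profile, the common-cause standard deviation of 0.1, and the function names are hypothetical, not the client's data.  The constants d2 = 2.326 (samples of five) and d2 = 1.128 (moving ranges of two) are the standard tabled values for turning an average range into an estimate of sigma.

```python
import random
from statistics import mean

D2_N5 = 2.326  # standard constant: sigma ~ R-bar / d2 for samples of 5
D2_N2 = 1.128  # standard constant: sigma ~ MR-bar / d2 for moving ranges of 2

# Hypothetical thickness profile across the oven width (center thicker than edges).
LOCATION_MEANS = {
    "left edge": 9.0,
    "left quarter": 9.5,
    "center": 10.5,
    "right quarter": 9.5,
    "right edge": 9.0,
}
COMMON_CAUSE_SIGMA = 0.1  # assumed, identical at every location


def simulate(hours=50, seed=1):
    """One sample per hour: one reading from each of the five locations."""
    rng = random.Random(seed)
    return [
        [rng.gauss(mu, COMMON_CAUSE_SIGMA) for mu in LOCATION_MEANS.values()]
        for _ in range(hours)
    ]


def sigma_from_subgroup_ranges(subgroups):
    """Sigma implied by the average range of samples taken across the width."""
    return mean(max(s) - min(s) for s in subgroups) / D2_N5


def sigma_from_moving_ranges(series):
    """Sigma implied by the average moving range of a single process stream."""
    return mean(abs(b - a) for a, b in zip(series, series[1:])) / D2_N2


data = simulate()
print("sigma implied by across-width ranges:", sigma_from_subgroup_ranges(data))
for i, name in enumerate(LOCATION_MEANS):
    stream = [sample[i] for sample in data]
    print(name, "sigma from moving ranges:", sigma_from_moving_ranges(stream))
```

In a run like this, the sigma implied by the across-width ranges comes out several times larger than the sigma within any one location, because each range also contains the fixed difference between the center and the edges.  That inflated sigma is what sets the limits on the means chart.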

(Stats geek note: Interestingly, in this case a one-way random-effects ANOVA does not signal a difference as it did for the example in the last article.  ANOVA is very powerful since it is a one-tailed test for a non-directional hypothesis, but it is based on the assumption that the within-sample variability truly characterizes the minimum variability in the process.  So in this case, while we get a suspiciously low F-ratio of 0.188, we can’t reject the ANOVA’s null hypothesis in that direction no matter how crazy low the F-ratio is.)
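For fellow stats geeks, here is a minimal Python sketch of where such a low F-ratio comes from, treating each hourly sample as a group.  The function name and the four samples are hypothetical, so the result will not match the 0.188 above, which came from the client's data.

```python
from statistics import mean


def one_way_f_ratio(groups):
    """F = MS_between / MS_within for equally sized groups (one group per sample)."""
    k = len(groups)          # number of samples
    n = len(groups[0])       # readings per sample
    grand = mean(x for g in groups for x in g)
    group_means = [mean(g) for g in groups]
    ss_between = n * sum((m - grand) ** 2 for m in group_means)
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (k * (n - 1))
    return ms_between / ms_within


# Hypothetical data: the same across-width profile every hour, shifted slightly.
profile = [9.0, 9.5, 10.5, 9.5, 9.0]
hourly_samples = [[x + shift for x in profile] for shift in (-0.1, 0.0, 0.1, 0.0)]
print(one_way_f_ratio(hourly_samples))
```

The location-to-location differences land in the within-sample mean square (the denominator), so the F-ratio is driven well below 1 even though something systematic is clearly going on.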

This turned out to be very important for the business.  The reason that they were interested in thickness is that they were stacking multiple pieces of this product into a box.  If the product was too thin, it rattled around in the box, possibly breaking the product.  If it was too thick, they couldn’t fit enough into the box to meet the minimum amount.

In the short term, they could use their knowledge of the different thicknesses to pack across the width of the oven, rather than down the length, so as to get the right amount at the right height on average.  If they needed a few thinner ones or thicker ones to round out the box, they knew exactly where to find them.  However, this involved a lot of non-value-adding time spent sorting, so in the longer term, they would investigate why there was a thickness profile across the width.  By reducing or eliminating the differences, they could standardize the number per box, pack from anywhere in the oven, and reduce the size of the packaging since the overall variation would be reduced.

And at this point, I hadn’t yet stepped out onto the factory floor.

If you know how to read them, those “weird-looking” SPC charts can tell you something about a process that you've never seen, something that even the people who run the process don’t know.  It's a useful and valuable skill for those, like Black Belts, who are looking to improve a process with which they are unfamiliar.

Next month, I’ll finally finish off this miniseries with a few other ways that people misinterpret what their control charts are saying or misuse these tools.