Measurements Part 2

Last month I talked about the “sigma” index and how it can be used. So what could be bad about the “sigma” index?

Never forget that metrics drive behavior and that people are very, very good at finding a way to succeed when given a goal. It just may not be the thing you wanted them to do.

If “sigma” is the prime or only measurement of success for a company or division, think of the behavior that drives. If I find a nonconformance in my area, I don’t want to record it. Worst case, it gets assigned to my area and I get the ding for other people’s output, and you can be sure I won’t ever find another one after that. Best case is that we allocate it to the area that “caused” the nonconformance. In many cases, however, the nonconformity isn’t the product of one area but the output of the system, so there’s no place to accurately allocate it anyway. I’m not saying you should avoid allocating costs back to their point of origin when possible. This is critical for understanding your process. But allocating nonconformance back to an area for punishment can drive both the originating area and their friends further down the process to hide the occurrences. If a process is running in statistical control and still producing nonconformance, it’s the process that needs to change, and punishing the workers won’t help.

Second, “sigma” is intended to be an index of the nonconforming rate for a process. Right off the bat, I have a problem. Conformance to specification is how we used to define quality back when we thought that nonconformance to specification was inevitable and that some level of nonconformance was acceptable to send to the customer. Do you remember the old MIL-STD 105, which told us how to sample to achieve an acceptable quality level (AQL) of some percentage nonconforming over time? This was the era when, if we took a sample of 125 from a batch and found three units that weren’t within specification, we were allowed to send that batch to customers—sending them nonconforming parts knowingly. Remember, this was a military standard. Can you imagine knowingly sending out-of-specification units to anyone these days, especially to the military? They would have you figuratively (or maybe literally) shot.

Well, I happen to know that at least a few businesses can imagine this, because some companies still use MIL-STD 105 and AQLs (not yours, I know).

A more modern definition of quality would be something like this: Reducing variability around a customer-desired target in the absence of defects. This means that quality starts inside the specification, that “barely in spec” is poor quality, and that quality increases as you approach the target. “Outside of spec” isn’t poor quality; it lacks quality of any kind.

It’s possible to have identical nonconforming rates and vastly different conformance to target, but “sigma” as calculated from the nonconformance rate would be the same. Take a look at the graphic below, which would have a short-term “sigma” identical to that of figure 1 from last month as calculated from ppm nonconforming. So the process in last month’s figure 1 is “rated” the same as in the chart below, even though the chances of a customer actually getting what they want (the target) are pretty small. This seems a step backward to outdated quality paradigms.

An exponential distribution
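To make the comparison concrete, here is a minimal sketch under invented assumptions: the spec limits, target, and both process models are hypothetical and are not the processes from either figure. It builds a centered normal process and a skewed exponential process with the same fraction nonconforming, then compares how far each typically lands from target.

```python
import math
from statistics import NormalDist

# Hypothetical spec limits and target, chosen only for illustration.
LSL, TARGET, USL = 7.0, 10.0, 13.0

# Process A: normal, centered on target, standard deviation 1.
process_a = NormalDist(mu=TARGET, sigma=1.0)
p_normal = process_a.cdf(LSL) + (1.0 - process_a.cdf(USL))

# Process B: exponential starting at the lower spec limit, with its scale
# chosen so the tail beyond USL matches Process A's nonconforming rate.
scale = (USL - LSL) / -math.log(p_normal)
p_expo = math.exp(-(USL - LSL) / scale)
mean_expo = LSL + scale  # mean of a shifted exponential
sd_expo = scale          # standard deviation of an exponential equals its scale


def rms_from_target(mean, sd):
    """Root-mean-square distance of the output from the target."""
    return math.sqrt(sd ** 2 + (mean - TARGET) ** 2)


rms_normal = rms_from_target(TARGET, 1.0)
rms_expo = rms_from_target(mean_expo, sd_expo)

print(f"ppm nonconforming: normal {p_normal * 1e6:.0f}, exponential {p_expo * 1e6:.0f}")
print(f"RMS distance from target: normal {rms_normal:.2f}, exponential {rms_expo:.2f}")
```

Both processes would get the same “sigma” from a ppm lookup, yet in this sketch the exponential process piles up just inside the lower limit and sits more than twice as far from target on average.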

Also, comparing the “sigma” of one process to the “sigma” of another isn’t particularly helpful. If a company were to use “sigma” as a tool for prioritizing their improvement activities, they might find that process A has a “sigma” of 3 and process B a “sigma” of 5. Should they spend money on improving process A? Well, if process A is in-plant toilet paper usage rate and process B is some critical customer requirement, maybe not. The point is that while you can calculate a “sigma” for about anything, you still need to understand the effect that process has on your customers, your business, and your profit before you go prioritizing. Not all “sigmas” are equal.

Next, the creativity of people in a “sigma” environment goes to making their “sigma” look good, rather than to actually making value-added improvements. One way to do this is to inflate your specifications so that your good old bad process is now a six “sigma” process. This is the ultimate evil in the quality sciences: specifications absolutely must be based on customer needs, not on production’s convenience. Very bad things, such as deaths and injuries, can happen when they aren’t.

I have also seen people use the defects per million opportunities (DPMO) calculation to generate “sigma.” The idea here is that a lot of things can go wrong in a process, so it might be natural to ask how many times we mess up. In a process, there might be 20 steps that can go wrong on a given unit of output. The output might be an assembly, or it might be a form with 20 spaces to fill in. You can calculate DPMO as:

DPMO = (number of defects ÷ (number of units × opportunities per unit)) × 1,000,000

Once you have a DPMO, you can look that up in a “sigma” conversion table to get the “sigma” index.

So if I fill out 100 forms, I have 2,000 opportunities to make a mistake. If I make two mistakes, I have a DPMO of 1,000 (2 ÷ 2,000 × 1,000,000). This translates to a “sigma” of about 4.6, which sounds pretty good, if not all the way to where we want to be. (Note that I’m not even going to get into the sample sizes necessary to have some level of confidence in my estimate of “sigma” using defect rates.)

But if you ask me, while the DPMO indicates a “sigma” of 4.6, the real customer-experienced defect rate (defects per unit) is 2 percent: two defects on 100 forms. That is 20 times the 0.1 percent seemingly implied by a DPMO of 1,000. If your management goal was to achieve six “sigma” in your process, what technique would you use to calculate “sigma”?
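The form example can be sketched in a few lines. This assumes the usual conversion-table convention of a one-sided normal tail plus a 1.5 standard deviation shift; the function names are mine, not part of any standard library.

```python
from statistics import NormalDist


def dpmo(defects, units, opportunities_per_unit):
    """Defects per million opportunities."""
    return defects / (units * opportunities_per_unit) * 1_000_000


def sigma_index(dpmo_value, shift=1.5):
    """Approximate "sigma" lookup: one-sided normal quantile plus the
    conventional 1.5 shift (an assumption about how the table is built)."""
    z = NormalDist().inv_cdf(1.0 - dpmo_value / 1_000_000)
    return z + shift


# 100 forms, 20 blanks each, 2 mistakes in total.
form_dpmo = dpmo(defects=2, units=100, opportunities_per_unit=20)
form_sigma = sigma_index(form_dpmo)
defects_per_unit = 2 / 100

print(f"DPMO: {form_dpmo:.0f}")                      # 1000
print(f"sigma: {form_sigma:.2f}")                    # about 4.59
print(f"defects per unit: {defects_per_unit:.1%}")   # 2.0%
```

The same two mistakes read as 0.1 percent per opportunity or 2 percent per form, depending on which denominator you pick.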

It gets worse

I have also seen cases where, to improve the DPMO and thus the “sigma,” process owners will monkey with the number of defect opportunities. How many opportunities to make a mistake are there if you’re making televisions? I mean, theoretically you could misplace every electronic component in a television and staple them to your forehead, so do we include that in the number of opportunities? The more opportunities we say there are, the better our “sigma” seems to get. Conversely, if I mistake-proof certain opportunities for defects so that a defect can no longer happen there, my DPMO could get worse if I take those opportunities out of the denominator (I’m dividing by a smaller number, making my DPMO go up). Therefore the preferred solution, eliminating the possibility of error, is contraindicated if I’m using DPMO to calculate “sigma.”
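A quick sketch of both effects, reusing the 100-form example with made-up opportunity counts. It assumes the two observed defects come from steps that were not mistake-proofed, and it uses the same one-sided, 1.5-shift conversion as the common lookup tables.

```python
from statistics import NormalDist


def dpmo(defects, units, opportunities_per_unit):
    """Defects per million opportunities."""
    return defects / (units * opportunities_per_unit) * 1_000_000


def sigma_index(dpmo_value, shift=1.5):
    """One-sided normal quantile plus the conventional 1.5 shift (assumed)."""
    return NormalDist().inv_cdf(1.0 - dpmo_value / 1_000_000) + shift


# Same output every time: 2 defects on 100 units.
honest = dpmo(2, 100, 20)           # 20 real opportunities per unit
inflated = dpmo(2, 100, 200)        # claim 200 opportunities per unit
mistake_proofed = dpmo(2, 100, 10)  # 10 opportunities designed out of the count

for label, value in [("honest", honest),
                     ("inflated", inflated),
                     ("mistake-proofed", mistake_proofed)]:
    print(f"{label:16s} DPMO {value:6.0f}  sigma {sigma_index(value):.2f}")
```

Nothing about the customer's experience changes between the three rows; only the denominator does. Inflating the count raises the index, and removing opportunities through mistake-proofing lowers it.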

Hard-line “sigma” proponents will no doubt say that this is gaming the system and not the intention of “sigma.” I agree. The problem is that, as Deming showed us, people will attempt to game a system where the process output is a function of the process and outside the control of the people who work it.

Another issue I have is the whole assumption of the drifting mean. It seems to me that “sigma” assumes that the mean is going to drift around a lot. If the process has to use an individuals chart, which is relatively insensitive to shifts in the mean, you might well see drifts like that and have a tough time tracking the source down. However, if it’s a process that’s amenable to using one of the X-bar type charts, you really should catch a shift of 1.5 standard deviations pretty quickly and eliminate its cause. Because the whole purpose of doing a control chart is to find and eliminate the sources of variation that are causing these drifts in a relatively short amount of time, you should have a pretty stable process. So built into the drifting mean is the assumption that you aren’t reacting to a control chart, or that you’re continually recalculating the control limits on the last few points and never seeing these drifts.
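How quickly is “pretty quickly”? A rough sketch, assuming normally distributed, independent data, a known process standard deviation, and only the basic rule of a single point beyond the 3-sigma control limits (run rules would catch shifts sooner). The function names are illustrative.

```python
from statistics import NormalDist


def detection_probability(shift_sd, subgroup_size, limit=3.0):
    """Chance that one plotted point falls outside the control limits
    after the mean shifts by shift_sd process standard deviations."""
    std = NormalDist()
    # Shift expressed in standard errors of the subgroup mean.
    delta = shift_sd * subgroup_size ** 0.5
    return (1.0 - std.cdf(limit - delta)) + std.cdf(-limit - delta)


def average_run_length(shift_sd, subgroup_size):
    """Expected number of plotted points until the first signal."""
    return 1.0 / detection_probability(shift_sd, subgroup_size)


for n in (1, 4, 5):
    arl = average_run_length(1.5, n)
    print(f"subgroup size {n}: about {arl:.1f} points to detect a 1.5 sd shift")
```

Under these assumptions an individuals chart (subgroup size 1) needs about 15 points on average to flag a 1.5 standard deviation shift, while an X-bar chart with subgroups of four or five flags it within about two.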

I can think of situations where there would be a lot of drifts such as this. Imagine if I’m making a bunch of parts off of numerous fixtures that get changed out all the time. In this case, I might have a large number of small shifts adding up, let’s say, to ± 1.5 standard deviations. But process performance analysis can give you a lot more information to make decisions in cases like this.

The effect of “sigma”

So overall, I’m not a fan of “sigma” as a metric to run a business. It simplifies way too much, so all the information about your process in the raw data is reduced to an index that doesn’t tell you where to start working, and it carries some hidden assumptions that I don’t like.

I have talked to a number of people at companies that started out using “sigma,” and a number of them have since abandoned the measure, for these reasons among others.

For measuring process output, I much prefer using capability indexes Cp, Cpk, and Cpm, or their equivalent performance indexes Pp, Ppk, and Ppm. They’re pretty simple to understand and they give you information about, respectively, the width of the process output as compared to the width of your specification (useful in determining if moving the average is called for), the amount of nonconformance (useful in determining how well you meet the specification), and your ability to hit target (the best measure of how the customer experiences your quality).
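For reference, here is a minimal sketch of the three indexes using their common textbook formulas. Because it uses the overall sample standard deviation, strictly speaking it returns the performance versions (Pp, Ppk, Ppm); the capability versions would substitute a within-subgroup estimate of sigma. The data and spec limits are invented.

```python
import math
from statistics import mean, stdev


def performance_indexes(data, lsl, usl, target):
    """Return Pp, Ppk, and Ppm computed from the overall standard deviation."""
    mu = mean(data)
    sd = stdev(data)
    pp = (usl - lsl) / (6 * sd)                # spec width vs. process width
    ppk = min(usl - mu, mu - lsl) / (3 * sd)   # distance to the nearest limit
    ppm = (usl - lsl) / (6 * math.sqrt(sd ** 2 + (mu - target) ** 2))  # penalizes off-target
    return {"Pp": pp, "Ppk": ppk, "Ppm": ppm}


# Same spread, different centering (hypothetical measurements).
centered = performance_indexes([9.0, 10.0, 11.0], lsl=7.0, usl=13.0, target=10.0)
off_center = performance_indexes([10.0, 11.0, 12.0], lsl=7.0, usl=13.0, target=10.0)

print(centered)
print(off_center)
```

In this toy data the width index is identical for both sets, while the other two drop once the average moves off target, which is exactly the information a single “sigma” number throws away.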

Then again, I could be wrong.

Special thanks to Michael Petrovich and his program MVPstats, which makes graphics like this one so easy to generate.