More on Acceptance Sampling

By the time you read this, the amazingly long U.S. presidential election will be over. All U.S. citizens will be wandering around aimlessly bumping into objects, pressing their hands to the sides of their heads as the indignation poisons slowly leave their bodies, leaving them with a hangover-like malaise and an intermittent need to babble a mishmash of campaign slogans. So, to cheer you up, I present you with the rest of the story I began last month about acceptance sampling.

The second flawed premise

Let’s examine the assumption that there even is an acceptable proportion of the lot that can be bad, the so-called acceptable quality level (AQL). If the AQL in this case were 1 percent, then the box I’m accepting would contain, on average, five bad units. I’m pretty sure that if you were the soldier who got the round that went “pfft” instead of “bang,” you would be ready to do a quality control test on the next batch using the supplier as the target (Oh wait, that would reward the poor quality suppliers, wouldn’t it?).
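If you want to check my arithmetic, here is a minimal sketch. The lot size of 500 is my inference from "1 percent" and "five bad units," and the variable names are made up for illustration.

```python
# Assumed values: a 1 percent AQL and a box of 500 units
# (500 is inferred from "five bad units on average at 1 percent").
aql = 0.01
lot_size = 500

# Average number of bad units in a box that is exactly at the AQL.
expected_bad = aql * lot_size

# Chance that such a box holds at least one bad unit,
# treating each unit as an independent 1 percent risk.
p_at_least_one = 1 - (1 - aql) ** lot_size

print(f"expected bad units per box: {expected_bad:.1f}")
print(f"P(at least one bad unit):   {p_at_least_one:.4f}")
```

In other words, a box that meets the AQL is all but guaranteed to contain something that goes "pfft."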

In our story, though, thankfully we weren’t making primers, but a component that went into consumer electronic devices. A failure of this type would make the end product unusable. So if I’m buying these widgets, I have to support the expenses of a system to test the integrated product and then determine what to do with those that fail. Do I throw them away and lose the money I have in them so far, or put more money into them to rework and maybe make it the second (or more) time through? There’s also the opportunity cost of my assembly process, which just made some bad ones, when it could have made good ones. In this case, I need to have more assembly capacity (a.k.a., unproductive capital and personnel) than I would if they all worked the first time around.

Well, at least I can take solace in having controlled my producer’s risk, right?

Oh, and did I mention that the producer was also losing money based on their acceptance sampling? If a lot’s sample failed, they would rework all the units in the lot and try it again, sometimes three or more times (with the probability of passing decreasing each time, as I will show you below). What do you think that does to the price to the customer and/or profit? This is why the modern definition of quality is minimal variation around a customer-defined target in the absence of defects. Or, in Six Sigma’s (less precise) terms, ultralow nonconformance rates. But hey, this was manufactured in China, so the cost of a “redo” was low, right?

But there’s an even bigger problem lurking right behind us, like on Halloween when we were seven years old, dressed as knights and princesses, and a teenager dressed like Jason snuck up on us. Remember that?

The binomial distribution upon which acceptance sampling is based is a very useful distribution, because it has only a few assumptions. If your process meets these assumptions, the binomial distribution works. The samples need to be randomly drawn from a large population, classified into one of two mutually exclusive and complete categories (e.g., pass/fail), have a constant probability of occurrence, and have each occurrence unaffected by another (independent). There are a lot of real-life systems that can meet these requirements.
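When those assumptions do hold, the chance of accepting a lot follows directly from the binomial. Here is a minimal sketch; the plan (sample 50, accept on one or fewer defectives) is hypothetical and not the one from my story, and `prob_accept` is a name I made up.

```python
from math import comb


def prob_accept(n: int, c: int, p: float) -> float:
    """P(at most c defectives in a random sample of n) under the binomial model."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c + 1))


# Hypothetical plan: sample 50 units, accept the lot on 0 or 1 defectives.
for p in (0.01, 0.02, 0.05, 0.10):
    print(f"true defective rate {p:.0%}: P(accept) = {prob_accept(50, 1, p):.3f}")
```

Every number this prints depends on p being constant and the units being independent. Take either away and the calculation no longer describes what you will see.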

Not this one, though.

This component was a high-tech assembly of very tight-tolerance parts. As with many systems, the probability of failure changes with time, making the binomial only relevant for a very short span of time (where the instantaneous failure rate is essentially constant) or a very long span of time (where an average failure rate could be used across the lifetime). In particular, this component experiences a decreasing failure rate after an initial run time. I know this is counterintuitive, but a decreasing failure rate is bad because it means that the failure rate was high to start with, and then decreases to a lower, constant level. What you would rather see is it starting off with the lower failure rate and remaining constant. That’s why the other name for an initial decreasing failure rate is “infant mortality.”

So my buddy would run the acceptance test and pass or fail on what they found. However, this was during a short initial period, where the probability of seeing a failure was small. The component must have been initially fairly robust to what was actually going on—gradually accumulating damage. Once the customer had integrated the part into their product and began their integrated testing, they hit a high failure rate as those components that were going to fail all did so within a fairly short amount of time. If they had let these integrated units run longer, they probably would have seen the failure rate drop off to a lower, more constant, rate. But they didn’t because they went ballistic when they saw a larger proportion fail than allowed by the contract.
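To see how a short test and a long test can disagree this badly, here is a toy model. Every number in it is invented for illustration and none of it is the real component's data: I assume 5 percent of units are "weak" and fail along a Weibull curve as damage accumulates, while the good units never fail. The function name and test durations are hypothetical, too.

```python
from math import exp


def fraction_failed(t_hours, weak_fraction=0.05, shape=3.0, scale=50.0):
    """Cumulative fraction of all units failed by time t.

    Toy assumption: only the weak units fail, following a Weibull
    curve with the given shape and scale (values are made up).
    """
    return weak_fraction * (1.0 - exp(-((t_hours / scale) ** shape)))


short_test = fraction_failed(5)    # brief acceptance test at the producer
long_test = fraction_failed(100)   # longer integrated test at the customer

print(f"failed by   5 hours: {short_test:.4%}")
print(f"failed by 100 hours: {long_test:.4%}")
```

Same units, same defect, two very different proportions. A binomial plan fed the first number says nothing useful about the second.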

Going from acceptance-based quality control to quality assurance

Hopefully, at this point I have convinced you that acceptance sampling for attributes is dangerous for two reasons. The first is a misunderstanding of producer’s vs. consumer’s risk: mistakenly thinking that the average outgoing quality level places some sort of limit on the number of defectives you’re sending out. The second is a basic misapplication of binomial-based acceptance sampling when the failure rate isn’t constant but changes with time.

Well, I promised in my introduction that I was going to cheer you up, right? So let’s examine how to transition from acceptance sampling to a more cost-effective process.

Let’s look at the initial manufacturing process.

Figure 1 - Quality Control - the most expensive way to control a process

We were sampling the output of the process and comparing it to the acceptance testing criteria, reworking lots based on what we found, sorting output, and sending it on. Because defectives are presumed inevitable, no change is made to the process in light of these defectives. This is the most expensive way to control a process. By the time you detect something has gone wrong, you have already made the mistake and compromised the output. Add to this the infant mortality problem, which complicates making a go/no-go decision, and this is a dangerous plan.

You could move to a process control plan. In this case, you monitor the output, and based on what you find, you adjust the process. This is the first phase of process control.

Figure 2 - Process Control I - Good, But Not Delicious

You might go back to your statistical process control training and institute a p-chart to track the stability of your output and determine when to adjust your process, as in figure 2. But the binomial isn’t applicable here because the failure rate changes with time, and the p-chart is based on the binomial distribution, too. Besides, a p-chart is just another after-the-fact determination, meaning that you’ve already lost the money.

So where you really need to go is to a type of process control where you can predict what the output is going to be based on other factors. In fact, this is the second phase of process control, where you can assure that your customer gets quality.

Figure 3 - Process Control II – Quality Assurance

Here, you don’t measure the output of the process; you monitor the process settings or inputs that control the output. By creating process control charts on these settings, establishing targets and specifications for these critical process variables, and achieving control and capability to meet these requirements, you prevent making bad product in the first place. It’s much, much cheaper to modify a critical process variable when it goes out of control than to try to repair the product.
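As one way to picture a control chart on a process setting, here is a minimal sketch of an individuals chart, with sigma estimated from the average moving range (the usual 2.66 factor, which is 3 divided by 1.128). The readings and the "oven temperature" variable are invented; your critical process variables will be whatever your experiments say they are.

```python
from statistics import mean


def individuals_limits(values):
    """Lower limit, center line, and upper limit for an individuals chart.

    Sigma is estimated from the average moving range between
    consecutive readings (2.66 = 3 / 1.128).
    """
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = mean(values)
    half_width = 2.66 * mean(moving_ranges)
    return center - half_width, center, center + half_width


# Made-up readings of a hypothetical critical process variable
# (say, an oven temperature in degrees C).
readings = [200.1, 199.8, 200.4, 200.0, 199.7, 200.2, 200.3, 199.9]
lcl, cl, ucl = individuals_limits(readings)

# Any reading outside the limits is a signal to look at the process,
# before bad product has been made.
out_of_control = [x for x in readings if not lcl <= x <= ucl]
print(f"LCL={lcl:.2f}  CL={cl:.2f}  UCL={ucl:.2f}  signals={out_of_control}")
```

In practice you would set the limits from a period of stable running and then plot new readings against them, rather than judging the same data used to compute the limits.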

What’s the catch? Well, going from product control to process control is pretty easy, but going from process control to quality assurance is tough—it requires a whole new level of process understanding. Once you have process control, you should be able to make hypotheses about process variables and their effect on outputs, test them with experiments, and begin to build that knowledge. As you learn more about the effects different parameters have on your output, you naturally transition to controlling those parameters. I don’t know that I have ever seen a process that has fully transitioned away from some output measurement, but there are many where the quality of the output is a foregone conclusion because all of our critical process variables are on target and in control.

I don’t want to minimize the difficulty of this step, and a sound strategy for doing this must be planned out or it won’t happen. But it can happen, given the knowledge and the will to do so.

The other huge advantage to this is that as you move from attributes data (go/no go) to process variables, which tend to be continuous-type data, your ability to detect process changes before they become catastrophic is much stronger. This means that much smaller sample sizes are needed to validate that the process is performing to your customers’ expectations; and you can detect when process variables stray long before they stray far enough to cause problems in the product. So you save money on samples, you avoid the cost involved in making defectives, and your customer gets ultrahigh conformance to their target and specs.
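To get a feel for how hungry attributes data is, here is a rough sketch of one side of that comparison: how many consecutive good units you need to see before you can claim, with a given confidence, that the defective rate is no worse than some value. It assumes the binomial conditions hold (which, as we saw, they may not), and the function name is my own.

```python
from math import ceil, log


def zero_failure_sample_size(max_defect_rate, confidence=0.95):
    """Smallest n for which seeing 0 defectives in n units would be
    unlikely (probability <= 1 - confidence) if the true defective
    rate were as high as max_defect_rate."""
    return ceil(log(1 - confidence) / log(1 - max_defect_rate))


for rate in (0.10, 0.01, 0.001):
    n = zero_failure_sample_size(rate)
    print(f"to claim <= {rate:.1%} defective at 95% confidence: {n} units, zero failures")
```

The tighter the conformance you want to demonstrate, the faster the pass/fail sample size balloons. That is the bill you stop paying when you chart a continuous process variable instead.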

People say they want to use attribute sampling to save money. That’s about the most expensive luxury that businesses today cannot afford. Heck, at a big business, the money wasted because of acceptance sampling could easily pay for a new Gulfstream jet with aniline leather seats, and 24k gold-plated fixtures with the optional double-decker hot tub.

I bet you’re wondering what happened to that company. The managers never understood the difference among acceptance sampling, quality control, and quality assurance. This affected the decisions they made for existing products as well as new products they were developing.

In the letter from the bankruptcy lawyers, they say the company had big plans, but “ran out of runway.”

Baloney—the management made decisions, in all ignorance, that put about 200 people out of work, and persisted in staying with those decisions even when shown they were flawed.

Don’t make the same mistake, OK? I want you to keep coming back for more of my heretical maundering.