Understanding variation is fundamental to process management

Steve D, PQ Systems Employee
edited October 2017 in Quality Discussion

Alternatives to knee-jerk reaction: understanding variation

An article by Barb Cleary on variation

What is known as “point mentality” is a knee-jerk response to what appears to be a problem. We may learn this when we take our child’s temperature and find that it’s high; we are inclined to do something right away—give ibuprofen, orange juice, and bed rest—rather than waiting to see if this is a “trend.”

In fact, sometimes we can’t wait to evaluate patterns in data, but need to respond immediately to a situation. Emergencies require response. But most of the time, we will save time, energy, and other resources by examining data over a period of time and making decisions based on trends or other patterns in the operation of the system itself. The fact that a child fails one third-grade spelling test does not mean that he or she will be a failure in life—or even in spelling.

Every system has variation. Some of it is due to the system itself and is known as common-cause variation; some of it is due to singular incidents or special situations and is known as special-cause variation. W. Edwards Deming estimated that 94% of problems (or possibilities for improvement) lie with the system as common-cause variation, while 6% are special causes. [Out of the Crisis, 315]

Describing variability over a period of time helps one to understand how the system is working, and to predict how it will continue to work in the future. The alternative is a constant tampering with the system, responding to every whim it may have.
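To see why tampering tends to make matters worse, here is a minimal simulation sketch. It is not from the article: the function name `simulate`, the target of 0, and the noise level are all assumptions chosen for illustration. It compares a stable process left alone with the same process "corrected" after every single reading.

```python
import random
import statistics

def simulate(n_readings, noise_sd, tamper, seed=0):
    """Return readings from a stable process whose target is 0.

    If tamper is True, the setting is moved after every reading by the
    negative of that reading (a knee-jerk correction back toward target).
    All names and values here are hypothetical, for illustration only.
    """
    rng = random.Random(seed)
    setting = 0.0
    readings = []
    for _ in range(n_readings):
        # Each reading is the current setting plus common-cause noise.
        reading = setting + rng.gauss(0.0, noise_sd)
        readings.append(reading)
        if tamper:
            # React to the single point instead of waiting for a pattern.
            setting -= reading
    return readings

hands_off = simulate(10_000, 1.0, tamper=False)
tampered = simulate(10_000, 1.0, tamper=True)

print("left alone:", round(statistics.pstdev(hands_off), 2))
print("tampered:  ", round(statistics.pstdev(tampered), 2))
```

Under these assumptions, each correction carries the previous reading's noise into the next reading, so the tampered process should spread roughly 1.4 times as wide as the one left alone.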

Students in a morning class at a school find the room chilly when they come in, and the teacher adjusts the thermostat to a higher temperature. But when afternoon classes meet, students complain that the room is too warm, so the teacher turns the thermostat down. This up-and-down approach to the problem is inefficient; furthermore, it does not solve the problem of temperature control for either group, since by the time the room is warm in the morning, students have moved on to their next class. The same pattern is true for the afternoon class.

By collecting data related to room temperature, then analyzing causes for the variation that ensues, students can see that the variation in room temperature is due to a special cause: tampering with the thermostat. Because the thermostat is turned down at the end of the day and remains low until morning, the temperature in the room is lower in the morning and higher in later hours.

An alternative analysis would involve recording room temperature throughout the day for several days or weeks, without adjusting the thermostat. This data would show that even without changing the thermostat setting, the room warms up naturally when the sun comes into the windows in the afternoon. The next step would be to investigate causes for the variation in temperature during the day, and eventually to come up with a theory for improving the situation. Mounting the thermostat near the window, rather than on the opposite side of the room, might reduce the variation: it would register cooler temperatures from outside in the morning and adjust the interior temperature accordingly, and it would register warmer temperatures in the afternoon and so trigger less heat from the furnace.

Before studying the problem, the inclination of students was to assign blame: “Why can’t the first-period class leave the thermostat alone?” and “Tell the afternoon class to turn the temperature up!”  By understanding that much of the variation was due to natural causes, students were able to focus on keeping the temperature steady instead of moving the thermostat abruptly from highs to lows. They recommended to the maintenance staff that the thermostat be moved, and continued to record data to assure themselves of improvement in the system.

If one asks a room full of people to clap simultaneously on the count of 3, will there be only one giant, simultaneous clap? Of course not; there is variation in the system. Instead of a single sound, there will be a rippling sound of applause, no matter how carefully orchestrated the practice seems to be.

Can you count on finding exactly 49 pieces of candy in a package of chocolate-covered peanuts? If you examine enough packages, you will find that there may actually be between 47 and 50 pieces in a package. Are you being cheated? No; this is common-cause variation.

When you leave your house every morning at exactly 7:12, can you expect to arrive at work at exactly the same time each day? No. In the case of traffic, there will be common-cause variation (timing of traffic lights, pace of traffic, number of cars on the road at the same time) as well as special-cause variation (an accident on the highway, delays in your carpool, a flat tire). Fleeting events cause special-cause variation. Imperfections in the system itself generate common-cause variation.

The key to improvement lies in understanding this variation, so that decisions can be based on trends in data, rather than only on intuitive reactions. This involves recognizing special causes and distinguishing them from common causes. Without this distinction, managers are likely to make two kinds of mistakes. The first is ascribing a variation or problem to a special cause (e.g., “The operator was late to work”) when it is really due to a common cause (there aren’t enough operators for a particular process). The second involves assuming that variation is due to a common cause, rather than a special one. (Deming, 319)

How does one tell the difference between special- and common-cause variation and avoid the mistakes that can ensue from misunderstanding these concepts? The answer lies in the use of control charts, where data is collected and analyzed with respect to trends and patterns that can be acted upon. In the 1920s, Walter Shewhart developed the idea of 3-sigma control charts. Control limits, which are generated by the data itself (collected over time), clarify the distinction between common- and special-cause variation.
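As a rough illustration of how control limits come from the data itself, the sketch below computes limits for an individuals (XmR) chart, one common form of Shewhart chart. The article does not say which chart type to use, so that choice is an assumption, and the commute times and the function names `individuals_limits` and `special_cause_signals` are hypothetical.

```python
import statistics

def individuals_limits(baseline):
    """Return (lower, center, upper) 3-sigma limits for an individuals chart."""
    if len(baseline) < 2:
        raise ValueError("need at least two readings")
    center = statistics.fmean(baseline)
    # Moving range: absolute difference between consecutive readings.
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    mr_bar = statistics.fmean(moving_ranges)
    # 2.66 = 3 / 1.128, the usual constant for moving ranges of two readings.
    return center - 2.66 * mr_bar, center, center + 2.66 * mr_bar

def special_cause_signals(values, lower, upper):
    """Return (index, value) pairs that fall outside the control limits."""
    return [(i, v) for i, v in enumerate(values) if v < lower or v > upper]

# Hypothetical commute times in minutes, collected on ordinary days.
baseline = [31, 29, 33, 30, 32, 28, 31, 30, 34, 29, 32, 30]
lower, center, upper = individuals_limits(baseline)

# Hypothetical new readings; 58 stands in for a day with a highway accident.
new_days = [31, 33, 58, 29]
print(special_cause_signals(new_days, lower, upper))  # [(2, 58)]
```

Readings inside the limits are treated as common-cause variation and left alone; only the point outside the limits is worth investigating as a possible special cause. A fuller treatment would also check for runs and trends within the limits, which this sketch leaves out.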

Every system has some degree of variation, as anyone who has thrown darts at a dartboard is clearly aware. The key to improvement lies in understanding the cause of variation, and understanding whether this cause lies in the system itself (the dartboard is not mounted firmly, for example, or the bull’s eye is not clearly discernible in dim lighting), or in a special cause (blindfolding the thrower, perhaps, or moving the target as the dart is in the air).

Once the concept of variation is grasped, one can begin to work on the system, to reduce the amount of variation. This work involves collecting data, studying causes, testing improvement theories, and constantly evaluating the impact of improvement strategies. Jumping into improvement strategies without understanding variation is like taking ibuprofen for a broken leg: you will never get to a genuine solution to the problem.

But do consider ibuprofen for a one-time fever.


  • Dr. Deming spoke of "unknown and unknowable" variables. One of these is certainly the percentage of top management in the USA and Japan who are even aware of the forces of variation and the correct responses to them. How many are aware of the enormous burdens imposed by the standard management improvement metric of "10% more or 10% less," made with no knowledge or consideration of variation or process capability?
  • The decision makers aren't knowledgeable and the knowledgeable aren't decision makers.  Example: Question: What is the difference between increasing tolerances to accept mean-shifted distributions vs. high-variation distributions? Answer: We inspected 3 parts. Can we increase the tolerances to improve capability?