Understanding variation is fundamental to process management

Alternatives to knee-jerk reaction: understanding variation
An article by Barb Cleary on variation
What is known as “point mentality” is a knee-jerk response to what appears to be a
problem. We may learn this when we take our child’s temperature and find that
it’s high; we are inclined to do something right away—give ibuprofen, orange
juice, and bed rest—rather than waiting to see if this is a “trend.”
In fact, sometimes we can’t wait to evaluate patterns in data, but need to respond
immediately to a situation. Emergencies require response. But most of the time,
we will save time, energy, and other resources by examining data over a period
of time and making decisions based on trends or other patterns in the operation
of the system itself. The fact that a child fails one third-grade spelling test
does not mean that he or she will be a failure in life—or even in spelling.
Every system has variation. Some of it is due to the system itself and is known
as common-cause variation; some of it is due to singular incidents or special
situations and is known as special-cause variation. W. Edwards Deming estimated
that 94% of problems (or possibilities for improvement) lie with the system as
common-cause variation; 6% are special causes. [Out of the Crisis, 315]
Describing variability over a period of time helps one to understand how the system is
working, and to predict how it will continue to work in the future. The
alternative is a constant tampering with the system, responding to every whim
it may have.
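The cost of tampering can be illustrated with a small simulation. The sketch below is not from the article: it assumes a hypothetical process whose readings scatter randomly around a target of 70, and a tampering rule that moves the setting by the full size of the last deviation (similar in spirit to one of the rules in Deming's funnel experiment). The function name `simulate`, the target, and the noise level are all illustrative assumptions.

```python
import random
from statistics import pvariance


def simulate(n, tamper, target=70.0, noise_sd=1.0, seed=0):
    """Simulate n readings from a hypothetical process with random noise.

    If tamper is True, the setting is adjusted after every reading to
    compensate for that reading's deviation from the target.
    """
    rng = random.Random(seed)  # fixed seed so both runs see the same noise
    setting = target
    readings = []
    for _ in range(n):
        reading = setting + rng.gauss(0.0, noise_sd)
        readings.append(reading)
        if tamper:
            # Knee-jerk correction: shift the setting by the last deviation.
            setting -= reading - target
    return readings


left_alone = simulate(10_000, tamper=False)
tampered = simulate(10_000, tamper=True)

# Compare how widely each run scatters around the target.
ratio = pvariance(tampered) / pvariance(left_alone)
print(f"variance left alone: {pvariance(left_alone):.2f}")
print(f"variance tampered:   {pvariance(tampered):.2f}")
print(f"ratio:               {ratio:.2f}")
```

Under these assumptions the tampered run scatters roughly twice as widely as the run that was left alone, because each correction adds the previous reading's noise to the next one.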
Students in a morning class at a school find the room chilly when they come in, and the
teacher adjusts the thermostat to a higher temperature. But when afternoon
classes meet, students complain that the room is too warm, so the teacher turns
the thermostat down. This up-and-down approach to the problem is inefficient;
furthermore, it does not solve the problem of temperature control for either
group, since by the time the room is warm in the morning, students have moved
on to their next class. The same pattern is true for the afternoon class.
By collecting data related to room temperature, then analyzing causes for the
variation that ensues, students can see that the variation in room temperature
is due to a special cause: tampering with the thermostat. Since the thermostat
is turned down at the end of the day and remains low until morning, the
temperature in the room is lower in the morning and higher in later hours.
An alternative analysis would involve recording room temperature throughout the day for several days or weeks, without adjusting the thermostat. This data would show that even without changing the thermostat setting, the room warms up naturally when the sun comes into the windows in the afternoon. The next step would be to investigate causes for the variation in temperature during the day, and eventually to come up with a theory for improving the situation. Mounting the thermostat near the window, rather than on the opposite side of the room, might reduce the variation, since it would record cooler temperatures from outside in the morning, and adjust the interior temperature accordingly, and warmer temperatures in the afternoon, so the thermostat would not trigger as much heat from a furnace.
Before studying the problem, the inclination of students was to assign blame: “Why can’t the first-period class leave the thermostat alone?” and “Tell the afternoon class to turn the temperature up!” By understanding that much of the variation was due to natural causes, students were able to focus on keeping the temperature steady instead of moving the thermostat abruptly from highs to lows. They recommended to the maintenance staff that the thermostat be moved, and continued to record data to assure themselves of improvement in the system.
If one explains to a room full of people that on the count of 3, everyone should
clap simultaneously, do you suppose that there will be only one giant,
simultaneous clap? Of course not; there is variation in the system. Instead of
a single sound, there will be a rippling sound of applause, no matter how
carefully orchestrated the practice seems to be. Can you count on finding exactly
49 pieces of candy in a package of chocolate-covered peanuts? If you examine
enough packages, you will find that there may actually be between 47 and 50
pieces in the packages. Are you being cheated? No. This is common-cause
variation. When you leave your house every morning
at exactly 7:12, can you expect to arrive at work at exactly the same time each
day? No—in the case of traffic, there will be common-cause variation (timing of
traffic lights, pace of traffic, number of cars on the road at the same time)
as well as special-cause variation (an accident on the highway, delays in your
carpool, flat tires, etc.). Fleeting events cause special-cause variation.
Imperfections in the system itself generate common-cause variation.
The key to improvement lies in understanding this variation, so that decisions can be based on trends in data, rather than only on intuitive reactions. This involves recognizing special causes and distinguishing them from common causes. Without this distinction, managers are likely to make two kinds of mistakes. The first is ascribing a variation or problem to a special cause (e.g., “The operator was late to work”) when it is really due to a common cause (there aren’t enough operators for a particular process). The second is the reverse: assuming that variation is due to a common cause when it is really due to a special one. [Out of the Crisis, 319]
How does one tell the difference between special- and common-cause variation and avoid the mistakes that can ensue from misunderstanding these concepts? The answer lies in the use of control charts, where data is collected over time and analyzed for trends and patterns that can be acted upon. In the 1920s, Walter Shewhart developed the idea of 3-sigma control charts. Control limits, which are calculated from the data itself, clarify the distinction: points that fall within the limits reflect common-cause variation, while points outside them signal a possible special cause.
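As a rough illustration of how control limits come from the data, here is a minimal sketch of one common kind of Shewhart chart, the individuals (XmR) chart. The article does not say which chart type to use, so this choice is an assumption, as are the function names and the commute times, which are invented for the example. The constant 1.128 is the standard factor for estimating sigma from moving ranges of two consecutive points.

```python
from statistics import mean


def control_limits(values):
    """Return (lower, center, upper) 3-sigma limits for an individuals chart."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    center = mean(values)
    # Moving range: absolute difference between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    # Estimate sigma from the average moving range (d2 = 1.128 for pairs).
    sigma_est = mean(moving_ranges) / 1.128
    return center - 3 * sigma_est, center, center + 3 * sigma_est


def special_cause_points(values, baseline):
    """Return (index, value) pairs in values outside the baseline's limits."""
    lower, _, upper = control_limits(baseline)
    return [(i, v) for i, v in enumerate(values) if v < lower or v > upper]


# Hypothetical commute times in minutes (illustrative data only).
baseline = [31, 29, 33, 30, 32, 28, 31, 30, 32, 29]
new_days = [30, 33, 52, 29]

lower, center, upper = control_limits(baseline)
print(f"limits: {lower:.1f} to {upper:.1f}, center {center:.1f}")
print("possible special causes:", special_cause_points(new_days, baseline))
```

With this data, the 30-, 33-, and 29-minute commutes fall inside the limits and would be treated as common-cause variation, while the 52-minute commute falls outside and is worth investigating as a possible special cause. Full control-chart practice also looks for runs and trends within the limits, which this sketch omits.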
Every system has some degree of variation, as anyone who has thrown darts at a dartboard is clearly aware. The key to improvement lies in understanding the cause of variation, and understanding whether this cause lies in the system itself (the dartboard is not mounted firmly, for example, or the bull’s eye is not clearly discernible in dim lighting), or in a special cause (blindfolding the thrower, perhaps, or moving the target as the dart is in the air).
Once the concept of variation is grasped, one can begin to work on the system to reduce the amount of variation. This work involves collecting data, studying causes, testing improvement theories, and constantly evaluating the impact of improvement strategies. Jumping into improvement strategies without understanding variation is like taking ibuprofen for a broken leg: you will never get to a genuine solution to the problem.
But do consider ibuprofen for a one-time fever.