If you’ve attended one of my performance management workshops, you know I caution against using the traditional traffic light metaphor to represent performance. Yes, the visual is nearly universally understood (green = good, red = not good) and simple communication is indeed a key to improving performance. However, in many business situations, three states are not sufficient to distinguish between levels of performance.
Consider what happens when a driver approaches an intersection in a car. There are only two potential outcomes – proceed or stop – represented by green and red. The extra state, yellow, indicates the light is about to turn red but it doesn’t actually add another potential outcome: the choices are still proceed or stop.
Officially, when drivers see a yellow light, they should slow down and stop when they get to the intersection. In practice (at least in my home state of California), most drivers proceed through the intersection, presumably with a little more caution than if the light were green. The traffic light has three states but only two outcomes.
In business, our decisions are usually significantly more complicated, which makes the traffic light metaphor challenging. A sales region might target selling 100 units of a product every month. If it sold 50 units one month, its poor performance would be represented by red. Similarly, 80 units might yield a yellow to show caution while 90 units would produce a green for good performance. However, what color should it use if it exceeds the target value by selling 150 units?
In sales, as in many other areas, it would be demotivational not to differentiate between exceeding and almost reaching the target. The most natural solution would be to include an additional color. Perhaps dark green could represent exceeding the target. If that happens, you might as well add a fifth color for balance: dark red to represent very poor performance.
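To make the five-color idea concrete, here is a minimal sketch of how the sales example might be scored. The function name and the cutoff percentages are my own assumptions, chosen only so that the figures above (50, 80, 90, and 150 units against a target of 100) land on the colors described; your organization would need to set its own thresholds.

```python
from enum import IntEnum


class Status(IntEnum):
    """Five ordered states, worst to best."""
    DARK_RED = 1
    RED = 2
    YELLOW = 3
    GREEN = 4
    DARK_GREEN = 5


def five_state_status(actual: float, target: float) -> Status:
    """Map actual vs. target onto five colors.

    The cutoffs (40%, 70%, 90%, and above 100% of target) are
    illustrative assumptions, not a standard.
    """
    if target <= 0:
        raise ValueError("target must be positive")
    ratio = actual / target
    if ratio > 1.0:
        return Status.DARK_GREEN  # exceeded the target
    if ratio >= 0.9:
        return Status.GREEN       # at or close to the target
    if ratio >= 0.7:
        return Status.YELLOW      # caution
    if ratio >= 0.4:
        return Status.RED         # poor performance
    return Status.DARK_RED        # very poor performance


for units in (20, 50, 80, 90, 150):
    print(units, five_state_status(units, 100).name)
```

With these assumed cutoffs, a region that sells 150 units is visibly distinguished from one that sells 90, which is the whole point of the extra state.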
Including additional states is also a way to handle an issue that can happen when you’re using qualitative KPIs based on surveys. Because people naturally tend to avoid outliers, they bias their answers such that many KPIs end up as yellow. As such, I recommend adding both a dark red and a dark green to the traffic light, which yields five potential states. With five states, even if people avoid outliers, there are still three potential results: light green, yellow, and light red.
For those of you who remember the difference between nominal, ordinal, interval and ratio measurements (statistics, anyone?), there’s also a mathematical reason to prefer five states over three. The short argument is that red/green/yellow are nominal labels that – through convention – have become ordinal rankings with green as best and red as worst. The limitation with ordinal ranks is that, while they show that one result is better than another, they don’t show how much better one result is than another.
For example, if I finish first in a race, no one knows if I won by one second or by one minute. Interval measurements like thermometers (the differences between 80, 81, and 82 degrees are the same) and ratio measurements like age (40 is twice as old as 20, which is twice as old as 10) solve this problem but can’t be represented by a small number of states. The more states you have in an ordinal ranking, the more closely it approximates an interval one. So, 5 is better than 3. Statistics class dismissed.
For some organizations, using a red traffic light to indicate poor performance might send the wrong cultural message. If red is your corporate color, you may not want to associate it with poor performance. These organizations could instead consider the emoticons used in instant messaging tools (smiling face, frowning face) or thumbs up/down images (gladiator style).
All images, including traffic lights, suffer from the fact that end users don’t intuitively understand the grading scales. At what value does a KPI change from red to yellow? How much better is dark green than light green? At the risk of pulling out the statistics book again, we have an ordinal ranking problem. To solve this communication problem, I often recommend that organizations use letter grades (A/B/C/D/F), which are widely understood and self-documenting (90% and above is an “A”). And, for those keeping track, these are interval measurements.
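As a rough illustration of why letter grades are self-documenting, here is a small sketch that converts a percentage score into a grade. Only the "90% and above is an A" cutoff comes from the discussion above; the B, C, and D cutoffs (80, 70, and 60) follow the common classroom convention and are assumptions, as is the function name.

```python
# Lower bound of each grade, in percent. Only the 90% cutoff for "A" is
# taken from the text; the other cutoffs are assumed for illustration.
GRADE_CUTOFFS = [(90, "A"), (80, "B"), (70, "C"), (60, "D")]


def letter_grade(percent: float) -> str:
    """Return the letter grade for a score expressed as a percentage."""
    for floor, grade in GRADE_CUTOFFS:
        if percent >= floor:
            return grade
    return "F"  # anything below the lowest cutoff


for score in (95, 90, 85, 72, 61, 40):
    print(score, letter_grade(score))
```

Because the cutoffs sit in one short table, anyone reading a report can see exactly where a KPI moves from one grade to the next.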
Some of my colleagues have suggested we go one step further and use thermometers. After all, you can often deduce close to the exact value from the height of the mercury. Therein lies my concern with thermometers: they track the actual value rather than the gap between the actual and the target. There are times when getting close to the target is OK (fundraising) and other times when it’s not good enough (zero defects in an airplane). I’ll expand on this idea in a later post.
I’m a big fan of visual cues, but sometimes it may be better to use words rather than colors, images, or letter grades. A scoring system with intervals like “Best In Class,” “Exceeds Performance,” “On Target,” “Needs Work,” and “Unacceptable” is instantly understandable and therefore may be more likely to be accepted. Acceptance leads to adoption.
In short, there is not a single correct answer. Instead, consider this a caution against automatically using traffic lights to indicate performance.