A/B testing, experiments, HCI, human factors, psychology, Study, UX
Which is Better? A or B?
Nearly everyone in the field of Human-Computer Interaction (related fields are known as Human Factors, and User Experience) has heard of A/B testing. How should we lay out our web pages? Should we have a tool bar? Should it be always visible or only visible on rollover? What type fonts and color schemes should we use?
Clearly, A/B testing is useful. However, it has at least two fundamental limitations.
First, for almost any real application, there are way too many choices for each of them to be tested. This is where the experience of the practitioner and/or the knowledge of the field and of human psychology can be very helpful. Your experience and theory can help you make an educated guess about how to prioritize the questions to be studied. Some questions may have an obvious answer. Others might not make much difference. Some questions are more fundamental than others. For instance, if you decided not to use any text at all on your site, it wouldn’t matter which “font” your users prefer.
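To get a feel for how quickly the choices pile up, here is a rough back-of-the-envelope sketch. The design decisions and the number of options for each are invented purely for illustration; they are not taken from any real project.

```python
from math import prod

# Hypothetical design decisions and the number of options for each.
# These counts are made up for illustration only.
design_choices = {
    "font": 5,
    "font_size": 4,
    "color_scheme": 6,
    "toolbar": 3,  # none, always visible, visible on rollover
    "layout": 4,
}


def count_full_designs(choices):
    """Number of distinct complete designs (every combination of options)."""
    return prod(choices.values())


def count_isolated_ab_tests(choices):
    """Number of head-to-head A/B tests if each decision is studied
    in isolation, comparing every pair of options for that decision."""
    return sum(n * (n - 1) // 2 for n in choices.values())


total_designs = count_full_designs(design_choices)
isolated_tests = count_isolated_ab_tests(design_choices)

print(total_designs)   # 1440 complete designs
print(isolated_tests)  # 40 pairwise tests when decisions are studied one by one
```

Even with only five decisions, there are far more complete designs than anyone could test, which is why experience and theory are needed to decide which questions deserve an experiment at all.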
Second, some decisions interact with others. For instance, you may test a font size in the laboratory with your friends. Just as you suspected, it’s perfectly legible. Then, it turns out that your users are mainly elderly people who use your app while going on cruises or bus tours. In general, the elderly have less acute vision than the friends you studied in the lab. Not only that, you were showing the font on a stable display under steady conditions of illumination. The bus riders are subject to vibration (which also makes reading more difficult) and frequent changes in illumination due to the sun or artificial light being intermittently filtered by trees, buildings, etc. Age, vibration, and changes in illumination are variables that interact by being positively correlated. In other cases, variables interact in other and more complex ways. For example, increasing stress/motivation at first increases performance. But beyond a certain point, increasing stress or motivation actually decreases performance. This is sometimes known as the Yerkes-Dodson Law (https://en.wikipedia.org/wiki/Yerkes–Dodson_law).
The story doesn’t stop there though. How much stress is optimal partly depends on the novelty and complexity of the task. If it’s a simple or extremely practiced task, quite a bit of stress is the optimal point. Imagine how long you might hang on to a bar for one dollar, for a thousand dollars, or to save yourself from falling to the bottom of a 1000 foot ravine. For a moderately complex task, a moderate level of motivation is optimal. For something completely novel and creative, however, a low level of stress is often optimal.
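The shape of that relationship can be sketched with a toy model. Everything here is an assumption made for illustration: the bell-shaped curve, the 0-to-1 stress scale, and the optimal stress level assigned to each kind of task. None of these numbers come from the Yerkes-Dodson literature.

```python
import math

# Assumed optimal stress levels on a 0-1 scale (illustrative values only).
OPTIMAL_STRESS = {"simple": 0.8, "moderate": 0.5, "novel": 0.2}


def performance(stress, task, width=0.25):
    """Toy inverted-U curve: performance peaks at the task's optimal
    stress level and falls off on either side of it."""
    peak = OPTIMAL_STRESS[task]
    return math.exp(-((stress - peak) ** 2) / (2 * width ** 2))


def best_stress(task, levels):
    """Stress level, among those tried, that gives the best performance."""
    return max(levels, key=lambda s: performance(s, task))


levels = [i / 10 for i in range(11)]  # 0.0, 0.1, ..., 1.0

for task in OPTIMAL_STRESS:
    print(task, best_stress(task, levels))
```

The point of the sketch is that asking "how much stress is best?" has no single answer. The answer depends on a second variable, the kind of task.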
The real point isn’t about these particular interactions. The more general point is that testing many variables independently will not necessarily result in an optimal overall solution. Experience, both your own and that of others, can help dissect a design problem into those decisions that are likely to be relatively independent of each other and those that must be considered together.
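Here is a small sketch of how testing variables one at a time can miss the best overall design. The two variables and the task-completion rates are invented for illustration; the numbers were chosen so that the variables interact.

```python
from itertools import product

FONT_SIZES = ["small", "large"]
LAYOUTS = ["dense", "sparse"]

# Made-up task-completion rates for each (font size, layout) combination.
# A large font only pays off when the layout is sparse: an interaction.
COMPLETION_RATE = {
    ("small", "dense"): 0.50,
    ("large", "dense"): 0.45,
    ("small", "sparse"): 0.48,
    ("large", "sparse"): 0.70,
}


def one_at_a_time(baseline):
    """Run a separate A/B test on each variable, holding the other at
    its baseline value, then combine the two winners."""
    font, layout = baseline
    best_font = max(FONT_SIZES, key=lambda f: COMPLETION_RATE[(f, layout)])
    best_layout = max(LAYOUTS, key=lambda l: COMPLETION_RATE[(font, l)])
    return (best_font, best_layout)


def full_factorial():
    """Test every combination and pick the best one."""
    return max(product(FONT_SIZES, LAYOUTS), key=COMPLETION_RATE.get)


print(one_at_a_time(("small", "dense")))  # ('small', 'dense')
print(full_factorial())                   # ('large', 'sparse')
```

Starting from the small-font, dense-layout baseline, each separate test says "leave it alone," yet changing both together gives the best result. Variables suspected of interacting like this need to be tested in combination.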
Life itself has apparently “figured out” an interesting way to deal with the issue of the interaction of variables. Genes that work well together tend to end up close together on the same chromosome. That means that they are more likely to stay together and not be separated by cross-over. By contrast, genes that are independent, or that even have a negative impact when taken together, tend to end up far apart, so that they are more likely to be separated.
So, for example, one might expect that a gene for more “feather-like skin” and more “wing-like front legs” might be close to each other while a gene for thicker, heavier bones would be far away.
The tricky way variables interact isn’t limited to “User Experience Design,” of course. Think of learning a sport such as tennis or golf. You can’t really learn and practice each component of a stroke separately. That’s not the way the body works. If you are turning your hips as you swing, for example, your arm and hand will feel different than if you tried to keep your hips still while you swung.
Do you have any good tips for dealing with interactions of variables? In User Experience or any other domain?
Peter, you reminded me of an old IT suggestion. When a company migrated its software, they could not decide whether to migrate all of the report writing examples that had been created. So they made a bold move and migrated none of them, to see which ones people complained about missing. Instead of recreating 172 report examples, they ended up with about twenty. I thought this was a great idea. Keith
John Thomas said:
I heard of a similar concept at IBM relating to “fixing” bugs. If a bug hasn’t been reported for three years, say, it may not be worthwhile to fix it, because the chance of introducing a new bug via the “fix” is greater than the chance of getting more reports.