You have been assigned the task of conducting a comparative usability test to determine whether your company’s new website is better than the previous version. The first question to consider is what defines BETTER. Is your company interested in performance measures (e.g., can users find information more quickly and accurately on the new site than on the old one)? Or is the company more interested in preference measures (e.g., do users prefer the “look and feel” of one site to the other)? More realistically, they probably want all of the above and more!
When selecting a comparative test design, there are basically three options:
- a between-subjects design
- a within-subjects (repeated measures) design
- a mixed design
A between-subjects design requires different groups of users to complete representative tasks on each website. For example, one group of eight users would complete tasks with Site A and a different group of eight users would complete the same tasks with Site B. Results would be compared across the two sites.
A within-subjects design requires the same group of users to complete a set of tasks on both websites. Depending on how different the user interfaces of the two sites are, you may decide to use the same tasks for both, or different tasks equated in difficulty.
A mixed design combines aspects of both within-subjects and between-subjects designs. For example, six users may complete tasks on both sites (the within-subjects component), while twelve other users each complete tasks on only one of the two sites: six on Site A and six on Site B (the between-subjects component).
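The participant allocations described above can be sketched in a few lines of Python. The group sizes follow the examples in the text; the function names and the use of random assignment are illustrative, not a prescription:

```python
import random

def between_subjects(participants, n_per_group):
    """Randomly split participants into one independent group per site."""
    pool = list(participants)
    random.shuffle(pool)
    return {"Site A": pool[:n_per_group],
            "Site B": pool[n_per_group:2 * n_per_group]}

def within_subjects(participants):
    """Every participant uses both sites; alternate which site comes
    first to counterbalance order (carryover) effects."""
    return {p: ["Site A", "Site B"] if i % 2 == 0 else ["Site B", "Site A"]
            for i, p in enumerate(participants)}

# Example: 16 users split 8/8 for a between-subjects test,
# 6 users with counterbalanced site order for a within-subjects test.
groups = between_subjects(range(16), n_per_group=8)
orders = within_subjects([f"P{i}" for i in range(6)])
```

A mixed design would simply apply both helpers: run one small within-subjects group through both sites, then assign the remaining participants to single-site groups.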
To decide which design to use, you first need to consider what type of data you want to collect. If the focus of the test is on performance measures (time on task, task success), then a between-subjects design may be more appropriate because it reduces carryover effects and bias. If preference data is most important, then a within-subjects design may be more appropriate, since users have an opportunity to interact with both sites, make comparisons, and state a preference. Performance data can still be collected in a within-subjects design, but the results may be biased if the two user interfaces are similar (and the same tasks are used), since users may transfer learning from the first site they use to the second. All of the pros and cons of each design should be carefully examined before selecting the design for a usability test (see Table 1).
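As a rough illustration of how between-subjects performance data might be compared, the sketch below computes Welch's t statistic for time-on-task from two independent groups. The data are invented for the example, and a real analysis would typically use a statistics package (e.g., `scipy.stats.ttest_ind` with `equal_var=False`) to obtain a p-value as well:

```python
from math import sqrt
from statistics import mean, stdev

def welch_t(sample_a, sample_b):
    """Welch's t statistic for two independent samples
    (does not assume equal variances)."""
    va, vb = stdev(sample_a) ** 2, stdev(sample_b) ** 2
    se = sqrt(va / len(sample_a) + vb / len(sample_b))
    return (mean(sample_a) - mean(sample_b)) / se

# Hypothetical time-on-task data (seconds), eight users per site.
site_a = [42, 55, 38, 61, 47, 52, 44, 58]
site_b = [65, 72, 58, 80, 69, 75, 62, 71]

t = welch_t(site_a, site_b)  # negative t: Site A's times are shorter
```

Success rates and other performance measures can be compared the same way, with the caveat that small usability-test samples give such comparisons limited statistical power.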
Table 1: Pros and Cons of Comparative Test Designs
| Design | Pros | Cons |
|---|---|---|
| Between-subjects | Limits practice or carryover effects | Have to run twice as many participants as a within-subjects design |
| | Only need one set of tasks | Do not get preference or comparison data |
| Within-subjects | Fewer participants needed | Need to counterbalance site order |
| | Get preference data | Fatigue |
| | Get comparison data | May need to create two sets of tasks (equal in difficulty) |
| Mixed | Get the benefit of both preference and performance data without bias | Have to run more participants than a within-subjects or between-subjects design alone |
Unfortunately, there is no single recipe to follow to demonstrate that your new site is superior to the old one. Careful consideration of your site’s business goals, usability goals, and target population will help you determine the best way to compare the two.