Summary. Currently there is an abundance of style guidelines which apply to desktop applications. Some emerging technologies, such as touch screen driven devices, are challenging these suggested conventions from the past. This study explored the appropriateness of various button and grouping design style characteristics in representing underlying system functionality. Style appropriateness judgments were compared across different scenarios: either mutually exclusive or nonmutually exclusive. Results suggest that grouping style and the number of buttons included within a group strongly influenced preference, especially for the nonmutually exclusive condition.
Technological advances have made many software systems powerful tools, replacing inflexible hardware controls with customizable "soft" controls. Graphical user interfaces (GUIs) encourage users to interact with software counterparts of familiar objects, allowing for a level of system transparency. Guidelines exist for the style and layout of many interface elements, including mutually exclusive and nonmutually exclusive buttons. Mutually exclusive button sets allow users to select only one option, whereas nonmutually exclusive button sets allow users to select more than one option. Guidelines often provide the exact dimensions of the interface objects, a list of approved colors to use, and advice regarding proper alignment and labeling techniques (Galitz, 1997; Marcus, 1995; Microsoft, 1995; Fowler and Stanwick, 1995; Apple, 1992, 2008). However, design characteristics vary across styles, and existing standards are driven largely by best practice or personal opinion rather than by empirical data (Tullis, 1997). In addition, many guidelines do not address emerging technologies that require different modes of interaction (such as touch screen devices) and thus may also require different design characteristics.
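The distinction between the two kinds of button set can be made concrete with a small sketch. The class and method names below (ButtonSet, press) are hypothetical and are not taken from any guideline or toolkit; the sketch assumes that pressing an already selected button in a nonmutually exclusive set turns it off.

```python
class ButtonSet:
    """Hypothetical model of a group of on/off buttons."""

    def __init__(self, labels, mutually_exclusive):
        self.labels = list(labels)
        self.mutually_exclusive = mutually_exclusive
        self.selected = set()

    def press(self, label):
        """Apply one button press and return the set of selected labels."""
        if label not in self.labels:
            raise ValueError(f"unknown button: {label}")
        if self.mutually_exclusive:
            # Selecting one option deselects every other option.
            self.selected = {label}
        elif label in self.selected:
            # Assumed toggle behavior: pressing a selected button turns it off.
            self.selected.remove(label)
        else:
            # Several options may be selected at the same time.
            self.selected.add(label)
        return self.selected
```

With three buttons, pressing "A" and then "B" leaves only "B" selected in the mutually exclusive set, but leaves both "A" and "B" selected in the nonmutually exclusive set.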
Gaver (1991) mentions the importance of visual cues, such as three-dimensional buttons, that offer object information that can be acted upon. These cues should not be arbitrary, but a refined way of communicating causal information about the features of the interface. Constraints, which limit the number of available possibilities of interaction, also play a role in reducing cognitive workload when users interact with an interface.
Methods of grouping can be seen as a type of constraint that also improves the readability of items and expresses meaningful relationships between elements in the group. Spatial proximity separates elements into discrete chunks: pieces that are near each other in space appear as a group (Tullis, 1997). Another principle, common region, is the tendency for elements in the same bounded area to be grouped together (Palmer, Brooks, and Nelson, 2003). Uniform connectedness is a grouping principle that includes figure-ground organizations, which tend to be perceived initially as cohesive units. Common properties such as color, texture, or brightness can serve to create a single unit. Highlighting can be viewed as an example of uniform connectedness. Elements grouped by connecting regions (such as line segments) are also seen as distinct units (Palmer and Rock, 1994; Palmer, Brooks, and Nelson, 2003).
The purpose of this study was to explore design characteristics demonstrating psychological grouping principles (see Figure 1) by using Thurstone’s paired comparison paradigm. To date, design characteristics of relatively common interface elements have not been examined for their appropriateness in indicating the system’s underlying functionality to the user.
Figure 1. Psychological grouping principles examined in this study.
Twenty participants (13 female, 7 male) from Wichita State University and the surrounding community were randomly assigned to either the mutually exclusive or nonmutually exclusive condition. Thirty percent of the participants were in the 18-23 year old age range, half were in the 24-29 year old age range, and the remainder were in the 30-59 year old range. Most of the participants had more than five years of computer and internet experience.
Fourteen stimuli were created, which incorporated color, dimensionality, and grouping cues from existing GUIs (see Figure 2). For all stimuli, buttons which were in a "selected" or "on" state were shown as cyan, a typical color used for indicating a "selected" status in many interface designs. For half of the stimuli, this was the only cue to the button’s state. The other half of the stimuli had an additional cue: the button was both cyan and depressed. In addition to the two button conditions, there were two levels of four grouping styles. One was a 1-point white border around the buttons, creating a common region. The second and third styles used uniform connectedness to join the buttons into one cohesive unit either by using 1-point white line segments as connectors or by using a 75% black rectangle as a ground or "highlight" for a set of buttons. A fourth style used only proximity to group the buttons. Two levels of each grouping style included either two out of three or three out of three buttons within a group.
All stimuli were created using Adobe Photoshop and Adobe Illustrator CS3. Buttons were 50 pixels square, and finished stimulus files were exported as JPEGs at 96 ppi resolution. The stimulus files were coded into random pairings in MATLAB R2008a. Data were gathered on an Intel Core 2 CPU PC with a 17" monitor running at a resolution of 1280 x 1024 pixels and a refresh rate of 75 Hz.
[Stimulus images omitted. Column headings: Grouping Style, No Depressed Button, One Depressed Button. Row groups: Groupings of two, Groupings of three.]
Figure 2. The fourteen button set stimuli used in this experiment.
Note: Stimuli shown here are 80% actual size.
Participants were asked to complete a short demographic background questionnaire. They were then walked through a brief tutorial that explained the difference between mutually exclusive and nonmutually exclusive (multiple selection) button sets. After the tutorial, participants completed a short practice exercise, which consisted of eight paired comparison trials and confidence level ratings on a 6-point Likert scale (1 being "not confident" and 6 being "very confident").
All participants saw the same fourteen stimuli paired in all ninety-one unique combinations, presented in random order. There were two blocks of trials to ensure that the stimuli were presented equally on both sides of the monitor. After each stimulus presentation, the participant chose which stimulus was more appropriate for the given scenario. After each selection, a new screen prompted the participant for a confidence rating. The MATLAB program recorded reaction times for both the paired comparison decision and the confidence rating. It also recorded the stimuli presented in each pairing, the side of the screen on which each appeared, and which one the participant chose.
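The trial structure can be illustrated with a short Python sketch (the study itself used MATLAB). Fourteen stimuli yield 14 x 13 / 2 = 91 unordered pairs. How sides were assigned is not described here, so the sketch assumes that the second block mirrors the left and right positions of the first; the function name build_trial_blocks and its parameters are hypothetical.

```python
import itertools
import random

def build_trial_blocks(n_stimuli=14, seed=None):
    """Return two blocks of (left, right) stimulus pairs.

    Assumption: block 2 shows every pair of block 1 with the sides swapped,
    so each stimulus appears equally often on the left and on the right.
    """
    rng = random.Random(seed)
    # Every unordered pair of distinct stimuli, numbered 1..n_stimuli.
    pairs = list(itertools.combinations(range(1, n_stimuli + 1), 2))
    # Block 1: choose the left/right orientation of each pair at random.
    block1 = [pair if rng.random() < 0.5 else pair[::-1] for pair in pairs]
    # Block 2: the same pairs with sides mirrored.
    block2 = [(right, left) for left, right in block1]
    # Present each block in an independent random order.
    rng.shuffle(block1)
    rng.shuffle(block2)
    return block1, block2
```

Under this assumption each block contains 91 trials, and across the two blocks every stimulus appears 13 times on each side of the screen.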
After the participant completed both trial blocks, they were shown all fourteen of the stimuli at once. They were asked to complete a short, online questionnaire that asked about the criteria, or rules, they created for accepting or rejecting the stimuli. The facilitator made notes of these responses and, if further clarification was needed, the participants were asked to elaborate.
Thurstone’s Law of Comparative Judgment (Thurstone, 1927) was used to analyze the paired comparison data and derive scale scores of the overall ranks of the subjective stimulus preferences. Data were screened for internal and inter-rater consistency. A chi-square test of the significance of these coefficients yielded χ²(128, N = 10) = 153, p = .056 for the mutually exclusive scenario and χ²(128, N = 10) = 467.72, p < .001 for the nonmutually exclusive scenario. These tests indicated a statistically significant level of agreement among the comparative judgments across judges for the nonmutually exclusive condition and a marginally significant level of agreement for the mutually exclusive scenario.
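The agreement statistic is not named here, but the reported 128 degrees of freedom are what Kendall's coefficient of agreement gives for 10 judges and 14 stimuli (91 x 10 x 9 / 64 ≈ 127.97). The following is a minimal sketch under that assumption, and under the further assumption that each judge contributes one judgment per pair; the function name kendall_agreement is hypothetical.

```python
from math import comb

def kendall_agreement(wins, m):
    """Kendall's coefficient of agreement u with its chi-square approximation.

    wins[i][j] is the number of the m judges (m > 2) who preferred
    stimulus i over stimulus j; wins[i][j] + wins[j][i] == m for i != j.
    """
    n = len(wins)
    # Number of agreeing pairs of judges, summed over all ordered cells.
    sigma = sum(comb(wins[i][j], 2)
                for i in range(n) for j in range(n) if i != j)
    pairs = comb(n, 2)
    u = 2 * sigma / (comb(m, 2) * pairs) - 1
    chi2 = 4 / (m - 2) * (sigma - 0.5 * pairs * comb(m, 2) * (m - 3) / (m - 2))
    df = pairs * m * (m - 1) / (m - 2) ** 2
    return u, chi2, df
```

A value of u = 1 indicates that all judges made identical choices on every pair. The chi-square statistic is compared against a chi-square distribution with df degrees of freedom; computing the p-value is left out of the sketch because it needs a chi-square distribution function from a statistics library.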
The resulting scale scores for each scenario are displayed on the same scale (Figure 3). The button sets that lie toward the top were selected most often as appropriate, and those toward the bottom were selected less often when compared to other stimuli. Stimulus sets that lie close together along the continuum were perceived as highly similar in the appropriateness of their features; those that lie farther apart were perceived as more dissimilar. The scales start at zero, which is the value of the least frequently chosen stimulus in each of the two conditions. The upper limit is determined by the overall dispersion of choice frequencies for each stimulus; the largest value here is 2.011, the upper limit for the nonmutually exclusive condition. It is interesting to note the differences between the two conditions when the scales are placed side by side, as shown in Figure 3.
Figure 3. Comparison of scale scores from each condition.
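Scale scores of this kind can be derived as in the sketch below, which assumes the common Case V simplification of Thurstone's law (equal, uncorrelated discriminal dispersions); the exact case and any handling of unanimous choices used in this study are not stated. The function name thurstone_case_v and the clipping of extreme proportions are assumptions made for illustration.

```python
from statistics import NormalDist

def thurstone_case_v(wins, n_judgments):
    """Case V scale scores, shifted so the lowest-scoring stimulus is zero.

    wins[i][j] is the number of times stimulus i was chosen over
    stimulus j out of n_judgments comparisons of that pair.
    """
    n = len(wins)
    inv = NormalDist().inv_cdf
    # Assumed fix: clip proportions so 0 or 1 does not give an infinite z.
    lo, hi = 0.5 / n_judgments, 1 - 0.5 / n_judgments
    scores = []
    for i in range(n):
        z_row = []
        for j in range(n):
            if i == j:
                z_row.append(0.0)  # a stimulus is not compared with itself
            else:
                p = wins[i][j] / n_judgments
                z_row.append(inv(min(max(p, lo), hi)))
        # The scale value is the mean unit-normal deviate for stimulus i.
        scores.append(sum(z_row) / n)
    lowest = min(scores)
    return [s - lowest for s in scores]
```

Shifting by the lowest score reproduces the convention used in Figure 3, where the least frequently chosen stimulus in each condition sits at zero.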
In the nonmutually exclusive condition, there is some evidence that certain characteristics of the button sets were perceived as more appropriate than others. The most frequently chosen stimulus, a highlighted group of three including a depressed button, appears to have been perceived as the most appropriate style coupling. The least frequently chosen stimuli combined a two-button group with a line segment, which suggests that this style is not perceived as appropriate for the nonmutually exclusive condition (see Figure 4). Perhaps the most clear-cut finding concerns the number of buttons within a group: the top half of the scale contains three-button groupings, while the bottom half consists of two-button groupings. A cluster of styles in the middle of the continuum suggests that the three-button groupings using line segments and the proximal groupings were perceived as similar, or at least similar in appropriateness, for the nonmutually exclusive scenario.
[Stimulus images omitted. Column headings: Mutually Exclusive, Nonmutually Exclusive.]
Figure 4. Most and least appropriate stimuli for both conditions.
Note: Stimuli shown here are 80% actual size.
Aside from the number of buttons included in a group, participants consistently preferred the highlight sets, group box sets, and line segment sets in that order. The proximal groupings were both around the mid-point on appropriateness, suggesting that the ambiguous grouping style might be seen as neutral. It also appears that the style of button (whether or not it was depressed) was not a large factor in the choice, as this attribute is dispersed along the continuum with no distinct pattern.
Though agreement in the mutually exclusive condition did not reach statistical significance, participants tended to choose groupings of two buttons with some form of "bounding box" around the grouped buttons, such as the highlighted or group box style. Proximal groupings and groupings using a connecting line were not seen as appropriate. It is interesting that participants in both conditions preferred the same style.
This study is simply a first step toward a recommendation for touch screen mutually exclusive and nonmutually exclusive controls. However, there seem to be some clear-cut preferences for the chosen attributes in the nonmutually exclusive condition. These findings imply that the correct selection of style could have an effect on GUI design and perhaps on the efficiency and satisfaction of a user’s interaction with the system. If the design of nonmutually exclusive controls takes into account combinations of attributes that most effectively communicate the control’s function to the user and fit the user’s mental model, it could contribute to a more useful and usable interface.
One limitation of this study is that it investigated only participants’ initial impressions. If they were allowed to interact with these different button styles, their preferences might change. In fact, preferences may be more heavily weighted by the dynamic nature of the design rather than the physical attributes of the design.
Given the results, a follow-up analysis should be completed to see which attributes were in fact predictive of the stimulus choice. Further research should investigate more button and grouping styles to see whether there are any other attributes that may have a significant contribution to appropriateness. Importantly, the concept of style appropriateness should be investigated in relation to usefulness or intuitiveness. Thus, these styles should be tested in an actual dynamic system to determine the influence of context on user preference and performance, along with measures of satisfaction and accuracy.
Apple. (1992). Macintosh human interface guidelines. Reading, MA: Addison-Wesley.
Apple. (2008). Apple human interface guidelines: User experience. Published online: Apple Inc.
Fowler, S. L., and Stanwick, V. R. (1995). The GUI style guide. Boston, MA: AP Professional.
Galitz, W. O. (1997). The essential guide to user interface design: An introduction to GUI design principles and techniques. New York: John Wiley & Sons, Inc.
Gaver, W. W. (1991). Technology affordances. Proceedings of the CHI Conference on Human Factors in Computing Systems (pp. 79-84). New York: ACM Press.
Marcus, A. (1995). A comparison of graphical user interfaces. In R. M. Baecker, J. Grudin, W. A. S. Buxton, and S. Greenberg (Eds.), Readings in human-computer interaction: Toward the year 2000 (pp. 457-468). San Francisco: Morgan Kaufmann Publishers, Inc.
Microsoft. (1995). The Windows interface guidelines for software design. Redmond, WA:
Palmer, S. E., Brooks, J. L., and Nelson, R. (2003). When does grouping happen? Acta Psychologica, 114, 311-330.
Palmer, S., and Rock, I. (1994). Rethinking perceptual organization: The role of uniform connectedness. Psychonomic Bulletin and Review, 1(1), 29-55.
Thurstone, L. L. (1927). Law of comparative judgment. Psychological Review, 273-286.
Tullis, T. S. (1997). Screen design. In M. Helander, T. K. Landauer, and P. Prabhu (Eds.), Handbook of human computer interaction (2nd ed., pp. 503-531). New York: Elsevier.