Automating Usability Guidelines

Author: Brook Novak, University of Waikato, 2008
The HTML content below is a direct copy and paste from a LaTeX-to-HTML converter and contains many artifacts, such as missing references. Sorry for the inconvenience. Download full version: PDF

Abstract:

Validating whether or not a user interface adheres to a set of user interface guidelines can be exhausting: an interface and/or set of guidelines might be too large, or the guidelines may be written in a format that some designers find difficult to follow. Manually validating an interface's usability with reference to usability guidelines also leaves room for incorrect validation. For example, the interface designer can overlook an invalid element, or misinterpret a guideline. This article discusses the development of a tool called GUAT, which automates the guideline validation process for interfaces written in Glade. GUAT is put to the test against a popular suite of open source tools for MySQL database related tasks. Issues with codifying interface guidelines are brought to light during the development of GUAT and while evaluating the results produced from the MySQL tools. Such issues generalize to automating interface guidelines for any interface that can be parsed.
  • Introduction
    • GUAT
    • Usability Standards
  • Implementing Usability Standards in Code
    • GUAT Framework Overview
    • Implementation Iterations for Validation Tasks
    • Language-based Guidelines
    • Font-based Guidelines
    • Spatial-based Guidelines
    • Guidelines over all of an Application
    • Reporting Results
    • Types of Results
  • Unimplemented Guidelines
    • Guideline Classifications
    • Gnome HIG 2.0 Coverage
    • Section 508 Coverage
  • A real-world test case: MySQL Tools
    • Overview
    • MySQL Administrator
    • MySQL Query Browser
    • MySQL Workbench
    • MySQL Migration Toolkit
    • Results for interfaces shared by multiple tools
  • Future Work
  • Summary of unimplemented guidelines for Gnome HIG 2.0
  • Summary of standards not implemented for Section 508
  • Bibliography

Introduction

Usability guidelines for software interfaces can help improve many aspects of interface design including learnability, consistency and intuitiveness [4]. This clearly can bring benefits to both developers and end-users. But few interface designers know that such guidelines exist [1]. Even if designers are aware of guidelines, it can be overwhelming for them to decide which set of guidelines to choose. Guidelines can be lengthy or unclear, which can make it difficult for developers to verify that their interfaces adhere to all of them. Another concern with manually verifying guidelines is that it leaves room for human error: designers might have incorrect interpretations of what some of the guidelines specify [3], or if guidelines are not carefully read and understood, subtle but crucial elements could be overlooked. This paper reports on the development of a tool called GUAT (GNOME Usability Analysis Tool), which automates the process of validating user interfaces against a subset of the GNOME Human Interface Guidelines 2.0 (GNOME HIG 2.0) and Section 508 standards.

Codifying a full set of interface guidelines for automated checks is a complex problem: some guidelines rely on human judgement (for example, determining whether a label concisely describes its associated control). Guidelines can also be partially implemented, where clauses are implemented if they do not depend on clauses that cannot be automated. Even a bug-free implementation can sometimes yield incorrect results, because the validity of a guideline can often only be inferred from a static interface specification. Very few guideline validations can be deduced with certainty.

An estimated 40% of GNOME HIG 2.0 was implemented, and only a single standard in Section 508 could be codified.

A test-driven development process was employed for implementing the selected guidelines to ensure that guidelines could be failed and passed correctly. In chapter [*], GUAT is used against a suite of open source tools developed by MySQL for database-related tasks on MySQL databases. Evaluating the results exposes limitations of GUAT that are of concern to automated guideline validation tools in general.

Usability Standards

There are many different published interface guidelines. They generally fall into two classes: technology-centric and general [2]. Technology-centric guidelines are usually platform specific, for example guidelines that are based on conventions used for all interfaces in a specific operating system or virtual machine.

Two different sets of guidelines were chosen for this investigation: a technology-centric set of guidelines called GNOME HIG 2.0 and a general set of guidelines called Section 508.

GNOME HIG 2.0 was a logical choice because it is specific to the GNOME platform, which is Glade's target platform as outlined in section [*]. The GNOME HIG 2.0 guidelines cover a range of general usability principles and accessibility related principles. They contain conventional guidelines that are specific to the GNOME desktop environment.

Section 508 is a law that requires Federal agencies' electronic and information technology to be accessible to people who have disabilities. Although Section 508 is enforced in the U.S., it has been made available to the public through a website to encourage accessibility in IT.

Implementing Usability Standards in Code

This chapter outlines the development process for “codifying” the usability standards.

GUAT Framework Overview

Figure: High level class diagram of extended version of GUAT

In a nutshell, GUAT reads Glade files and outputs results based on their contents. Figure [*] shows a high-level diagram of the actual classes and their relationships, as well as other key entities. The GUAT framework takes one or more Glade files as its input. These files are loaded as GladeFile instances. The Validator has a set of ValidationTasks, which perform a range of usability checks on the GladeFiles. The Validator performs every ValidationTask on each GladeFile. When a ValidationTask has finished its assessment it returns a set of Results. A Result is either a pass, a warning or an error. Each Result contains information about why the check failed (if applicable), as well as any documented references (represented as DocReferences) that give the reason why the particular content failed. Line numbers and the actual XML content were added to the Results for verbose feedback. Several observers (such as the TextResultWriter) listen for results from the Validator and output them in a specific format.
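
The flow described above can be sketched in a few lines of Python. This is only an illustration of the Validator, ValidationTask, Result and observer relationship, not GUAT's actual code: the field names, the toy HasContentTask check and the callback-style observer are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DocReference:
    document: str  # e.g. "GNOME HIG 2.0"
    section: str   # e.g. "8.2.3"


@dataclass
class Result:
    severity: str  # "pass", "warning" or "error"
    message: str
    line: int = 0  # line number in the Glade XML (0 = not applicable)
    references: List[DocReference] = field(default_factory=list)


@dataclass
class GladeFile:
    path: str
    xml_text: str


class ValidationTask:
    """One usability check, run against one Glade file at a time."""

    def validate(self, glade_file: GladeFile) -> List[Result]:
        raise NotImplementedError


class HasContentTask(ValidationTask):
    """Toy check (hypothetical): the file must not be empty."""

    def validate(self, glade_file):
        if glade_file.xml_text.strip():
            return [Result("pass", "file has content")]
        return [Result("error", "file is empty", line=1)]


class Validator:
    """Runs every task on every file and notifies observers of each result."""

    def __init__(self, tasks):
        self.tasks = list(tasks)
        self.observers = []

    def add_observer(self, observer):
        self.observers.append(observer)

    def run(self, glade_files):
        for glade_file in glade_files:
            for task in self.tasks:
                for result in task.validate(glade_file):
                    for observer in self.observers:
                        observer(glade_file, result)


collected = []
validator = Validator([HasContentTask()])
validator.add_observer(lambda f, r: collected.append((f.path, r.severity)))
validator.run([GladeFile("main.glade", "<glade-interface/>"),
               GladeFile("empty.glade", "")])
print(collected)
```

Keeping the writers as observers means a new output format can be added without touching the tasks or the Validator.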

Document references point to specific sections of a published usability or accessibility guideline or standard. It is important that these are provided with error messages so that the developer can become educated in usability and accessibility, as these references usually give the reasoning behind the guidelines. To supply sufficient guideline information for each DocReference, not only the document name and section are provided but also a description of the actual guideline and a URL that refers to the original guideline document.

To help make the code cleaner and easier to follow, an embedded database was written so that all document references for GNOME HIG 2.0 and Section 508 were centralised. Figure [*] shows how the database is an embedded XML file and is queried via a utility class called DocReferenceRepository. Every time a document reference needs to be created, the DocReferenceRepository supplies the appropriate document reference given a guideline name (for example “Section 508”) and a section number. This was helpful for keeping document reference data consistent (since specific guidelines can be requested more than once through the ValidationTask implementations) and also for keeping track of which references were implemented.

An abstract class called AbstractValidationTask was developed cumulatively while implementing the standards to help with common parsing tasks, for example extracting access keys from a label.
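
As a rough illustration of such a repository, the sketch below loads references from an embedded XML string and records which ones have been requested. The XML layout, the attribute names, the example.org URLs and the shortened descriptions are placeholders invented for this example; GUAT's real database format is not shown in this article.

```python
import xml.etree.ElementTree as ET

# Hypothetical embedded database: element names, attribute names, URLs and
# the paraphrased descriptions are all assumptions made for illustration.
REFERENCE_XML = """
<references>
  <document name="GNOME HIG 2.0" url="https://example.org/gnome-hig-2.0">
    <guideline section="8.2.3">Be consistent with spacing and alignment.</guideline>
    <guideline section="6.19">Avoid nesting frames.</guideline>
  </document>
  <document name="Section 508" url="https://example.org/section-508">
    <guideline section="1194.21">Images need a textual alternative.</guideline>
  </document>
</references>
"""


class DocReferenceRepository:
    """Central lookup of document references by (document name, section)."""

    def __init__(self, xml_text=REFERENCE_XML):
        self._index = {}
        self._requested = set()
        for document in ET.fromstring(xml_text).findall("document"):
            for guideline in document.findall("guideline"):
                key = (document.get("name"), guideline.get("section"))
                self._index[key] = {
                    "document": key[0],
                    "section": key[1],
                    "description": guideline.text.strip(),
                    "url": document.get("url"),
                }

    def get(self, document, section):
        key = (document, section)
        self._requested.add(key)  # remember that a task used this reference
        return self._index[key]

    def never_requested(self):
        """References present in the database that no task has asked for."""
        return sorted(set(self._index) - self._requested)


repository = DocReferenceRepository()
print(repository.get("Section 508", "1194.21")["url"])
print(repository.never_requested())
```

Tracking requests in one place is what makes it cheap to ask which references have no validation task behind them yet.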

Implementation Iterations for Validation Tasks

A test-driven development process was employed to ensure that the validation tasks were failing and passing Glade interfaces correctly. Testing the validation tasks was of high importance because it gives stronger grounds for the results and conclusions conveyed in this study. Unit testing was used for checking utility code and validation tasks. A fail model was created in Glade Interface Designer, which breached each of the codified guidelines one or more times. Unit tests were written which ran the validation tasks against the fail model and checked that the exact number of expected failures was raised for each guideline. Coding the guidelines was performed in iterations, where each iteration went through the following steps:

  1. Select a section and fully read it – noting codable guidelines.
  2. Add each guideline into the document reference XML database.
  3. Using the Glade Interface Designer breach the selected constraints for the fail model at least once.
  4. Write unit tests to assert expected number of failures in fail model for selected guidelines.
  5. Discover the Glade XML elements needed to be checked in order to validate the selected guidelines.
  6. Write code to validate the selected guidelines.
  7. Run unit tests: enter test->debug cycles until all tests pass.
  8. Back to step 1.

The final iteration checked that all guidelines are passable. A pass model was created in Glade Interface Designer once all guidelines were implemented. The pass model started as a copy of the fail model and was continuously amended until all errors and warnings were overcome. There were some instances where tasks were found to be impossible to satisfy because of bugs. The code was amended accordingly so that eventually all coded guidelines passed in the pass model and failed in the fail model.
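
The kind of assertion used in these tests can be sketched as follows. The sketch compares the number of failures raised per guideline with the number deliberately planted in a model; the result representation (severity and section pairs) and the section numbers in the sample data are assumptions for illustration only.

```python
from collections import Counter


def count_failures(results):
    """Count errors and warnings per guideline section.

    `results` is an iterable of (severity, guideline_section) pairs.
    """
    return Counter(section for severity, section in results
                   if severity in ("error", "warning"))


def mismatches(results, expected):
    """Return {section: (expected, actual)} for every count that differs."""
    actual = count_failures(results)
    sections = set(actual) | set(expected)
    return {section: (expected.get(section, 0), actual.get(section, 0))
            for section in sections
            if expected.get(section, 0) != actual.get(section, 0)}


# Fail model: each codified guideline is breached a known number of times.
fail_model_results = [("error", "8.2.3"), ("error", "8.2.3"),
                      ("warning", "6.19"), ("pass", "6.7")]
print(mismatches(fail_model_results, {"8.2.3": 2, "6.19": 1}))

# Pass model: no failures are expected at all.
pass_model_results = [("pass", "8.2.3"), ("pass", "6.19")]
print(mismatches(pass_model_results, {}))
```

An empty dictionary means the model behaved exactly as planned; any entry points directly at the guideline whose implementation, or whose model, needs attention.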

A unit test was written to check that all document references in the document reference database were considered in the tests so that no guideline was missed. This helped keep track of the guidelines that were implemented.

Glade’s XML format is not publicly documented anywhere. This made it more difficult to codify standards because relevant XML elements had to be discovered by inspecting existing Glade files. The problem became more severe because Glade XML omits all elements with default values. The Glade Interface Designer helped considerably with this because it exposes all the possible Glade options and widgets. By enumerating all the possible values and inspecting the resulting Glade XML, the XML elements, attributes and values were discovered, as well as the defaults. This highlights the importance of providing documentation for file formats, especially for open source software where developers may want to write software that works with the file contents. It is possible that some Glade XML elements were missed in step 5 of the task implementation iterations (see above). Thus, had Glade been well documented, developing an automated guideline checker for it would have been not only easier but also more robust.
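
Part of this discovery work can be mechanised. The sketch below tallies every property name and value seen per widget class across a set of Glade documents. It assumes a libglade-style layout (`widget` elements with a `class` attribute and `property` children); that layout and the two sample documents are assumptions for this example, and properties left at their defaults still never appear.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def collect_property_values(glade_documents):
    """Map (widget class, property name) to the set of values observed."""
    seen = defaultdict(set)
    for xml_text in glade_documents:
        for widget in ET.fromstring(xml_text).iter("widget"):
            for prop in widget.findall("property"):
                value = (prop.text or "").strip()
                seen[(widget.get("class"), prop.get("name"))].add(value)
    return seen


SAMPLE_A = """<glade-interface>
  <widget class="GtkWindow" id="window1">
    <property name="title">Demo</property>
    <child>
      <widget class="GtkVBox" id="vbox1">
        <property name="spacing">6</property>
      </widget>
    </child>
  </widget>
</glade-interface>"""

SAMPLE_B = """<glade-interface>
  <widget class="GtkVBox" id="vbox2">
    <property name="spacing">12</property>
    <property name="homogeneous">True</property>
  </widget>
</glade-interface>"""

observed = collect_property_values([SAMPLE_A, SAMPLE_B])
for key in sorted(observed):
    print(key, sorted(observed[key]))
```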

Language-based Guidelines

There were many guidelines throughout the Gnome HIG 2.0 reference which required text to follow sentence or header capitalization conventions. For example, a guideline in section 11.1.3 requires all menu items and titles to use header capitalization [reference needed]. Section 8.3.2 outlines the two capitalization styles as follows:

Header capitalization
Capitalize all words in the element, with the following exceptions:

  • Articles: a, an, the.
  • Conjunctions: and, but, for, not, so, yet …
  • Prepositions of three or fewer letters: at, for, by, in, to …
Sentence capitalization
Capitalize the first letter of the first word, and any other words normally capitalized in sentences, such as application names.

Figure: Linguistic services in GUAT

Checking for capitals after each white-space or word-breaking punctuation would not be sufficient for validating the capitalization guidelines because the types of words must be considered. So, to validate whether or not a widget adheres to these types of guidelines, a language utility was developed. Figure [*] shows a class diagram of the linguistic services added to GUAT. The GnomeLanguageManager is a singleton which provides the interface for all linguistic services outside the package. All validation tasks query for sentence and header capitalization, as well as spell checks, via the GnomeLanguageManager. Before such services can be invoked a language must be set. At this stage only English is implemented (GnomeEnglishLanguageUtil in Figure [*]). Other languages can be supported by implementing the GnomeLanguageUtil interface, without having to alter the validation task implementations. Not all languages share the same capitalization rules as English (as noted in section 8.3.2); for example, Swedish has no concept of header capitalization.

The GnomeEnglishLanguageUtil supports multiple dictionaries: for example US, Canadian or UK dictionaries. These are selected via the GnomeLanguageManager. In order to classify word types, three dictionaries must be given: the dictionary of all words, of all prepositions and of all conjunctions. This gives the means to validate header capitalization. Only one dictionary was gathered for this tool, but it would be very simple to add support for others.
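
A minimal sketch of a header capitalization check is given below. It is not the GnomeEnglishLanguageUtil implementation: the word sets contain only the examples quoted from section 8.3.2 above rather than full dictionaries, and two further rules are assumptions made here, namely that the first word must always be capitalized and that underscores (access key markers) are ignored.

```python
# Word lists taken from the examples quoted above; real dictionaries
# would be far larger (the original lists end with an ellipsis).
ARTICLES = {"a", "an", "the"}
CONJUNCTIONS = {"and", "but", "for", "not", "so", "yet"}
SHORT_PREPOSITIONS = {"at", "for", "by", "in", "to"}
EXCEPTIONS = ARTICLES | CONJUNCTIONS | SHORT_PREPOSITIONS


def follows_header_capitalization(text):
    """Return True if every non-exception word starts with a capital."""
    for index, word in enumerate(text.split()):
        # Assumption: "_" marks an access key and is not part of the word.
        core = word.replace("_", "").strip(".,:;!?()\"'")
        if not core or not core[0].isalpha():
            continue  # numbers and bare punctuation carry no capitalization
        if index > 0 and core.lower() in EXCEPTIONS:
            continue  # exception words need not be capitalized
        if not core[0].isupper():
            return False
    return True


for label in ("Update Now", "_Save", "Go to Line", "Print the document"):
    print(label, follows_header_capitalization(label))
```

The check is only as good as its word lists, which is why the classification of words is kept separate from the validation tasks.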

Font-based Guidelines

There were guidelines specifically concerning the use of fonts. For example, a guideline in section 8.4 states: “Only use the fonts that the user has specified in their theme, and in sizes relative to the default size specified in their theme. This will ensure maximum legibility and accessibility for all users.” Glade uses Pango markup for specifying styles, fonts and sizes for all text elements (where applicable). Pango markup is an HTML-like markup language which uses the same escape sequences as HTML. A Pango parsing utility was written to determine the fonts, sizes, styles and escape sequences for Pango-enabled text elements. This was necessary not only so that font-related guidelines could be validated, but also for every guideline involving text that is permitted to contain Pango markup: the markup had to be stripped and the escape sequences translated in order to extract the content relevant to language-based guidelines. For example, text with Pango markup would fail capitalization and sentence rules if not properly stripped and translated.
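
The sketch below shows one way such a utility could work, using Python's standard XML parser rather than GUAT's own parser. It returns the plain text (with tags stripped and entities decoded) together with the attributes of every `span` tag. Treating a `font_desc`, `font_family` or `face` attribute, or a purely numeric `size`, as a hard-coded font is an assumed reading of the section 8.4 guideline.

```python
import xml.etree.ElementTree as ET

# Span attributes that name a specific font (assumed to breach section 8.4).
FONT_ATTRIBUTES = ("font_desc", "font_family", "face")


def parse_pango_markup(markup):
    """Return (plain_text, list of attribute dicts for each <span>)."""
    # Wrap in a root element so that fragments with several tags parse.
    root = ET.fromstring("<markup>" + markup + "</markup>")
    plain_text = "".join(root.itertext())
    span_attributes = [dict(span.attrib) for span in root.iter("span")]
    return plain_text, span_attributes


def hard_codes_font(span_attributes):
    """True if any span fixes a font or an absolute (numeric) size."""
    for attributes in span_attributes:
        if any(name in attributes for name in FONT_ATTRIBUTES):
            return True
        if attributes.get("size", "").isdigit():
            return True
    return False


text, spans = parse_pango_markup(
    '<b>Save</b> &amp; <span size="small">close</span>')
print(repr(text), spans, hard_codes_font(spans))
```

The plain text returned here is what the language-based checks should receive, so that tags and entities do not count as words.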

Spatial-based Guidelines

There were many guidelines concerned with spatial aspects of widgets. For example, many guidelines state that a specific widget must have a label either twelve pixels to the left (centered vertically with the widget) or six pixels above (left aligned with the widget). Glade provides a range of containers which lay out widget positions and usually the sizes as well. A widget's spatial qualities are abstractly defined as “packing”, which is interpreted by the widget's parent container to determine its position and size. Although all of the types of packing and containers were considered so that spatial-based guidelines could be verified, a degree of uncertainty was introduced. A widget's width and height can be requested naturally rather than given as fixed values, and these natural dimension requests are the defaults. A guideline in section 10.1.1 of GNOME HIG 2.0 states that the sizes of targetable widgets should not be hard-coded, so that the interface remains usable for people whose themes set larger-sized widgets. Hence it is not common for widgets to have fixed dimensions. Implementations of guidelines that depend on widget dimensions therefore cannot be guaranteed to be correct, since natural widths and heights usually had to be inferred.
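
The simplest spatial checks need no inference at all. As an illustration, the sketch below flags explicit spacing values that are not a multiple of six pixels, following the "increments of 6 pixels" rule of section 8.2.3. The libglade-style XML layout and the property names checked are assumptions; values omitted from the file (the defaults) are not seen by this check, which is exactly where the uncertainty described above begins.

```python
import xml.etree.ElementTree as ET

# Property names assumed to hold pixel spacing in the Glade XML.
SPACING_PROPERTIES = {"spacing", "border_width", "row_spacing", "column_spacing"}


def spacing_violations(xml_text, unit=6):
    """Return (widget id, property, value) for spacing not a multiple of `unit`."""
    violations = []
    for widget in ET.fromstring(xml_text).iter("widget"):
        for prop in widget.findall("property"):
            if prop.get("name") not in SPACING_PROPERTIES:
                continue
            value = int((prop.text or "0").strip())
            if value % unit != 0:
                violations.append((widget.get("id"), prop.get("name"), value))
    return violations


SAMPLE = """<glade-interface>
  <widget class="GtkDialog" id="dialog1">
    <property name="border_width">12</property>
    <child>
      <widget class="GtkVBox" id="vbox1">
        <property name="spacing">5</property>
      </widget>
    </child>
  </widget>
</glade-interface>"""

print(spacing_violations(SAMPLE))
```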

Guidelines over all of an Application

A guideline in section 8.2.3 of Gnome HIG 2.0 states: “Be consistent. Use the same spacing, alignment, and component sizes in all dialogs appearing in your application…”. A Glade file can contain multiple windows, but an application can also specify its entire interface over many Glade files. For example, many (or possibly all) of the standard GNOME applications use one Glade file per window. GUAT was therefore extended to support validation of guidelines over all Glade files provided as input. Such tasks are referred to as global tasks.

After all Glade files are processed, PostValidationTasks are run. As indicated in Figure [*], a PostValidationTask is a ValidationTask. Initially they are treated like an ordinary ValidationTask by the Validator: they inspect Glade files one by one. The Validator then invokes a post-validation method, which uses all of the information extracted during the first (core) validation phase. This requires that all files passed into a GUAT session come from the same application (or from several applications in the same software suite, for keeping layouts consistent).
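
The two-phase idea can be sketched as follows. The task below is a hypothetical example of a global task, not one taken from GUAT: during the core phase it records the border width of every dialog, and in the post-validation phase it reports dialogs that differ from the most common value. The XML layout, the `border_width` property name and the assumed default of 0 are illustrative assumptions.

```python
import xml.etree.ElementTree as ET
from collections import Counter


class DialogBorderConsistencyTask:
    """Hypothetical global task: all dialogs should share one border width."""

    def __init__(self):
        self.border_widths = {}  # (file path, dialog id) -> border width

    def validate(self, path, xml_text):
        """Core phase: only gather data, one Glade file at a time."""
        for widget in ET.fromstring(xml_text).iter("widget"):
            if widget.get("class") != "GtkDialog":
                continue
            width = 0  # assumed default when the property is omitted
            for prop in widget.findall("property"):
                if prop.get("name") == "border_width":
                    width = int(prop.text)
            self.border_widths[(path, widget.get("id"))] = width
        return []

    def post_validate(self):
        """Post phase: compare the data gathered from all files."""
        if not self.border_widths:
            return []
        usual, _ = Counter(self.border_widths.values()).most_common(1)[0]
        return sorted(key for key, width in self.border_widths.items()
                      if width != usual)


def dialog(dialog_id, width):
    return ('<glade-interface><widget class="GtkDialog" id="%s">'
            '<property name="border_width">%d</property>'
            '</widget></glade-interface>' % (dialog_id, width))


task = DialogBorderConsistencyTask()
task.validate("open.glade", dialog("open_dialog", 12))
task.validate("save.glade", dialog("save_dialog", 12))
task.validate("about.glade", dialog("about_dialog", 5))
print(task.post_validate())
```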

Reporting Results

As shown in Figure [*], there are five formats added to GUAT for reporting validation results: console output, HTML output, text file output (results and statistics) and Comma Separated Value (CSV) output. The CSV format makes it possible to perform statistical analysis on the results.
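
A CSV writer of this kind is short to write with Python's standard `csv` module. The column names and the sample row below are assumptions for illustration; the article does not list the columns GUAT actually emits.

```python
import csv
import io

# Hypothetical column layout for one result per row.
FIELDS = ["file", "line", "severity", "document", "section", "message"]


def write_results_csv(results, stream):
    """Write result dictionaries to `stream`, one row per result."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(results)


sample_results = [{
    "file": "main.glade",
    "line": 42,
    "severity": "error",
    "document": "GNOME HIG 2.0",
    "section": "8.2.3",
    "message": "Spacing of 5 px, expected a multiple of 6",
}]

buffer = io.StringIO()
write_results_csv(sample_results, buffer)
print(buffer.getvalue())
```

Because the writer quotes fields that contain commas, messages can be free text and the file still loads cleanly into a spreadsheet or statistics package.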
Figure: Verbose results in HTML format
Figure: Warning message noting that results are only based from a subset of the selected guidelines

Figure [*] shows some sample results written in HTML format. It was important to make clear to users that the implemented guidelines were only a subset of the full guidelines. This was achieved by adding a note in the footer of the HTML document, as shown in Figure [*]. The layout was refined while amending the pass model as discussed in Section [*]. The XML content included in the results was useful for testing cycles, but may not be useful for target users as they would probably not directly edit the actual Glade XML.

Types of Results

There were two levels of failures: errors and warnings. Neither Section 508 nor GNOME HIG 2.0 specifies warning levels or severities for its guidelines or standards. Therefore almost every guideline was treated as an error. However, guidelines that could only be validated with a low level of certainty were set as warnings. Only experience would show which guidelines should be set as warnings for Section 508 and GNOME HIG 2.0.

Unimplemented Guidelines

Not all guidelines or standards could be validated purely by inspecting the contents of Glade files. This chapter establishes classifications for guidelines in terms of automation, and summarizes the guidelines that were not implemented.

Guideline Classifications

There were five different classifications realized while implementing HIG and Section 508 guidelines for Glade:

Human judgement
There were many guidelines that could not be automated because they required human judgement. Such guidelines are generally a matter of opinion, where it could be argued either way whether an interface passes or fails. Three different classes of human judgement were realized: visual, textual and content relationships:

Visual
Based on the visual content of an interface element, where even when supplied with visual data (such as pixels) a compelling result cannot be inferred. For example, a guideline that states that the design of an icon for a button must be suggestive of what clicking that button would do. Even if it were possible to use state of the art image classification techniques, such guidelines are a matter of opinion and therefore cannot be reduced to a pass or fail result.
Textual
Based on the meaning of textual content. For example, a guideline that states that the tooltip text for a button widget must clearly describe its function.
Content Relationships
Based on whether content is related by the shared purpose of the widget. For example, a guideline that states that views with a lot of widgets should be positioned in groups, where a group of widgets share a similar purpose.
Dynamic
Guidelines that can only be validated by assessing the behaviour of the interface. For example, whether the click of a button responds with visual feedback, and responds within a specific time threshold.
Impossible to violate
Guidelines which cannot possibly be violated with Glade because both Gtk and Glade enforce them. For example, guidelines stating that windows should have close boxes as the right-most button at the top of a window. Such guidelines are conventional, that is, specific to a platform or language, or general conventions that have been used by many software interfaces in the past. This classification can be abstracted for any format of interface markup; for example, Microsoft User Interface guidelines might be impossible to violate if using .NET for the interface specification.
Automatable but out of scope
These are guidelines where it is conceivable to automate them (that is, they do not require human judgement or behaviour assessment), but which depend on information not supplied within Glade files. For example, a guideline stating that the application should provide a desktop item (shortcut) within the applications menu cannot be validated by looking at the Glade files; however, it could be validated given the (automated) installer, by checking the applications menu on the Gnome desktop after installation. This classification can be abstracted for any format of interface markup.
Statically Automatable
Guidelines that can be codified given only the interface markup, not the source or binaries. Such guidelines can be implemented and are implemented in GUAT.

Figure: Venn diagram for guideline classifications

Figure [*] illustrates the crossovers with the different classifications. It highlights that guidelines can fall into more than one classification, for example a guideline could be classified as being both dynamic and requiring human judgement. The diagram also shows which classifications are mutually exclusive, for example a guideline cannot be classified as both dynamic and statically automatable; as their very definitions are converse. Also note how Figure [*] indicates that guidelines for all classifications can be impossible to violate depending on the interface specification and guideline.

Gnome HIG 2.0 Coverage

A total of 143 of the Gnome HIG 2.0 guidelines were codified, many of which can be violated in multiple ways. For example, section X can be violated if there is no label to the left of or above a slider; and if there is such a label, the guideline can still be violated if the label does not have an access key, or if the label does not follow sentence capitalization rules.

An approximate method was used for quantifying how much of Gnome HIG 2.0 was codified. Gnome HIG 2.0 contains 13 major sections. The following sections were discounted as not specifying any guidelines:

Section 1: Usability Principles
Provides a background to the guidelines. Although some guidelines are mentioned within this section they are only used as concrete examples and are covered in later sections (that are already considered).
Section 12: Checklists
Essentially a guide for designing interfaces according to the Gnome HIG 2.0 reference.
Section 13: Credits
Contains credits for the authors, reviewers and contributors.

The total number of guidelines was estimated by counting all listed items under the “guidelines” headings in the guideline documents. Sections that did not have a guidelines heading were counted as having only one guideline. It is estimated that there are a total of 354 guidelines in the Gnome HIG 2.0, giving an estimated 40% of the Gnome HIG 2.0 being covered.

The method of counting guidelines is not accurate because some guidelines are listed in other forms, such as tables or paragraphs, and so were not counted in the total. Thus the estimated coverage for Gnome HIG 2.0 is an over-estimate of what was actually covered. On the other hand, there were many guidelines that could not be breached (as outlined below) because they are handled by Gtk. Such guidelines are implicitly verified as passes, so excluding them from the implemented count makes the estimate an under-estimate. Overall, the figure is still probably an over-estimate.

Appendix [*] summarizes the major areas of guidelines that could not be implemented.
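
For reference, the 40% figure follows directly from the two counts reported above (143 codified guidelines out of an estimated 354):

```latex
\text{coverage} \approx \frac{143}{354} \approx 0.404 \approx 40\%
```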

Section 508 Coverage

Only one standard in Section 508 was able to be codified in GUAT: standard G under section 1194.21, which states: “Sufficient information about a user interface element including the identity, operation and state of the element shall be available to assistive technology. When an image represents a program element, the information conveyed by the image must also be available in text.” This standard could be statically violated where icon elements did not provide a textual alternative (such as a tooltip or an alternative label). It was more difficult to make a connection between the Section 508 standards and the Glade markup than it was for Gnome HIG 2.0 because Section 508 is more abstract.

Section 508 is broken into four subparts. The subparts, and which of them were considered able to be validated in code, are outlined below:

Subpart A
General: Essentially outlines the scope and coverage of the standards. Does not specify standards that can be validated in any form.
Subpart B
Technical standards: Contains several sub sections of standards.
Subpart C
Functional performance criteria: Contains several sub sections of standards.
Subpart D
Information, documentation, and support: These are more service based standards rather than the actual interface itself – hence not material.

Subparts B and C contain standards which can conceivably be validated automatically. However, the majority of these are not relevant to application interfaces: within Subpart B, only subsection 1194.21 (Software applications and operating systems) was relevant. Subsection 1194.22, for example, was not relevant as it is specific to web applications. All but one of the standards outlined in subsection 1194.21 and Subpart C were unable to be codified, because only one of the standards was classified as being statically automatable. See Appendix [*] for a summary of the Section 508 standards that were not implemented.
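
A static check for the one codified standard could look roughly like the sketch below, which flags buttons that contain an image but offer no text anywhere in their subtree. This is an illustration rather than GUAT's implementation: the libglade-style XML layout, the widget class names treated as buttons, and the property names accepted as text alternatives (`label`, `tooltip`, `tooltip_text`) are all assumptions.

```python
import xml.etree.ElementTree as ET

# Assumed property names that supply a textual alternative.
TEXT_PROPERTIES = {"label", "tooltip", "tooltip_text"}
# Assumed widget classes that act as buttons.
BUTTON_CLASSES = {"GtkButton", "GtkToggleButton", "GtkToolButton"}


def has_text(widget):
    """True if the widget itself sets a non-empty text property."""
    return any(prop.get("name") in TEXT_PROPERTIES and (prop.text or "").strip()
               for prop in widget.findall("property"))


def image_only_buttons(xml_text):
    """Return ids of buttons that show an image but provide no text."""
    flagged = []
    for button in ET.fromstring(xml_text).iter("widget"):
        if button.get("class") not in BUTTON_CLASSES:
            continue
        subtree = list(button.iter("widget"))  # includes the button itself
        has_image = any(w.get("class") == "GtkImage" for w in subtree)
        if has_image and not any(has_text(w) for w in subtree):
            flagged.append(button.get("id"))
    return flagged


SAMPLE = """<glade-interface>
  <widget class="GtkHBox" id="hbox1">
    <child>
      <widget class="GtkButton" id="refresh_button">
        <child>
          <widget class="GtkImage" id="image1">
            <property name="stock">gtk-refresh</property>
          </widget>
        </child>
      </widget>
    </child>
    <child>
      <widget class="GtkButton" id="stop_button">
        <property name="tooltip">Stop the query</property>
        <child>
          <widget class="GtkImage" id="image2">
            <property name="stock">gtk-stop</property>
          </widget>
        </child>
      </widget>
    </child>
  </widget>
</glade-interface>"""

print(image_only_buttons(SAMPLE))
```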

A real-world test case: MySQL Tools

GUAT was implemented with a test-driven style of development; however, only two models were used for testing, and those models were somewhat biased in that they were developed specifically for the tests. This chapter evaluates GUAT by validating a popular suite of GUI tools released by MySQL for database development. Limitations of GUAT are discovered, which are of concern for automated tools of this type in general.

There are four open source tools available to the general public which use Glade in their Linux ports. The source code of each tool includes the Glade files for all of its interfaces. The four tools are MySQL Administrator, MySQL Query Browser, MySQL Workbench and MySQL Migration Toolkit.

Section [*] gives an overview of the results cumulatively across all interfaces. The following sections evaluate the results for each tool in turn.

Overview

A total of 2,025 errors and 27 warnings resulted from validating the 32 Glade files that specify all of the MySQL tools. These errors and warnings collectively made a total of 2,692 references to guidelines in the GNOME HIG 2.0 and Section 508 (note that a single result can have multiple document references). A total of 59 unique guidelines were violated.

Figure: Summary of sections violated by MySQL tools

Figure [*] plots all of the guidelines violated by (top-most) section name. Note that Section 508 refers to the one and only violatable standard outlined in section [*]; all other sections refer to the Gnome HIG 2.0 reference. As shown in Figure [*], the majority of violations are attributed to visual design guidelines. These failures are mostly spatial-based guidelines (under section 8.2: Window Layout), which are validated with an element of uncertainty as described in section [*]. The controls section is also responsible for a relatively significant number of violations. This is probably because the controls section contains most of the guidelines, as there are 19 sub-sections under controls (for example, section 6.14 covers all guidelines for Listbox controls).

Table: Top five sections that had most violations over all of the MySQL tools (all GNOME HIG 2.0)

| Section | Description | Violations |
| --- | --- | --- |
| 8.2.3.a | As a basic rule of thumb, leave space between user interface components in increments of 6 pixels, going up as the relationship between related elements becomes more distant. For example, between icon labels and associated graphics within an icon, 6 pixels are adequate. Between labels and associated components, leave 12 horizontal pixels. For vertical spacing between groups of components, 18 pixels is adequate. A general padding of 12 pixels is recommended between the contents of a dialog window and the window borders. | 454 |
| 11.1.1.d | Apply standard capitalization rules. See Section 8.3.2 – Capitalization for guidelines about capitalization of user interface labels. | 287 |
| 6.7.a | Label all buttons with imperative verbs, using header capitalization. For example, Save, Sort or Update Now. Provide an access key in the label that allows the user to directly activate the button from the keyboard. | 218 |
| 8.2.2.d | Assign access keys to all editable controls. Ensure that using the access key focuses its associated control. | 195 |
| 6.19.a | Before you add a frame with a visible border or separator to any window, consider carefully if you really need it. It is usually better to do without, if the groups can be separated by space alone. Do not use frames and separators to compensate for poor control layout or alignment. | 160 |

Table [*] contains the five guidelines that were violated the most cumulatively over all of the Glade files, which happen to all be GNOME HIG 2.0 guidelines. Section 8.2.3.a is a window layout guideline, for which the GUAT implementation has a higher level of certainty of producing genuine results than for other spatial-based guidelines. The overwhelming number of window layout errors might suggest that many false negatives are being produced by an incorrect implementation; however, these guidelines apply to every widget in a window and are therefore validated the most. Furthermore, this error can cascade: one adjustment to the erroneous user interface could fix multiple violations of these guidelines.

MySQL Administrator

The MySQL Administrator tool allows users to administer and monitor their MySQL environments. Figure [*] shows an annotated screen-shot of MySQL Administrator’s main window in the Glade Interface Designer. It highlights some of the violations detected by GUAT.

Figure: Some of the violations detected in MySQL Administrator’s main window

Figure: Screen-shot of MySQL Administrator’s main window at run-time (Microsoft Windows port)

The MySQL Administrator tool contains 15 windows within 14 Glade files, which excludes common interfaces shared between tools (see section [*]). Over all of the Glade interfaces, 751 errors and 19 warnings were raised, violating guidelines 996 times. The guideline that was violated the most was GNOME HIG 2.0 section 8.2.3.a, with 239 violations.

The window in Figure [*] has an error regarding the use of Frames within Frames. A guideline in GNOME HIG 2.0 Section 6.19 states that frames should not be nested, to avoid visual clutter. The purpose of a Frame in Glade is to group widgets together. However, the window in Figure [*] appears to use a Frame as a placeholder for other interface content to be loaded in at run-time (shown in Figure [*]). Note that the dynamically placed content in the Frame shown in Figure [*] is validated because it is common content defined in separate Glade files. It is arguable that the inner-frame error is not really a violation because it does not seem to be the designer's intent to use the Frames in the ways described under Section 6.19 in GNOME HIG 2.0. That is, the context of the Frame in the window in Figure [*] is different to the context of a Frame covered in the guideline.

MySQL Query Browser

The MySQL Query Browser is a visual tool for creating, executing, and optimizing SQL queries for MySQL Database Servers. Figure [*] shows an annotated screen-shot of MySQL Query Browser’s main window in the Glade Interface Designer. It highlights some of the violations detected by GUAT.

Figure: Some of the violations detected in MySQL Query Browser’s “work area” window

The MySQL Query Browser contains seven windows within four Glade files (excluding common interfaces shared with other tools). Over all of these Glade interfaces GUAT detected 292 errors and five warnings, violating guidelines 377 times. The guideline that was violated the most was GNOME HIG 2.0 section 6.7.a with 77 violations.

The toolbars in the window in Figure [*] are specified as horizontal box containers holding buttons with labels and/or icons, as well as separators for grouping and aesthetics. GUAT, however, does not interpret these buttons and labels as toolbars: only toolbar and toolbutton widgets are considered to be toolbar elements. Notice that in Figure [*] no guidelines are violated under the GNOME HIG 2.0 toolbar section (section 5), because MySQL chooses to specify all of its toolbars in this way. A toolbar is a UI concept that can be composed of other UI concepts (for example, strategically aligned buttons) in many ways. This is a limitation of GUAT: it does not understand the purpose or intent of a widget. The same issue generalizes to other concepts in other interface specification systems.

A keyboard navigation guideline in GNOME HIG 2.0 is violated because the menu item “Bookmark” in the “work area” window assigns a keyboard accelerator reserved for the common application menu item “Add Bookmark”. Clearly the designer was on the right track, and it is arguable that this is not really an error: there is no “Remove Bookmark” item present, so it might be obvious that invoking “Bookmark” will add a bookmark. A better error message would notify the designer that they should rename the label, instead of raising the error based on the keyboard navigation guideline.
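The reserved-accelerator check, together with the friendlier "rename the label" message suggested above, could be sketched roughly as follows. This is only an illustration under stated assumptions: the `RESERVED_ACCELERATORS` table is a one-entry stand-in (the key combination shown is assumed for the example), the markup is a hypothetical libglade-style fragment, and none of the names come from GUAT itself.

```python
import xml.etree.ElementTree as ET

# Illustrative one-entry table; the key combination is an assumption and
# the real list of reserved accelerators is much larger.
RESERVED_ACCELERATORS = {
    ("GDK_CONTROL_MASK", "D"): "Add Bookmark",
}

# Hypothetical libglade-style menu items.
GLADE_XML = """
<glade-interface>
  <widget class="GtkMenuItem" id="bookmark_item">
    <property name="label">_Bookmark</property>
    <accelerator key="D" modifiers="GDK_CONTROL_MASK" signal="activate"/>
  </widget>
  <widget class="GtkMenuItem" id="add_bookmark_item">
    <property name="label">_Add Bookmark</property>
    <accelerator key="D" modifiers="GDK_CONTROL_MASK" signal="activate"/>
  </widget>
</glade-interface>
"""


def check_reserved_accelerators(root):
    """Suggest a rename when a reserved accelerator is used under another label."""
    messages = []
    for widget in root.iter("widget"):
        label_node = widget.find("property[@name='label']")
        if label_node is None or not label_node.text:
            continue
        label = label_node.text.replace("_", "")  # drop the mnemonic marker
        for accel in widget.findall("accelerator"):
            combo = (accel.get("modifiers", ""), accel.get("key"))
            expected = RESERVED_ACCELERATORS.get(combo)
            if expected is not None and label != expected:
                messages.append(
                    f"{widget.get('id')}: '{label}' uses the accelerator "
                    f"reserved for '{expected}'; consider renaming the label"
                )
    return messages


messages = check_reserved_accelerators(ET.fromstring(GLADE_XML))
for message in messages:
    print(message)
```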

MySQL Workbench

MySQL Workbench provides facilities such as visual database design, generation and documentation. Figure [*] shows an annotated screen-shot of MySQL Workbench’s main window in the Glade Interface Designer. It highlights some of the violations detected by GUAT.

Figure: Some of the violations detected in MySQL Workbench’s main window

MySQL Workbench contains four Glade files, each specifying one window (excluding common interfaces shared with other tools). Over all of these Glade interfaces GUAT detected 371 errors and seven warnings, violating guidelines 531 times. The guideline that was violated the most was GNOME HIG 2.0 section 11.1.1.d with 77 violations, mostly due to menu items not following the appropriate capitalization rules. GUAT detects 55 failures based on the GNOME HIG 2.0 Window Layout section (a visual design subsection) for the window in Figure [*].
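To give a feel for the kind of language-based check behind the capitalization violations, here is a simplified sketch of a header-capitalization test for menu labels. It is an approximation only: the `SMALL_WORDS` list and the rule that the first and last words are always capitalized are assumptions made for this example, not the exact wording of section 11.1.1.d or GUAT's implementation.

```python
# Assumed list of short words that may stay lowercase inside a label.
SMALL_WORDS = {"a", "an", "the", "and", "but", "or", "nor", "for", "so", "yet",
               "at", "by", "in", "of", "on", "to", "up"}


def violates_header_capitalization(label):
    """Return True if a word that should be capitalized starts in lowercase."""
    words = label.replace("_", "").split()  # drop mnemonic underscores
    for index, word in enumerate(words):
        if not word[0].isalpha():
            continue  # skip tokens such as "..." or numbers
        is_edge = index == 0 or index == len(words) - 1
        if word.lower().strip(".") in SMALL_WORDS and not is_edge:
            continue  # short words may stay lowercase mid-label
        if not word[0].isupper():
            return True
    return False


labels = ["_Copy to Clipboard", "Delete selected", "Save _As...", "zoom In"]
flagged = [label for label in labels if violates_header_capitalization(label)]
print(flagged)
```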

The combobox controls horizontally aligned on the left of the “Tool Options:” label in Figure [*] actually reside in a tabbed notebook control (a Gtk control that manages tab pages). There are six different tab pages within this notebook, and each tab page is missing a label. The missing labels are considered an error because a guideline in the notebook controls section of GNOME HIG 2.0 states that tab pages must provide labels (section 6.16). The designer, however, intends the tabs to be selected through another mechanism and has therefore hidden the tab labels. This use of tabbed notebooks was not considered when developing GUAT; a quick fix would be to raise an error only if the tabs are visible.
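The quick fix described above could look roughly like the following sketch, which counts missing tab labels only when the notebook's tabs are visible. The markup layout (a `show_tabs` property, and tab children marked by a `type` packing property with a placeholder where no label was set) is an assumption about libglade-style files, and `missing_tab_labels` is a name invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical libglade-style markup: one notebook with hidden tabs and one
# with visible tabs, each having a tab without a label.
GLADE_XML = """
<glade-interface>
  <widget class="GtkNotebook" id="tool_options">
    <property name="show_tabs">False</property>
    <child><widget class="GtkComboBox" id="page1"/></child>
    <child>
      <placeholder/>
      <packing><property name="type">tab</property></packing>
    </child>
  </widget>
  <widget class="GtkNotebook" id="settings">
    <property name="show_tabs">True</property>
    <child><widget class="GtkVBox" id="page_a"/></child>
    <child>
      <widget class="GtkLabel" id="label_a">
        <property name="label">General</property>
      </widget>
      <packing><property name="type">tab</property></packing>
    </child>
    <child><widget class="GtkVBox" id="page_b"/></child>
    <child>
      <placeholder/>
      <packing><property name="type">tab</property></packing>
    </child>
  </widget>
</glade-interface>
"""


def missing_tab_labels(notebook):
    """Count unlabeled tabs, ignoring notebooks whose tabs are hidden."""
    show_tabs = notebook.find("property[@name='show_tabs']")
    if show_tabs is not None and (show_tabs.text or "").strip().lower() == "false":
        return 0  # tabs are never shown, so a missing label cannot be seen
    missing = 0
    for child in notebook.findall("child"):
        kind = child.find("packing/property[@name='type']")
        if kind is None or (kind.text or "").strip() != "tab":
            continue  # this child is a page, not a tab
        label = child.find("widget/property[@name='label']")
        if label is None or not (label.text or "").strip():
            missing += 1
    return missing


root = ET.fromstring(GLADE_XML)
report = {
    widget.get("id"): missing_tab_labels(widget)
    for widget in root.iter("widget")
    if widget.get("class") == "GtkNotebook"
}
print(report)
```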

The combobox controls horizontally aligned on the left of the “Tool Options:” label in Figure [*] do not contain any items. This violates a GNOME HIG 2.0 guideline stating that comboboxes should not be used for fewer than three items. However, these comboboxes are populated at run-time and thus may not actually violate the guideline. An exception could be introduced to ignore comboboxes that are empty, since it is likely that the contents of such comboboxes are not statically defined.
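A minimal sketch of the proposed exception follows: comboboxes with no static items are skipped rather than reported, while those with one or two items still raise an error. It assumes that static items are stored as a newline-separated `items` property, as in libglade-style files; the function name and result strings are hypothetical.

```python
import xml.etree.ElementTree as ET

MIN_ITEMS = 3  # comboboxes should not hold fewer items than this

# Hypothetical libglade-style markup with newline-separated static items.
GLADE_XML = """
<glade-interface>
  <widget class="GtkComboBox" id="empty_combo"/>
  <widget class="GtkComboBox" id="short_combo">
    <property name="items">Yes
No</property>
  </widget>
  <widget class="GtkComboBox" id="full_combo">
    <property name="items">Small
Medium
Large</property>
  </widget>
</glade-interface>
"""


def classify_combobox(widget):
    """Return 'skipped', 'error' or 'ok' for a single combobox widget."""
    node = widget.find("property[@name='items']")
    text = node.text if node is not None and node.text else ""
    items = [line for line in text.splitlines() if line.strip()]
    if not items:
        return "skipped"  # probably populated at run-time, so do not report
    return "error" if len(items) < MIN_ITEMS else "ok"


root = ET.fromstring(GLADE_XML)
results = {
    widget.get("id"): classify_combobox(widget)
    for widget in root.iter("widget")
    if widget.get("class") == "GtkComboBox"
}
print(results)
```

The trade-off is that a combobox left empty by mistake would no longer be reported, so a warning rather than a silent skip might be preferable.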

Another violation of the keyboard navigation guideline defined in section [*] occurred where a menu item called “Delete Selected” assigns a keyboard accelerator reserved for the common menu item called “Delete”. Once again it could be argued that this guideline is not breached, and in this particular case changing the label to just “Delete” might be less intuitive, and thus worse in terms of usability, than keeping the non-standard label.

MySQL Migration Toolkit

The MySQL Migration Toolkit provides facilities to migrate proprietary databases to MySQL. Figure [*] is a screen-shot of a window for the Migration Toolkit in the Glade Interface Designer. It highlights all five of the violations detected by GUAT.

Figure: All violations detected in MySQL Migration Object Shell window

The Migration Toolkit consists of only a single window (as shown in Figure [*]). Most of the Migration Toolkit content is defined in the common interfaces shared by multiple tools. The next section covers such common interfaces.

Results for interfaces shared by multiple tools

The MySQL suite of graphical database tools shares many interfaces: there are nine Glade files/windows that are reused across the suite. Figure [*] shows an annotated screen-shot of a connect dialog shared amongst most of the tools.

Figure: Some of the violations detected in the MySQL connect dialog (a common interface)

Over all of the shared Glade interfaces GUAT detected 608 errors and eight warnings, violating guidelines 809 times. The guideline that was violated the most was GNOME HIG 2.0 section 8.2.3.a with 110 violations.

Once again there is a combo box that is populated dynamically and is therefore reported as an error, as shown in Figure [*].

Future Work

Glade uses the Accessibility Toolkit (ATK), which could open up more areas for automating accessibility-based guidelines. Glade interfaces also specify signals that point to methods in code. Use of signals could be the road to dynamic checking with Glade. For example, an external tool could invoke the signals per widget and wait for a response. However, some signals might lead to more complex responses that require human intervention, for example an input dialog that leads to a range of many different responses.
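A first step towards such dynamic checking would be to enumerate the signals declared in a Glade file, since these are the entry points an external tool would have to invoke. The sketch below only lists them; it does not invoke anything. The markup and handler names are hypothetical, and `list_signal_handlers` is a name invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical libglade-style markup with signal declarations.
GLADE_XML = """
<glade-interface>
  <widget class="GtkWindow" id="connect_dialog">
    <signal name="delete_event" handler="on_connect_dialog_delete"/>
    <child>
      <widget class="GtkButton" id="ok_button">
        <signal name="clicked" handler="on_ok_clicked"/>
      </widget>
    </child>
  </widget>
</glade-interface>
"""


def list_signal_handlers(root):
    """Return (widget_id, signal_name, handler_name) for every declared signal."""
    return [
        (widget.get("id"), signal.get("name"), signal.get("handler"))
        for widget in root.iter("widget")
        for signal in widget.findall("signal")
    ]


handlers = list_signal_handlers(ET.fromstring(GLADE_XML))
for widget_id, signal_name, handler_name in handlers:
    print(f"{widget_id}: {signal_name} -> {handler_name}")
```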

Support for multiple sets of guidelines is another area of future work. The more guidelines are supported, the more problems arise from conflicting guidelines, which yield unsatisfiable errors [3]. Even duplicated errors might become overwhelming when identifying usability or accessibility errors in interfaces.

Bibliography

[1] A. Beirekdar, M. Keita, M. Noirhomme, F. Randolet, J. Vanderdonckt, and C. Mariage. Flexible reporting for automated usability and accessibility evaluation of web sites. In Human-Computer Interaction – INTERACT 2005, Proceedings, volume 3585, pages 281–294, 2005.
[2] S. Henninger, K. Haynes, and M. W. Reith. A framework for developing experience-based usability guidelines. In DIS ’95: Proceedings of the 1st Conference on Designing Interactive Systems, pages 43–53, New York, NY, USA, 1995. ACM.
[3] J. Vanderdonckt. Development milestones towards a tool for working with guidelines. Interacting with Computers, 12:81–118, November 1999.
[4] P. Reed, K. Holdaway, S. Isensee, E. Buie, J. Fox, J. Williams, and A. Lund. User interface guidelines and standards: progress, issues, and prospects. Interacting with Computers, 12(2):119–142, November 1999.

Appendices

Summary of unimplemented guidelines for Gnome HIG 2.0

Sections | Description | Reasoning
2 | Covers how an application should integrate with the Desktop once installed. | These guidelines are all out of scope since such information is not included in Glade markup.
3 | Covers all aspects of windows: how they should look, what buttons there should be and what types of titles should be used, all depending on the type of window and/or type of application. | Most of this section cannot be automated because the information is out of scope: window types are not specified (although there are window hints, they are still not specific enough). Many guidelines are also impossible to breach because Glade limits the way windows can be customized. Naming of windows requires human judgment to determine whether or not a title fits the application according to the type of application. This section also covers modality, stating that applications should not use system modal windows; however, such guidelines are impossible to violate because Glade only supports application modal windows.
6.3 | Covers control sensitivity in terms of enabled/locked states for controls. | These guidelines are classified as requiring dynamic assessment.
7 | This section is about feedback. Its areas include characteristics of responsive applications, acceptable response times, responding to user requests, types of visual feedback, choosing appropriate feedback and allowing interruptions. | The majority of the feedback guidelines could not be statically automated because they are dynamic. The entire section could not be codified because every guideline was classified as requiring human judgement, dynamic assessment, or a combination of the two.
8.1 | Covers some visual accessibility topics. Defines a recommended color palette to be used for all elements of an interface. Mentions that color should not be the only means to convey information. | These guidelines essentially require visual human judgement. Although tools like Vischeck can simulate color-related visual impairments, such tools still require human intervention to determine whether or not an interface passes such guidelines. All widget colors except for Pango text and images are defined by user themes, thus many guidelines cannot be violated for the majority of Glade interfaces.
9 | Covers many aspects of icons, including styles of icons (perspectives and lighting), kinds of icons, designing effective icons and making icons accessible. | This entire section can be classified as requiring visual human judgement. The accessibility guidelines could partially be automated since they state that high and low contrast icons should be provided; however, a high/low contrast design could still be no better off than a color design. This section is very opinionated; for example, accessibility icons should be a metaphor for the original icons.
10.1 | Covers aspects of mouse interaction such as drag and drop behavior or widget/data selection. | Some of this section was codable (for example checking for small mouse targets), but the majority of guidelines were classified as dynamic because most of the content relates to the behaviour of interface elements with the mouse.
11.1 | Covers language aspects of labelling controls, such as keeping descriptions concise. | Many of these guidelines are classified as textual human judgment. Some guidelines require human judgement to determine whether a text label is related to a control or not.
11.2 | Covers error and warning messages: essentially defining what is considered a good warning or error message. | These guidelines are classified as textual human judgment. They are also classified as dynamic because error messages and warnings are usually created at runtime.

Summary of standards not implemented for Section 508

Sections | Description | Reasoning
1194.21 | Technical standards for software applications and operating systems. | Most of these require a combination of human visual judgement and the ability to dynamically assess the interfaces. Note that standard A in this section, which states “When software is designed to run on a system that has a keyboard, product functions shall be executable from a keyboard where the function itself or the result of performing a function can be discerned textually.”, is almost impossible to breach since all Gnome elements are accessible with a keyboard by default. However it is conceivable that functions could be implemented without a graphical interface which would be out of scope and require dynamic assessment anyway.
1194.31 | All of subpart C: Functional performance criteria. Covers how modes of operation and information retrieval should be accessible in different ways to accommodate for users who are visually or orally impaired. | These are a mix of dynamic and visual human judgement classed guidelines.
