Better Practice Checklist - 3. Testing Websites with Users
May 2004 (organisational details updated January 2008)
Introduction
Australian Government departments and agencies use a range of new technologies for information provision, service delivery and administration. A common theme amongst examples of excellence in online service delivery is that they have been developed with the users' needs and preferences in mind, and are evaluated and improved upon over time. User testing can be an important part of this development and evaluation process, and can help ensure that online services are as effective as possible.
A key role of the Australian Government Information Management Office (AGIMO), Department of Finance and Deregulation is to identify and promote 'Better Practice'. This checklist has been created to help agencies to consider how they can test their web resources with users.
The checklist suggests that a number of issues should be considered when testing resources with users. The items in the checklist are, however, not mandatory. They have been provided to help agencies to consider users' experience of their web resources when developing and maintaining websites and web-based services.
This checklist is intended to be a guide to staff responsible for websites, as well as for online content creation and presentation. Other managers also may find the checklist useful in dealing with contractors, or where these functions are otherwise outsourced. This checklist focuses on non-technical issues.
It should be noted that the checklist is not intended to be comprehensive. Rather, it highlights key issues for agencies. The checklist is iterative and draws on the expertise and experience of practitioners. The subject matter and issues are reviewed and updated to reflect developments.
Acknowledgments
Originally published by the National Office for the Information Economy (NOIE) in 2001 (Version 1) and updated in 2002 (Version 2), this checklist was revised in 2004 with assistance from Australian Government agencies. In particular, we would like to thank the Department of Employment and Workplace Relations.
Overview of user testing
Web resources can be evaluated using a variety of methods, including the collection of website statistics, the analysis of feedback provided via phone or email, and structured interviews with users. Testing websites with individuals or groups of users can be very effective. Testing can provide useful information that can be used to make highly effective modifications to web resources. User testing can generally be done with minimal resources and produce useful results. Other methods of evaluation can, however, supplement the results of user testing and result in even more accurate assessments of web resource effectiveness. Alternative methods may also include expert evaluation, heuristic review, usability walk-through, surveys and monitoring software.
User testing involves members from the target audience working through a set of tasks using the web resource being evaluated. Usability-testing sessions may be conducted in a usability laboratory, at the person's desk or in a room set up to resemble a usability laboratory. There is normally one person per test session, with the session lasting from 30 minutes to an hour, although longer sessions may be conducted. However, due to OH&S requirements there must be a 15-minute break every 45 minutes. The purpose of the usability test is to evaluate, from the user's perspective, the ease of use and intuitiveness of the resource. As users test the resource, they provide feedback on it, advising about what they like and dislike and about any difficulties they may have while using the site. This information can be used to revise the site in development or redesign.
The quantifiable savings that can be achieved through early and continued usability testing include:
- increase in productivity
- decrease in user training requirements
- decrease in calls to the Help Desk and need for technical support
- decrease in user error rate
- decrease in programming costs associated with late design
- decrease in maintenance costs.
Non-quantifiable savings that can be achieved through early and continued usability testing include:
- reduction in staff frustration, leading to improved staff acceptance of the resource
- benefit to staff in the provision of a resource that is easy to learn and to use.
A number of companies offer usability-testing services. Agencies may also wish to skill up their staff by participating in the training opportunities that usability-testing firms offer. This has the benefit of directly improving agency expertise in this area, as well as making agency staff more informed consumers of such services.
Testing is an iterative process: the more often sites are tested, the better they will perform.
Summary of Checkpoints
Before starting
Consider the different methods of user testing available
Define the goals of the user testing and identify what will be tested
Define what standards are acceptable and how results will be analysed
Develop a test 'scenario' or 'script'
Develop feedback questions
Determine how results will be collected
Consider testing the website throughout the process of creating the website
Setting up the test
Ensure that testing facilitators are prepared for their role
Ensure that any support team is prepared for its role
Ensure that testers are prepared for their role
Select testers carefully
Consider how many users will be testing the resources and the scale of the testing
Arrange for relevant people to observe the test
Organise a suitable venue and equipment
Conduct a 'dry run'
After the test
Debrief users and observers
Analyse and prioritise issues
Disseminate results and follow up
Checkpoints
Before starting
Consider the different methods of user testing available
There are a number of methods of user testing, including:
- Expert evaluation. The system or a website is evaluated by a person skilled in usability and user interface design. The expert role-plays less expert users in order to identify potential usability problems (based on the expert's knowledge of design principles, standards and ergonomics).
- Heuristic review. Usability is evaluated by inspecting the user interface and assessing it against a set of usability design principles.
- Usability walk-through. This is a method for gathering early feedback about the usability of a design. A facilitator leads a group of participants step by step through a scenario using screen mock-ups (either paper or electronic). At each step, the facilitator elicits comments from the participants about any usability issues they see. This can be done in groups and is a very efficient way of collecting information about potential usability issues.
Two other methods also are useful, and these are covered in more detail in Better Practice Checklist 11, Website Usage Monitoring and Evaluation.
- Surveys. These can be useful in providing feedback on how users perceive a site. Their disadvantage is that they can introduce bias, as only users with particular characteristics may complete them.
- Software. Software is available to monitor the usage of a website (that is, to record which pages users access and how long they spend on particular pages). There is also software that identifies accessibility problems.
These methods can provide some information on the usability of a site, but not nearly as much as testing with users.
Define the goals of the user testing and identify what will be tested
For example, determine whether users' expectations and perceptions of the overall online service are being tested, or whether information is being collected about specific features that either work well or do not work well.
Consider what resources will be tested, and what particular aspects of the resources will be tested. Agencies may consider the following:
- Users must feel confident that they have found the correct information; or if they can't find it, they must feel confident that it doesn't exist.
- Users must always know where they are located in the site and must be able to clearly identify navigation elements such as links. Further information is available in Better Practice Checklist 2, Website Navigation.
- Users must understand the language used on the site and must be able to relate it to the language they understand.
- Users should find the site visually attractive.
Define what standards are acceptable and how results will be analysed
Consider what levels of functionality are desirable and acceptable. For example, should users be able to identify required resources within 5 seconds, 10 seconds or 30 seconds? If half of the users indicate that the web resources are very good or better, and half indicate that the resources are good or need improvement, your agency will need to determine whether this is acceptable or is a strong indication of the need for improvement.
Specific features that agencies may test include design considerations, loading time for images and accessibility issues.
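Where an agency chooses to record its acceptance standards in advance, they could be written down in a form that can be checked mechanically once results are in. The following is a minimal sketch only: the threshold values, the names MAX_MEDIAN_SECONDS and MIN_COMPLETION_RATE, and the function meets_standard are illustrative assumptions, not values recommended by this checklist.

```python
from statistics import median

# Hypothetical acceptance standards; the numbers are illustrative only
# and should be replaced by whatever the agency decides is acceptable.
MAX_MEDIAN_SECONDS = 30      # median time to locate the required resource
MIN_COMPLETION_RATE = 0.8    # share of testers who finish the task unaided


def meets_standard(times_in_seconds, completed_flags):
    """Return True if a task meets both illustrative acceptance standards."""
    completion_rate = sum(completed_flags) / len(completed_flags)
    return (median(times_in_seconds) <= MAX_MEDIAN_SECONDS
            and completion_rate >= MIN_COMPLETION_RATE)


# Example: five testers, one of whom did not complete the task.
times = [12, 25, 18, 41, 22]
completed = [True, True, True, False, True]
result = meets_standard(times, completed)
```

Agreeing on the thresholds before testing begins means the results are judged against a standard set in advance rather than one chosen after the fact.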
Develop a test 'scenario' or 'script'
Scenarios and scripts contain the tasks the tester is asked to accomplish in the test, preferably without help. A typical scenario might be 'Register for X' on the website, or 'Find out whether you are eligible for X, and locate the forms that you will need to complete'.
Make sure that the tasks are real ones. If it is appropriate, let users have a hand in choosing the tasks - for example, 'Find a book you want to buy or you bought recently', rather than 'Find a cookbook for under $20'. Users will then have more 'emotional investment' in the task and can use more of their personal knowledge. When using the 'emotional investment' method, keep in mind that the item they are looking for may not be there. In this case the tester's expectations need to be managed, as the tester will be focused on the result, not the process.
Develop feedback questions
It is useful to have feedback questions to ask testers at the end of each task. Asking a tester questions at the end of each task encourages discussion about the resource (as to what the tester did, why they did it that way and what they would change) while it is still fresh in the tester's memory. General feedback questions, about the system as a whole, can be asked at the end of the test session. General questions could include, for example, 'Do you like the colour scheme of the resource?'
Determine how results will be collected
Results can be collected through observation, measuring how well the tester can complete given tasks, discussions with testers and testers' written assessments. A questionnaire may also be used to collect information from testers. Whatever methods are used, consider how performance indicators will be recorded so that recording is accurate and objective. To support data collection, logging sheets or questionnaires may need to be developed.
Advanced testing facilities may have videotaping and other facilities that can help to record testers' reactions. It is not, however, necessary to use an expensive laboratory to observe users. A meeting or conference room with a PC and video equipment may be sufficient.
Permission from testers should, however, always be sought prior to recording any commentary through video or sound.
Consider testing the website throughout the process of creating the website
The key to success is 'Test early, test often'.
Website testing at various stages can produce valuable insights. Testing while the design is still on paper has a number of advantages, since issues can be identified while they can still be corrected cheaply (perhaps by just redrawing the screen roughly on paper). Users may feel freer to comment on something that looks unfinished. Also, as the design is still unpolished, users will not be distracted by details of implementation and can focus on the high-level issues of navigation and language. Agencies will, however, need to ensure that the site is sufficiently complete for any test to be meaningful.
Testing at the early stages of website development can help determine the most appropriate look and feel for the website. Testing close to website launch can help identify any issues that could cause problems for users. Testing at a later stage can identify areas that work particularly well, and areas that need improvement, and feed into evaluations of the entire project.
In general, testing early in developmental stages is most efficient, as changes can be made for relatively low cost. Testing conducted just prior to release or after release, when the resource is fully functional, is beneficial in that problems identified can form part of an enhancement release.
Setting up the test
Ensure that testing facilitators are prepared for their role
Facilitators should understand the objective of the testing and the importance of their role in achieving a useful outcome. Facilitators might benefit from training provided by external providers of user testing.
Facilitators should be familiar with the website and its goals. Although they need not be experts in the website (it is probably better if they are not), they should be confident with it.
Facilitators should be patient, calm, empathetic and impartial, and be good listeners. Ideally, they should not be part of the development team.
Facilitators' tasks will include:
- Briefing testers about the objectives of the test and their role in the testing process.
- Providing testers with a standard waiver form for signature, and answering any questions they might have about the form.
- Ensuring that the facilitator's own actions do not influence the outcomes of the test. For example, facilitators will need to ensure that they do not answer any questions from testers about the specifics of the resources being tested, if this will mask problems with resources that need to be drawn out.
- Drawing out additional information from testers, if appropriate.
- Observing and recording outcomes, and managing other observers. Other observers of the test may be tempted to assist or direct testers if they encounter difficulties, which again may interfere with results.
- Making testers feel comfortable, keeping time (sessions should be kept within 60 minutes with a 15-minute break every 45 minutes) and answering any questions testers may have about the process.
Ensure that any support team is prepared for its role
The support team could include:
- an observers' facilitator, who facilitates the briefing and debriefing in the observation room
- a camera operator
- an event logger, who logs events (such as mouse clicks, link selections or errors) on a form or on special purpose software.
Often a single person performs more than one of these roles.
Ensure that testers are prepared for their role
The facilitator should give testers an overview of how the process will work. In particular, this may involve:
- Asking testers to be honest in their responses and not to change their responses in order to please.
- Stressing that the purpose of the test is to uncover usability issues, not to test the skills of the testers, and reinforcing this throughout the testing.
- Requesting that testers verbalise what they are doing and why they are doing it, to make it easier for any problems to be identified.
- Encouraging testers to use the resources as they would at home or at work and not to work differently because of the test situation. For example, in a testing situation, testers might spend more time reading instructions than they normally would, thus changing the results.
Select testers carefully
Testers should represent the intended users of the site. Agencies will need to select testers according to specific profiles to ensure a realistic match between testers and real users. Testers should be impartial - that is, not involved in the development of the website - and need not have particular expertise in using the web resources. Users who come in 'cold', with no knowledge of the resource, are generally much more likely to reveal useful information than users who are familiar with the resource.
It may be appropriate to provide testers from outside the agency, or outside the Australian Public Service, with a small payment as an incentive to participate.
Consider how many users will be testing the resources and the scale of the testing
User testing need not be resource-intensive. Good results may be achieved by testing with relatively few users, and testing with additional users may not add significantly to the results. User-testing studies show that the first five testers generally identify the majority of problems with web resources. Agencies may consider running one or two user-testing sessions with five to ten users in total, and only extending this if the results appear to indicate that further testing is necessary.
If different types of users are likely to access resources, additional testing with representatives of these groups may be necessary to identify any issues particularly relevant to these users.
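The observation that the first few testers find most problems is often explained with a simple discovery model from published usability research (Nielsen and Landauer), in which each tester independently finds a fixed share of the problems. The sketch below assumes that model and a detection rate of 31 per cent per tester. Neither the model nor the rate is stated in this checklist, and real rates vary between sites and tasks, so the output is indicative only.

```python
def share_of_problems_found(testers, detection_rate=0.31):
    """Estimate the share of usability problems found by a number of testers.

    Assumes each tester independently finds `detection_rate` of all
    problems (an illustrative figure, not one taken from this checklist).
    """
    return 1 - (1 - detection_rate) ** testers


# Print the estimated share for a few group sizes.
for n in (1, 3, 5, 10):
    print(n, round(share_of_problems_found(n), 2))
```

Under these assumptions five testers find roughly 84 per cent of problems, and doubling the group to ten adds comparatively little. That is consistent with the advice to start with five to ten users and extend only if needed.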
Arrange for relevant people to observe the test
If possible, arrange for the developers of the system, business unit owners and others to observe the test. This is likely to enhance their understanding of the user's experience and the need for any changes to the website.
Observers should be as unobtrusive as possible so as not to inhibit tester behaviour. Observers should be silent: no comments, no offers of help, no signs of exasperation, etc. (unless separated from the tester through the use of a usability laboratory). Observers are an important resource for providing feedback on what they have observed. This may be achieved by providing each observer with an observation checklist, to be completed by the observer as the tester works through the scenario.
Organise a suitable venue and equipment
Venues suitable for testing will vary according to the test, but would be expected to provide a workstation for each tester, facilities for observers, freedom from unnecessary distractions, etc. Testing may also be done on laptop computers and in home environments in order to simulate as realistic a scenario as possible.
Agencies may consider using in-house IT-training facilities, meeting rooms or other facilities. The Department of Employment and Workplace Relations (DEWR) has a purpose-built Usability Centre, which is available for bookings.
The Australian Bureau of Statistics (ABS) also has a facility that was created for conducting usability reviews. Contact the ABS to discuss availability.
Conduct a 'dry run'
Dry runs of the test are essential to make sure that the tasks or scenarios are clear and unambiguous, and that the information the tester is looking for is actually in the database. It is essential that the system configuration mirrors that of the target audience and, where the target audience is expected to use various system configurations, that those configurations are also tested. For example, 1024 x 768 and 800 x 600 are the two most popular screen resolutions, so both should be tested.
The scenario needs to be tested to see how long it takes to complete, as testers will want to finish the scenario and will feel frustrated if they are cut short. Dry runs will also help ensure that the facilitator is clear about how the site works (especially if the facilitator has been brought in from outside) and that the supporting documentation (logging sheets, questionnaires, etc.) is correct.
After the test
Debrief users and observers
Testers may appreciate a short debrief, which could cover any major observations made during the test, a review of how the results are likely to be used and associated timeframes. It is important that testers go away feeling positive about the experience. Discuss with testers how they felt the test went, and resolve any issues that arose (for example, if they couldn't complete a task and would like to find out how to do it). Testers could also be asked to fill in a questionnaire to collect subjective information such as how they feel about the site, which aspects appeal to them, what they think should be changed and whether they would use the site again.
Observers might find a separate, more detailed debrief particularly useful. This debrief could outline the next steps for reporting and acting on the results, and include discussion about things that worked particularly well with the testing process and things that might be handled differently in any other tests. If observers are members of the development team, they might be asked not to make changes at this stage but to wait until the end of the testing to determine which issues are the most important ones to address.
Analyse and prioritise issues
Testing will usually identify a long list of issues, which may be more than the development team can feasibly fix. It is important to prioritise the issues so that the team knows which are the critical ones to address first. It may be helpful to prioritise each issue on a scale such as the following:
- Critical. Usability problem preventing completion of a task. Recommend not releasing the site until the issue is corrected.
- High. Usability problem creating significant delay or frustration, or occurring very frequently. Recommend correcting before releasing the site.
- Medium. Usability problem causing minor inconvenience, or easily overcome once users are aware of it. Recommend fixing if time permits, or fixing in next release.
- Low. Low-impact or cosmetic problem only. Recommend fixing if there is a lot of time available, or fixing when resources allow.
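The scale above could be applied to an issue list as follows. This is a minimal sketch: the Issue record, the testers_affected field used as a tie-breaker and the sample issues are all hypothetical and are included only to show the ordering.

```python
from dataclasses import dataclass

# Rank of each level on the scale above (lower number = fix first).
SEVERITY_ORDER = {"Critical": 0, "High": 1, "Medium": 2, "Low": 3}


@dataclass
class Issue:
    description: str
    severity: str          # one of the keys in SEVERITY_ORDER
    testers_affected: int  # hypothetical tie-breaker within a level


def prioritise(issues):
    """Sort by severity, then by how many testers hit the issue."""
    return sorted(
        issues,
        key=lambda i: (SEVERITY_ORDER[i.severity], -i.testers_affected),
    )


# Invented sample issues for illustration.
issues = [
    Issue("Link colour inconsistent on help page", "Low", 1),
    Issue("Registration form cannot be submitted", "Critical", 4),
    Issue("Search results slow to appear", "High", 3),
    Issue("Eligibility page uses unfamiliar jargon", "High", 5),
]
ordered = prioritise(issues)

# Issues the scale recommends correcting before the site is released.
fix_before_release = [i for i in ordered if i.severity in ("Critical", "High")]
```

Breaking ties by the number of testers affected is one possible choice. An agency might equally order issues within a level by the cost of the fix.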
Disseminate results and follow up
The way in which results are presented will vary according to the test and the agency's purpose for conducting the test. While results can be presented in detailed or summary formats, the following may be included:
- purpose
- what was tested
- brief methodology
- results
- implications
- recommendations.
Contact for information on this page: AGIMO Better Practice Team

