Better Practice Checklist - 16. Implementing an Effective Website Search Facility

May 2004 (organisational details updated January 2008)

Introduction

Australian Government departments and agencies place increasing amounts of information on their websites, extranets and intranets. As the volume of material increases, so does the need for effective search facilities to help users locate specific information on these sites. A key role of the Australian Government Information Management Office (AGIMO), Department of Finance and Deregulation, is to identify and promote 'Better Practice'. This checklist has been created to help agencies maximise their use of new technologies by ensuring that their sites have effective search facilities.

This checklist suggests that a number of issues should be considered when designing and implementing search facilities. The items in the checklist are, however, not mandatory. The checklist has been provided as a guide to help agencies to consider a range of issues that may improve search facilities.

This checklist has been created for staff responsible for websites, including those in website or intranet teams. The information within this checklist may also be relevant to senior program managers, IT managers and others. This checklist focuses on non-technical issues.

It should be noted that the checklist is not intended to be comprehensive. Rather, it highlights key issues for agencies. The checklist is iterative and draws on the expertise and experience of practitioners. The subject matter and issues are reviewed and updated to reflect developments.

Download PDF of Checklist 16 - Implementing an Effective Website Search Facility [PDF Document - 324 KB]

Acknowledgments

This checklist was developed with the assistance of Australian Government agencies. In particular, we would like to thank the Attorney-General's Department, the Bureau of Meteorology, the Health Insurance Commission, CSIRO and the Content Management Community of Practice.

Implementing an effective search facility

Providing an effective search facility on websites, extranets or intranets involves more than just installing a search engine package 'out of the box'. Effective search facilities match the site they support and the site's users. They are tested with real content and users and refined as appropriate. A search facility should be able to give priority to important pages such as key pages and policy documents.

This checklist focuses on a range of issues to consider when selecting, designing or configuring a search engine for a website, extranet or intranet. While a range of suggestions are made regarding better practice approaches, the most effective way of ensuring that the search facility is useful is to conduct usability evaluations with real users and content. Further information about user testing is available in Better Practice Checklist 3, Testing Websites with Users.

This checklist does not cover the techniques involved in improving the ranking of sites in public search engines (such as Google and Yahoo). These approaches are often called 'search engine optimisation' (SEO), and a range of useful resources relating to this can be found on www.searchtools.com [External Site].

Enterprise search engines make use of the same links, anchor text and URL structure that the leading public search engines use to rank important resources above those which are merely relevant. Improving site structure and interlinking, using simple URLs and using descriptive anchor text can therefore improve both the results returned by Google and Yahoo! and the site's own search results. (Anchor text refers to the words which people click on in a browser in order to follow a link.)

Note that in this checklist the term 'search facility' is used to describe the provision of search capabilities on sites. 'Search engine' is used to describe the software product that supports the facility.

Summary of Checkpoints

Before starting

Check box Consider whether a search facility is necessary

Check box Consider whether a search engine needs to be procured

Check box Consider users' experiences with public search engines

Check box Consider the different needs of website and intranet search facilities

Check box Undertake housekeeping of site content, site structure, URLs, links and metadata


Identify requirements

Check box Identify and document the business requirements for the search facility

Check box Identify and document the technical requirements for the search facility

Check box Consider any requirements for the search engine to integrate with other information systems

Check box Prioritise and weight requirements for software selection, focusing on maximising the ability of searchers to find what they need.

Evaluation and selection

Check box Explore available options

Check box Visit reference sites

Check box Focus on usability, simplicity and effectiveness

Check box Assess the vendor's implementation methodology

Check box Use scenarios during demonstrations

Check box Consider the total cost of ownership

Check box Consider product-pricing models

Check box Consider having the vendor set up a proof-of-concept search facility

Check box Assess other features


Designing the search interface

Check box Focus on meeting general user needs

Check box Keep the search form simple

Check box Make the search facility available throughout the site

Check box Provide scoped searches of important subsites where appropriate but make it clear what the scope is.

Check box Ensure that the search facility is prominent on the home page

Check box Consider whether advanced search options should be provided

Check box Test search interfaces with users


Designing the results pages

Check box Minimise the details provided for each result

Check box Ensure that descriptions are meaningful


Configuring the search engine

Check box Consider 'and-ing' search terms by default

Check box Use URL structure, links and descriptive anchor text to 'bring forward' important pages

Check box Consider search engine 'best bets'

Check box Implement search engine synonyms

Check box Consider spell-checking

Check box Consider stemming and 'fuzzy matching'

Check box Consider 'federated' searching

Check box Consider 'faceted' searching

Check box Consider the use of taxonomies



Checkpoints

Before starting

Check box Consider whether a search facility is necessary

Not all websites need a search facility, and agencies may wish to assess whether a particular site needs one. However, if a website contains over fifty pages, agencies should consider providing a search facility. To assist with this decision, agencies may wish to consider:

Check box Consider whether a search engine needs to be procured or whether an existing search facility can be used

If the initial analysis indicates that the site does need a search facility, it is worth considering whether this need could be met by leveraging an existing search capability. For example, some Internet search companies offer the ability for external sites to 'tap into' their content indexes, enabling a site-specific search facility to be provided without procuring and implementing a separate search engine. In addition, some companies provide an Application Service Provider (ASP) service for search facilities. For a monthly fee they will remotely index specified sites and provide site search interfaces.

These can be cost-effective solutions that can be implemented quickly. However, a condition of use may be that the search interface and results pages are branded with the search company's logo or brand, and agencies may have little or no control over the accuracy, relevancy, completeness or currency of what is indexed.

Check box Consider that users' experiences with public search engines may influence their expectations

The popularity of public search facilities such as Google is increasingly influencing users' expectations, not only in ease of use and design but also in how accurately and quickly a search is conducted. Many users now expect that they can simply type in a word or two and will quickly find the desired information. Trying to emulate this user experience for the site may require considerable investment in appropriate search technologies, a site that is well structured and uses good anchor text and simple URLs, and ongoing expertise to monitor the system and ensure that it is optimally configured.

Due to their experience with search facilities such as Google, users may also have preferences in terms of the layout and design of the search facility. Layout and design similar to those provided by public search facilities may be preferable to a number of users.

Check box Consider the different needs of website and intranet search facilities

Differences between websites and intranets will impact upon the business and technical requirements for a suitable search solution. Differences in users, technologies, content, and security and privacy requirements may need to be considered.

While it may be appropriate for organisations to implement the same search engine for both intranets and websites, the different needs of the users of these sites may lead to the search engine being implemented in different ways and to different search facilities being provided. For example, the search interface, capabilities, and the way in which results are displayed may differ between the website and intranet search facilities, although both are powered by the same search software.

Check box Undertake housekeeping of site content, structure, URLs, links and metadata

The effectiveness of a search facility is only as good as the quality of the information it searches. The deployment of a powerful search engine may highlight inconsistencies or problems with the underlying site content. In preparation, agencies may wish to review content on the site (particularly structure, simplicity of URLs, links and descriptive anchor text) and their content management processes.

In addition, agencies may find it useful to review the AGLS metadata applied to key resources and their overall metadata strategies. Further information about metadata is available in Better Practice Checklist 6, Use of Metadata for Web Resources.

Identify requirements

Check box Identify and document the business requirements for the search facility

In consultation with stakeholders, identify and document the business requirements for a search facility. Each agency may have a unique set of requirements and may wish to consider:

Check box Identify and document the technical requirements for the search facility

The IT environment of the agency, including the software (such as operating system and applications), databases and hardware platform, needs to be identified. Specifying exact requirements may allow the compatibility of the search facility with the current agency environment to be meaningfully assessed.

Performance and scalability are technical requirements that may need to be specified. For example, growth in the number of documents indexed may impact upon storage capacity, bandwidth and product licence costs.

Check box Consider any requirements for the search engine to integrate with other information systems in the short, medium and long terms

There is an increasing requirement for search engines to be more tightly integrated with the information repositories that they search, such as records, document or content management systems. Agencies may need to consider these requirements, particularly when products such as content management systems 'dynamically' publish information directly out of databases.

Dynamic publication from these systems can lead to lower search effectiveness, if not implemented with due care, by creating multiple near-copies of the same information and by discouraging the use of links and descriptive anchor text. Other websites can be reluctant to link to obviously dynamic URLs because of the risk that they will change or disappear. Search quality can be improved significantly by ensuring that all dynamically published webpages worth linking to have simple, intuitive URLs. For example, use xxx.gov.au/checklists/ rather than xxx.gov.au/xxx?ObjectID=3E48F86-AA1A-11A1-B6300060B0AA00014.

Many users search for government information using public Internet search engines such as Google and Yahoo, and these rely on good URLs, links and anchor text.

An agency may wish to verify that an integrated search engine is capable of ranking relevant site entry pages above less important individual pages which contain the query words in content or metadata.

Check box Prioritise and weight requirements for software selection, focusing on maximising the ability of searchers to find what they need.

To assist product evaluation and differentiation, agencies may consider the priority and weightings they will assign to the attributes of the products they require. For example, attributes could be prioritised as Mandatory, Highly Desirable or Desirable and assigned a numerical weighting to indicate their relative importance.

The combination of priority and weight can then be used to generate an overall score for the product. It should not be forgotten, however, that the ultimate aim of any search facility is to guide searchers (at least those within the target group) as efficiently and effectively as possible to the information and services they need.
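The priority-and-weight approach described above can be sketched in a few lines of Python. The requirement names, weights and ratings below are purely illustrative, and the rule that a product failing any Mandatory requirement is excluded outright is an assumption rather than something this checklist prescribes.

```python
# Hypothetical requirements: (name, priority, weight). All values are illustrative.
REQUIREMENTS = [
    ("Indexes PDF documents", "Mandatory", 5),
    ("Supports synonym lists", "Highly Desirable", 3),
    ("Provides query log reports", "Desirable", 1),
]


def score_product(ratings):
    """Return a weighted score, or None if a Mandatory requirement is unmet.

    `ratings` maps a requirement name to a rating from 0 (not met) to 5.
    """
    total = 0
    for name, priority, weight in REQUIREMENTS:
        rating = ratings.get(name, 0)
        if priority == "Mandatory" and rating == 0:
            return None  # assumption: failing a Mandatory item excludes the product
        total += weight * rating
    return total


product_a = {
    "Indexes PDF documents": 4,
    "Supports synonym lists": 5,
    "Provides query log reports": 2,
}
product_b = {
    "Indexes PDF documents": 0,
    "Supports synonym lists": 5,
    "Provides query log reports": 5,
}

print(score_product(product_a))  # 5*4 + 3*5 + 1*2 = 37
print(score_product(product_b))  # None (Mandatory requirement not met)
```

A score of this kind only helps to compare products; it does not replace testing with real users and content.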

Evaluation and selection

The evaluation and selection process for a search engine would be expected to be similar to the process for the purchase of other strategic business systems. This may include conducting an RFI, RFQ or tender process carried out in accordance with the Commonwealth Procurement Guidelines.

Most agencies will have procurement units to provide advice and assistance with this process.

Check box Explore available options

A large number of search tools are available, ranging from freeware to solutions costing hundreds of thousands of dollars. Agencies may wish to evaluate how well a range of products meet their specific business requirements.

Sources of information include:

Check box Visit reference sites

Consider asking vendors to provide contact details for at least three similar organisations where their search solution has been installed. These sites may then be visited (without the vendor) to determine how well the product works in practice.

Check box Focus on usability, simplicity, and effectiveness

While a number of products may match the identified functional requirements, it can be useful to consider the overall usability and simplicity of the system. Is it an effective tool for finding what users need? Complex interfaces, difficult administration and configuration options may impact upon training needs and ongoing support arrangements. Poor search effectiveness may result in an increased volume of telephone or in-person requests to be handled individually by staff. It may also result in reduced benefit from an agency's initiatives and services because users are unable to find them.

Check box Assess the vendor's implementation methodology

Agencies will typically look to the vendor (or implementation partner) for assistance in the initial installation and configuration of a search facility. Agencies may wish to assess the implementation methodology proposed by the vendor, paying particular attention to the technical aspects (such as installation, configuration management, scripting requirements, administration requirements, customisation options) and skills transfer.

Check box Use scenarios during demonstrations

Scenarios are an effective way of ensuring that vendor demonstrations clearly show how the search facility will meet the specific business needs. They also allow demonstrations to be compared and rated. Further information about the use of scenarios is available in Better Practice Checklist 14, Designing and Managing an Intranet.

Check box Consider the total cost of ownership

As well as initial implementation costs, agencies may wish to look at the total cost of ownership when evaluating search facilities. This can include:

It is also appropriate to consider the return on investment in an effective search facility. Sometimes returns are easy to quantify, such as reduced reliance on call-centres when clients can find answers to questions themselves. In other cases, the benefit may lie in increased staff productivity, higher quality of advice (because all available information has been found) or more effective delivery of services and information to users.

Check box Consider product-pricing models

Different products are likely to have different pricing models. These models may be:

Consider specifying the preferred pricing model to enable vendors to offer the most attractive price to meet the agency's requirements.

Check box Consider having the vendor set up a proof-of-concept search facility

A proof-of-concept search facility, based on the specific content holdings of the site, can help to identify issues relating to establishing and administering the product and evaluating search performance.

Note that search engines are highly configurable, and this may make it difficult to compare like with like in any demonstration unless what is configured during demonstrations or proof-of-concept trials is carefully specified and managed.

Consider how to measure the success of a search. Identify what users are likely to search for and what they may consider to be the best answers. One possible approach is to devise predefined query and result sets from content that is indexed for the evaluation. Document and note where the 'correct answer' is found, including order and page (for example, first hit - first page, thirteenth hit - second page, not found).
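The predefined query and result sets described above could be recorded and summarised along the following lines. This is a minimal sketch: the page size of ten results, the URLs and the function names are all assumptions. The mean reciprocal rank used as a summary figure is a common information-retrieval measure, not one named in this checklist.

```python
RESULTS_PER_PAGE = 10  # assumption: ten results are shown per page


def locate(results, correct_url):
    """Return (hit number, page number) of the correct answer, or None if absent."""
    for position, url in enumerate(results, start=1):
        if url == correct_url:
            page = (position - 1) // RESULTS_PER_PAGE + 1
            return position, page
    return None


def mean_reciprocal_rank(findings):
    """Average of 1/hit number over all queries; 'not found' counts as zero."""
    return sum((1 / f[0]) if f else 0 for f in findings) / len(findings)


# Three hypothetical result lists for queries whose correct answer is "/answer".
first_page_hit = ["/answer"] + [f"/other-{i}" for i in range(19)]
second_page_hit = (
    [f"/other-{i}" for i in range(12)]
    + ["/answer"]
    + [f"/other-{i}" for i in range(12, 19)]
)
no_hit = [f"/other-{i}" for i in range(20)]

findings = [locate(r, "/answer") for r in (first_page_hit, second_page_hit, no_hit)]
print(findings)  # [(1, 1), (13, 2), None]
print(mean_reciprocal_rank(findings))
```

Running the same query set against each candidate product, with the same content indexed, gives figures that can be compared directly.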

Having access to administrator functions allows further investigation and evaluation of the ease of administration, including indexing and configuration management.

Check box Assess other features

As part of the evaluation and selection of a search engine, agencies may wish to consider the following questions:

Designing the search interface

Check box Focus on meeting general user needs

While specific users (such as library, website and intranet teams) have specialist skills in searching for information, they are not necessarily representative of the general user population, and agencies may decide that these groups will not be the primary focus of the design process.

Unless there is a specifically targeted group of users, consider designing the search interface to cater to the general audience of the site.

Check box Keep the search form simple

Consider providing only two key elements for the main search interface: a field to enter the search terms, and a 'Search' button. While other fields or drop-down lists can be provided, research has found that these are not well understood or used by general users (see Jakob Nielsen's Alertbox, www.useit.com [External Site]).

Check box Ensure that the search facility is prominent on the home page

As many users may wish to use the search facility from the home page, consider making it prominent there.

Check box Make the search facility available throughout the site

To promote the provision of a consistent user experience between government websites, AGIMO recommends that common navigation elements, including 'Home', 'About', 'Contact us' and 'Search', are displayed at the 'head' of each page. The search link either should be accompanied by a search box or should provide a link to the site's search page.

Further information about common design elements and branding for the Australian Government is available at http://webpublishing.agimo.gov.au/Visual_Design_and_Branding [External Site].

Check box Consider whether advanced search options should be provided

Advanced search options enable the user to enter more complex searches with the objective of identifying more specific results. Advanced search options may include the use of Boolean operators ('and', 'or', 'not', 'near', etc).

While an advanced search interface is useful for specialist audiences (such as librarians, researchers and other skilled users), consider whether it should be hidden or downplayed.

If an advanced search is provided, agencies may decide that the majority of the resources involved in implementation should still be devoted to ensuring the effectiveness of the default 'simple' search.

Check box Test search interfaces with users

The most effective way of ensuring that the search facility meets user needs is to test the interfaces with users. Further information on testing with users is available in Better Practice Checklist 3, Testing Websites with Users.

Designing the results pages

Users can easily be overwhelmed by the prospect of browsing through thousands of matches when looking for a single page. Search results presented in a clear, inviting and user-friendly format are likely to be well received by users.

Check box Minimise the details provided for each result

Browsing search results can be made much easier by restricting the detail provided for each result. Agencies may consider providing the following in their search results:

While public search engine installations often provide other details in search results, such as size in bytes and links to other similar matches, this may not be helpful for most users. If any additional details are included, consider presenting them in smaller or lighter type, so as not to distract the user from the key information.

Check box Ensure that descriptions are meaningful

Consider providing a meaningful abstract or summary for each search result. This can be sourced from metadata or automatically abstracted by the search software.

If automated abstracts are used, care needs to be taken to ensure that they do not include irrelevant text, such as the standard header and the navigation elements that appear on all pages. Most search engines provide 'start index' and 'stop index' markers that can be used to exclude these elements of the page (the syntax of these markers varies between search engines).
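The effect of such markers can be illustrated with a short sketch. The marker syntax used here (HTML comments reading "start index" and "stop index") is hypothetical; the markers a given search engine recognises should be taken from its own documentation.

```python
import re

# Hypothetical marker syntax; real search engines each define their own.
START = "<!-- start index -->"
STOP = "<!-- stop index -->"


def indexable_text(html):
    """Return only the text found between start and stop markers."""
    pattern = re.escape(START) + r"(.*?)" + re.escape(STOP)
    sections = re.findall(pattern, html, flags=re.DOTALL)
    text = " ".join(sections)
    text = re.sub(r"<[^>]+>", " ", text)  # crude tag removal, for illustration only
    return " ".join(text.split())  # collapse runs of whitespace


page = """<html><body>
<div id="nav">Home | About | Contact us | Search</div>
<!-- start index -->
<h1>Opening hours</h1>
<p>The office is open 9am to 5pm on weekdays.</p>
<!-- stop index -->
<div id="footer">Last modified: 30 January 2009</div>
</body></html>"""

print(indexable_text(page))
# Opening hours The office is open 9am to 5pm on weekdays.
```

The navigation bar and footer fall outside the markers, so they appear in neither the index nor any automatically generated abstract.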

Configuring the search engine

From the user's perspective, the search engine should work 'like magic', returning the desired information even when only a few (potentially mistyped) words are entered. This is achievable if the search engine configuration is fine-tuned.

Check box Consider 'and-ing' search terms by default

Users may expect that if they type in more words, they will get fewer (and more specific) search results. They may also expect that the documents returned will contain all of the terms they entered. For this reason, it may be appropriate to have the search engine 'and' together the terms by default, particularly when searching across large information resources.
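The difference between 'and-ing' and 'or-ing' terms can be seen in a toy example. The pages and the `search` function below are hypothetical and stand in for a real search engine's index and query options.

```python
# A hypothetical three-page site: URL -> page text.
DOCUMENTS = {
    "/passports/renew": "renew your passport online",
    "/passports/fees": "passport fees and charges",
    "/licences/renew": "renew a fishing licence",
}


def search(query, match_all=True):
    """Return URLs matching all terms ('and') or at least one term ('or')."""
    terms = query.lower().split()
    combine = all if match_all else any
    return sorted(
        url
        for url, text in DOCUMENTS.items()
        if combine(term in text.split() for term in terms)
    )


print(search("renew passport"))
# ['/passports/renew']
print(search("renew passport", match_all=False))
# ['/licences/renew', '/passports/fees', '/passports/renew']
```

With 'and' as the default, adding a second word narrows the results to the single page about renewing a passport; with 'or', the same extra word widens them.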

Check box Consider the use of URLs, links and anchor text to 'bring forward' important pages

In any large site, common searches can return thousands of results. Textual similarity of the query to document content, title or metadata may not always be a reliable guide to which are the most important resources on the subject of the query. For example, in many universities, thousands of documents match the commonly submitted query 'library'. Each set of minutes of a library acquisition committee contains the word 'library' in its title, repeated many times in its content and prominently in its subject and description metadata. None of these minute pages is as good an answer as the library homepage, from where a visitor may readily find the catalogue, the opening hours, the location of the branches and links to the acquisition committees. But the library homepage is not readily distinguished from all the other minor documents by title, metadata or content. What tells a search engine that it is the best answer may be click-through data (searchers tend to click on this document when they type the query 'library'), URL structure (library.xxx.edu.au or www.xxx.edu.au/library/) and, most importantly, thousands of incoming links using the word 'library' in their anchor text.
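The library example can be reduced to a small sketch in which incoming anchor text is counted as 'votes' for a page. The pages, links and ranking rule below are hypothetical and far simpler than what a real search engine does; the sketch handles single-word queries only.

```python
# Hypothetical pages: URL -> page text.
PAGES = {
    "/library/": "library home catalogue opening hours branches",
    "/committees/acquisitions/minutes-2003-11":
        "library acquisition committee minutes library budget library",
    "/committees/acquisitions/minutes-2003-12":
        "library acquisition committee minutes library library library",
}

# Hypothetical incoming links: (target URL, anchor text).
INCOMING_LINKS = [
    ("/library/", "Library"),
    ("/library/", "University library"),
    ("/library/", "library"),
    ("/committees/acquisitions/minutes-2003-12", "December minutes"),
]


def rank(query):
    """Rank matching pages by anchor-text votes first, then by term frequency."""
    term = query.lower()

    def sort_key(url):
        anchor_votes = sum(
            1
            for target, anchor in INCOMING_LINKS
            if target == url and term in anchor.lower().split()
        )
        term_count = PAGES[url].split().count(term)
        return (-anchor_votes, -term_count, url)

    matching = [url for url, text in PAGES.items() if term in text.split()]
    return sorted(matching, key=sort_key)


print(rank("library"))
```

Although the committee minutes mention 'library' more often, the homepage is ranked first because three incoming links use the word in their anchor text.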

Check box Consider search engine 'best bets'

Analysis of search query logs may show commonly typed queries which are either misspelled or use different terminology to that of the agency and as a result do not return the most useful results. For example, users may type 'dole' where the agency uses 'unemployment benefit'. These problems can be partly overcome by spelling correction, thesaurus expansion and similar techniques, but such techniques may require a two-stage interaction. Including non-preferred vocabulary among metadata can help, and so can 'best bets'. 'Best bets' establishes a separate database of 'key' pages for a given subject. When the user enters search terms, these are looked up in this database, and any matching entries are listed at the very beginning of the results page. For example, on a bank website, the home loan centre homepage can be presented as a best bet in response to any query which includes 'mortgage', 'home loan', 'property finance', 'housing' etc.

By clearly marking these results as the 'best bets', users may have more confidence in their quality. The introduction of a human element into the creation of the 'best bets' allows a high-quality collection of perhaps a few hundred pages to be identified out of the tens of thousands that may exist.
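A 'best bets' lookup can be as simple as a hand-maintained table consulted before the ordinary results are shown. In the sketch below, the table contents, URLs and function names are hypothetical, and the substring matching is deliberately naive.

```python
# Hypothetical, hand-maintained table: trigger phrase -> key page.
BEST_BETS = {
    "mortgage": "/home-loans/",
    "home loan": "/home-loans/",
    "property finance": "/home-loans/",
    "housing": "/home-loans/",
}


def best_bets_for(query):
    """Return key pages whose trigger phrase appears in the query."""
    lowered = query.lower()
    pages = []
    for phrase, url in BEST_BETS.items():
        if phrase in lowered and url not in pages:
            pages.append(url)
    return pages


def results_page(query, engine_results):
    """Place best bets first, clearly separated from the ordinary results."""
    bets = best_bets_for(query)
    ordinary = [url for url in engine_results if url not in bets]
    return {"best_bets": bets, "results": ordinary}


print(results_page("Home loan rates", ["/rates/", "/home-loans/", "/news/2004/rates"]))
# {'best_bets': ['/home-loans/'], 'results': ['/rates/', '/news/2004/rates']}
```

Keeping the best bets in a separate list lets the results page label them distinctly, as suggested above.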

Check box Implement search engine synonyms

There are a number of common reasons for users not finding information when searching. For example:

These can be resolved by implementing a list of synonyms in the search engine. When the user enters a term, it is looked up in the synonym list, and any equivalent words are also included in the search terms. In this way, the desired information can be found even if the search terms do not exactly match the content of the pages.

Synonyms can also be used to increase the extent to which users from culturally and linguistically diverse backgrounds can access the resources on the site. By providing, in the synonym list, terms in other languages spoken by target groups, foreign language terms entered can retrieve English language resources, and vice versa. Further information about making websites more accessible to people from diverse backgrounds is available in Better Practice Checklist 19, Access and Equity Issues for Websites.
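Synonym expansion of the kind described above might look like the following sketch. The synonym groups are illustrative only; a real list would be built from an agency's own query logs and terminology.

```python
# Hypothetical synonym groups; every term in a group is treated as equivalent.
SYNONYM_GROUPS = [
    {"dole", "unemployment"},
    {"car", "vehicle", "automobile"},
]


def expand(query):
    """Return one list of alternatives per query term.

    Alternatives within a list are 'or-ed' together by the search engine;
    the lists themselves can still be 'and-ed'.
    """
    expanded = []
    for term in query.lower().split():
        group = next((g for g in SYNONYM_GROUPS if term in g), {term})
        expanded.append(sorted(group))
    return expanded


print(expand("car registration"))
# [['automobile', 'car', 'vehicle'], ['registration']]
```

Terms in other languages could be added to the same groups so that they retrieve the English language resources.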

Check box Consider spell-checking

A number of search engines provide the option of spell-checking the terms entered by users, which can be useful in assisting users to find the desired documents regardless of typographical or spelling errors.
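One simple way to offer a 'did you mean' suggestion is to compare each unrecognised term with the words that actually occur in the indexed content. The sketch below uses `difflib.get_close_matches` from the Python standard library; the vocabulary is hypothetical, and commercial search engines are likely to use more sophisticated methods.

```python
import difflib

# Hypothetical vocabulary drawn from the indexed pages.
VOCABULARY = ["passport", "pension", "visa", "licence"]


def suggest(term):
    """Return a likely correction, or None if the term is known or has no close match."""
    term = term.lower()
    if term in VOCABULARY:
        return None
    matches = difflib.get_close_matches(term, VOCABULARY, n=1, cutoff=0.6)
    return matches[0] if matches else None


print(suggest("pasport"))  # passport
print(suggest("visa"))     # None (already a known word)
```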

Check box Consider stemming and 'fuzzy matching'

Stemming and fuzzy matching are designed to provide matches even when the user does not enter exactly the same terms as those used in the pages. While they can be useful, they can also have the side effect of considerably increasing the number of search results returned and may reduce the overall quality of matches.

'Stemming' is the process whereby the search engine automatically adds different endings to the words entered. For example, 'walk' would also find 'walks', 'walked' and 'walking'. This can be useful for managing inconsistencies in the documents being searched, as well as for addressing the differences between Australian/British spelling and North American spelling.
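The 'walk' example can be reproduced with a deliberately naive sketch that simply appends a fixed list of endings. The list of endings is an assumption, and the approach does not cope with irregular forms (for example 'run' and 'running'); production search engines use purpose-built stemming algorithms instead.

```python
ENDINGS = ["", "s", "ed", "ing"]  # illustrative list only


def expand_endings(word):
    """Return the word with each of the illustrative endings appended."""
    return [word + ending for ending in ENDINGS]


def matches(query_word, document_text):
    """True if the document contains the query word with any of the endings."""
    variants = set(expand_endings(query_word.lower()))
    return any(token in variants for token in document_text.lower().split())


print(expand_endings("walk"))  # ['walk', 'walks', 'walked', 'walking']
print(matches("walk", "She walked to work"))  # True
print(matches("walk", "A talk on walkways"))  # False
```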

'Fuzzy matching' is the name given to the general class of options that matches words that 'sound like' those entered by the user. The specific method of doing this varies between search tools and is referred to by different names ('soundex', 'phonetic matching', etc). Fuzzy matches are best presented as suggestions, rather than being automatically applied, to avoid topic drift. For example, a soundex algorithm rates lion, lean, lan, lawn, lane and logan (among many others) as equivalent to 'loan'.
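The topic drift described above is easy to reproduce. The sketch below implements the widely used American variant of Soundex; other variants differ in detail, so the exact set of words treated as equivalent depends on the product in use.

```python
# Letter-to-digit table for American Soundex; vowels, h, w and y are not coded.
CODES = {}
for letters, digit in [
    ("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
    ("l", "4"), ("mn", "5"), ("r", "6"),
]:
    for letter in letters:
        CODES[letter] = digit


def soundex(word):
    """Return the four-character American Soundex code for a word."""
    word = "".join(ch for ch in word.lower() if ch.isalpha())
    if not word:
        return ""
    encoded = word[0].upper()
    previous = CODES.get(word[0], "")
    for ch in word[1:]:
        code = CODES.get(ch, "")
        if code and code != previous:
            encoded += code
        if ch not in "hw":  # h and w do not separate letters with the same code
            previous = code
    return (encoded + "000")[:4]


for word in ["loan", "lion", "lean", "lan", "lawn", "lane"]:
    print(word, soundex(word))  # every word here is coded L500
```

Because unrelated words share a code with 'loan', applying such matches automatically would mix unrelated pages into the results.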

Check box Consider 'federated' searching

'Federated' searching recognises that the information that the user is looking for may be drawn from a wide range of sources, including:

In this environment, 'federated' search queries each of these information sources in turn, using whatever form of searching is required to match the format and nature of the individual repository.

These results are then brought together and synthesised into a single meaningful list. Note that, as part of this, differences between the information sources are typically resolved in order to provide a consistently high-quality set of search results. Alternatively, the search results may clearly distinguish between information from the various sources.
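As a rough illustration of how results from several sources might be brought together, the sketch below queries two hypothetical repositories that score results on different scales, rescales each source's scores against its own best hit, and merges the lists. The source names, canned results and normalisation rule are all assumptions; real products resolve differences between sources in their own ways.

```python
def search_intranet(query):
    """Hypothetical source returning (title, score) with scores between 0 and 1."""
    return [("Leave policy", 0.9), ("Leave form", 0.4)]


def search_records_system(query):
    """Hypothetical source that scores results on a different scale."""
    return [("Leave policy review 2003", 42), ("Leave statistics", 21)]


SOURCES = {"intranet": search_intranet, "records": search_records_system}


def federated_search(query):
    """Query each source in turn and merge the results into a single list."""
    merged = []
    for source_name, search in SOURCES.items():
        hits = search(query)
        top = max((score for _, score in hits), default=0)
        for title, score in hits:
            normalised = score / top if top else 0.0
            merged.append((normalised, source_name, title))
    merged.sort(key=lambda hit: (-hit[0], hit[1], hit[2]))
    return [(title, source_name) for _, source_name, title in merged]


for title, source_name in federated_search("leave"):
    print(f"{title} [{source_name}]")
```

Keeping the source name with each result also supports the alternative presentation, in which results from the various sources are clearly distinguished.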

Check box Consider 'faceted' searching

Different users are likely to search for items in many different ways. For example, a cooking recipe could be found according to ingredients, geographic region, style of cooking, cost, difficulty, equipment required, type of diet (vegetarian, etc) or other aspects of the recipe.

Each such aspect of an item is called a 'facet'. Once items are classified in this way, a 'faceted' search provides a mechanism for browsing through the classifications, with each step narrowing down the number of matching items.

This is a powerful method, which can be very relevant for large repositories of information or for sites, such as portals, that aggregate a diverse range of topics.
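The narrowing behaviour of a faceted search can be sketched with the recipe example. The recipes, facet names and functions below are hypothetical.

```python
from collections import Counter

# Hypothetical items, each classified under several facets.
RECIPES = [
    {"name": "Pad thai", "region": "Thailand", "diet": "vegetarian", "difficulty": "medium"},
    {"name": "Green curry", "region": "Thailand", "diet": "meat", "difficulty": "medium"},
    {"name": "Minestrone", "region": "Italy", "diet": "vegetarian", "difficulty": "easy"},
]


def narrow(items, **selected):
    """Keep only the items that match every selected facet value."""
    return [
        item for item in items
        if all(item[facet] == value for facet, value in selected.items())
    ]


def facet_counts(items, facet):
    """Count the remaining items under each value of a facet."""
    return Counter(item[facet] for item in items)


vegetarian = narrow(RECIPES, diet="vegetarian")
print([r["name"] for r in vegetarian])             # ['Pad thai', 'Minestrone']
print(dict(facet_counts(vegetarian, "region")))    # {'Thailand': 1, 'Italy': 1}
print([r["name"] for r in narrow(vegetarian, region="Italy")])  # ['Minestrone']
```

Showing the counts beside each facet value tells the user how many items remain before they take the next step.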

More information on this can be found in the article at www.searchtools.com/info/faceted-metadata.html [External Site].

Check box Consider the use of taxonomies

A taxonomy or thesaurus is used to classify items according to the subjects to which they belong. These are most often seen in library collections, records and document management systems, or even 'back of the book' indexes.

While taxonomies have traditionally been considered to be distinct from full-text search engines, there are now solutions that combine them to form an effective hybrid 'searching/browsing' interface. The best-known example of this is the Yahoo directory. Entering search terms brings up matching sites, as well as a listing of the categories to which the sites belong. This allows the search to be easily broadened to other sites in the same category.

Recent releases of enterprise search engines have started to provide similar capabilities for use on websites or intranets. Like the other advanced search methods outlined, these could be explored for larger and more complex sites. There may be considerable resource implications with this method.

Other Better Practice Checklists

  1. Providing Forms Online
  2. Website Navigation
  3. Testing Websites with Users
  4. Use of Cookies in Online Services
  5. Providing an Online Sales Facility
  6. Use of Metadata for Web Resources
  7. Archiving Web Resources
  8. Managing Online Content
  9. Selecting a Content Management System
  10. Implementing a Content Management System
  11. Website Usage Monitoring and Evaluation
  12. Online Policy Consultation
  13. Knowledge Management
  14. Designing and Managing an Intranet
  15. Information Architecture for Websites
  16. Implementing an Effective Website Search Facility
  17. Spatial Data on the Internet
  18. Digitisation of Records
  19. Access and Equity Issues for Websites
  20. Marketing E-government
  21. ICT Support for Telework
  22. Assistive Technology for Employees of the Australian Government
  23. Decommissioning Government Websites
  24. ICT Asset Management
  25. Managing the Environmental Impact of ICT



Contact for information on this page: AGIMO Better Practice Team



Last Modified: 30 January, 2009