Better Practice Checklist - 16. Implementing an Effective Website Search Facility
May 2004 (organisational details updated January 2008)
Introduction
Australian Government departments and agencies place increasing amounts of information on their websites, extranets and intranets. As the volume of material increases, the need for effective search facilities to help users locate specific information on these sites similarly increases. A key role of the Australian Government Information Management Office (AGIMO), Department of Finance and Deregulation is to identify and promote 'Better Practice'. This checklist has been created to help agencies maximise their use of new technologies by ensuring that their sites have effective search facilities.
This checklist suggests that a number of issues should be considered when designing and implementing search facilities. The items in the checklist are, however, not mandatory. The checklist has been provided as a guide to help agencies to consider a range of issues that may improve search facilities.
This checklist has been created for staff responsible for websites, including those in website or intranet teams. The information within this checklist may also be relevant to senior program managers, IT managers and others. This checklist focuses on non-technical issues.
It should be noted that the checklist is not intended to be comprehensive. Rather, it highlights key issues for agencies. The checklist is iterative and draws on the expertise and experience of practitioners. The subject matter and issues are reviewed and updated to reflect developments.
Download PDF of Checklist 16 - Implementing an Effective Website Search Facility (324 KB)
Acknowledgments
This checklist was developed with the assistance of Australian Government agencies. In particular, we would like to thank the Attorney-General's Department, the Bureau of Meteorology, the Health Insurance Commission, CSIRO and the Content Management Community of Practice.
Implementing an effective search facility
Providing an effective search facility on websites, extranets or intranets involves more than just installing a search engine package 'out of the box'. Effective search facilities match the site they support and the site's users. They are tested with real content and users and refined as appropriate. A search facility should be able to give priority to important pages such as key pages and policy documents.
This checklist focuses on a range of issues to consider when selecting, designing or configuring a search engine for a website, extranet or intranet. While a range of suggestions are made regarding better practice approaches, the most effective way of ensuring that the search facility is useful is to conduct usability evaluations with real users and content. Further information about user testing is available in Better Practice Checklist 3, Testing Websites with Users.
This checklist does not cover the techniques involved in improving the ranking of sites in public search engines (such as Google and Yahoo). These approaches are often called 'search engine optimisation' (SEO), and a range of useful resources relating to this can be found on www.searchtools.com.
Enterprise search engines make use of the same links, anchor text and URL structure that the leading public search engines use to rank important resources above those which are merely relevant. Improving site structure and interlinking, using simple URLs and using descriptive anchor text can therefore improve search results from Google and Yahoo! as well as site-search results. (Anchor text refers to the words which people click on in a browser in order to follow a link.)
Note that in this checklist the term 'search facility' is used to describe the provision of search capabilities on sites. 'Search engine' is used to describe the software product that supports the facility.
Summary of Checkpoints
Before starting
Consider whether a search facility is necessary
Consider whether a search engine needs to be procured
Consider users' experiences with public search engines
Consider the different needs of website and intranet search facilities
Undertake housekeeping of site content, site structure, URLs, links and metadata
Identify requirements
Identify and document the business requirements for the search facility
Identify and document the technical requirements for the search facility
Consider any requirements for the search engine to integrate with other information systems
Prioritise and weight requirements for software selection, focusing on maximising the ability of searchers to find what they need
Evaluation and selection
Explore available options
Visit reference sites
Focus on usability, simplicity and effectiveness
Assess the vendor's implementation methodology
Use scenarios during demonstrations
Consider the total cost of ownership
Consider product-pricing models
Consider having the vendor set up a proof-of-concept search facility
Assess other features
Designing the search interface
Focus on meeting general user needs
Keep the search form simple
Make the search facility available throughout the site
Provide scoped searches of important subsites where appropriate but make it clear what the scope is
Ensure that the search facility is prominent on the home page
Consider whether advanced search options should be provided
Test search interfaces with users
Designing the results pages
Minimise the details provided for each result
Ensure that descriptions are meaningful
Configuring the search engine
Consider 'and-ing' search terms by default
Use URL structure, links and descriptive anchortext to 'bring forward' important pages
Implement search engine synonyms
Consider spell-checking
Consider stemming and 'fuzzy matching'
Consider 'federated' searching
Consider 'faceted' searching
Consider the use of taxonomies
Checkpoints
Before starting
Consider whether a search facility is necessary
Not all websites need a search facility, and agencies may wish to assess whether a particular site needs one. However, if a website contains more than fifty pages, agencies should consider providing a search facility. To assist with this decision, agencies may wish to consider:
- site objectives and purpose
- user demographics, experience and expectations
- site structure and design, including navigational features
- site content
- user feedback.
Consider whether a search engine needs to be procured or whether an existing search facility can be used
If the initial analysis indicates that the site does need a search facility, it is worth considering whether this need could be met by leveraging off an already existing search capability. For example, some Internet search companies offer the ability for external sites to 'tap into' their content indexes, enabling a site-specific search facility to be provided without procuring and implementing a separate search engine. In addition, some companies provide an Application Service Provider (ASP) service for search facilities. For a monthly fee they will remotely index specified sites and provide site search interfaces.
These can be cost-effective solutions that can be implemented quickly. A condition of use of this approach may be that the search interface and results pages will be branded with the search company's logo/brand and that agencies may have little or no control over the accuracy, relevancy, completeness or currency of what is indexed.
Consider that users' experiences with public search engines may influence their expectations
The popularity of public search facilities such as Google is increasingly influencing users' expectations, not only in ease of use and design but also in how accurately and quickly a search is conducted. Many users now expect that they can simply type in a word or two and will quickly find the desired information. Emulating this user experience may require considerable investment in appropriate search technologies, work to ensure that the site is well structured and uses good anchor text and simple URLs, and ongoing expertise to monitor the system and ensure that it is optimally configured.
Due to their experience with search facilities such as Google, users may also have preferences in terms of the layout and design of the search facility. Layout and design similar to those provided by public search facilities may be preferable to a number of users.
Consider the different needs of website and intranet search facilities
Differences between websites and intranets will impact upon the business and technical requirements for a suitable search solution. Differences in users, technologies, content, and security and privacy requirements may need to be considered.
While it may be appropriate for organisations to implement the same search engine for both intranet and websites, the different needs of the users of these sites may lead to the implementation of the search engine in different ways, and the provision of different search facilities. For example, the search interface, capabilities, and the way in which results are displayed may differ between the Internet and the intranet search facilities, although both searches are generated by the same search software.
Undertake housekeeping of site content, structure, URLs, links and metadata
The effectiveness of a search facility is only as good as the quality of the information it searches. The deployment of a powerful search engine may highlight inconsistencies or problems with the underlying site content. In preparation, agencies may wish to review content on the site (particularly structure, simplicity of URLs, links and descriptive anchor text) and their content management processes.
In addition, agencies may find it useful to review the AGLS metadata applied to key resources and their overall metadata strategies. Further information about metadata is available in Better Practice Checklist 6, Use of Metadata for Web Resources.
Identify requirements
Identify and document the business requirements for the search facility
In consultation with stakeholders, identify and document the business requirements for a search facility. Each agency may have a unique set of requirements and may wish to consider:
- functions of the website (for example, intranet, website, extranet)
- goals and skills of the users
- size and structure of the site
- whether content is static or dynamic
- format of the content (HTML, PDF, etc)
- location of content (for example, databases, flat files, proprietary content management systems)
- security and privacy considerations
- other agency-wide applications of the search facility.
Identify and document the technical requirements for the search facility
The IT environment of the agency, including the software (such as operating system and applications), databases and hardware platform, needs to be identified. Specifying exact requirements may allow the compatibility of the search facility with the current agency environment to be meaningfully assessed.
Performance and scalability are technical requirements that may need to be specified. For example, growth in the number of documents indexed may impact upon storage capacity, bandwidth and product licence costs.
Consider any requirements for the search engine to integrate with other information systems in the short, medium and long terms
There is an increasing requirement for search engines to be more tightly integrated with the information repositories that they search, such as records, document or content management systems. Agencies may need to consider these requirements, particularly when products such as content management systems 'dynamically' publish information directly out of databases.
Dynamic publication from these systems can, if not implemented with due care, lower search effectiveness by creating multiple near-copies of the same information and by discouraging the use of links and descriptive anchor text. Other websites can be reluctant to link to obviously dynamic URLs because of the risk that they will change or disappear. Search quality can be improved significantly by ensuring that all dynamically published webpages worth linking to have simple, intuitive URLs: for example, xxx.gov.au/checklists/ rather than xxx.gov.au/xxx?ObjectID=3E48F86-AA1A-11A1-B6300060B0AA00014.
Many users search for government information using public Internet search engines such as Google and Yahoo - and these rely on good URLs, links and anchortext.
An agency may wish to verify that an integrated search engine is capable of ranking relevant site entry pages above less important individual pages which contain the query words in content or metadata.
Prioritise and weight requirements for software selection, focusing on maximising the ability of searchers to find what they need
To assist product evaluation and differentiation, agencies may consider the priority and weightings they will assign to the attributes of the products they require. For example, attributes could be prioritised as Mandatory, Highly Desirable or Desirable and assigned a numerical weighting to indicate their relative importance.
The combination of priority and weight can then be used to generate an overall score for the product. It should not be forgotten, however, that the ultimate aim of any search facility is to guide searchers (at least those within the target group) as efficiently and effectively as possible to the information and services they need.
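The scoring approach described above can be sketched in a few lines of Python. This is an illustration only: the requirement names, the weights, the 0 to 5 rating scale and the rule that a product failing any Mandatory requirement is excluded are all assumptions made for the example, not recommendations of this checklist.

```python
# Hypothetical requirements: (name, priority, weight). All values are illustrative.
REQUIREMENTS = [
    ("Indexes PDF documents", "Mandatory", 5),
    ("Supports AGLS metadata fields", "Highly Desirable", 3),
    ("Highlights search terms in results", "Desirable", 1),
]


def score_product(ratings):
    """Return a weighted score, or None if a Mandatory requirement is unmet.

    ratings maps a requirement name to a rating from 0 (not met) to 5 (fully met).
    """
    total = 0
    for name, priority, weight in REQUIREMENTS:
        rating = ratings.get(name, 0)
        if priority == "Mandatory" and rating == 0:
            return None  # assumed rule: failing a mandatory item excludes the product
        total += weight * rating
    return total


product_a = {
    "Indexes PDF documents": 4,
    "Supports AGLS metadata fields": 5,
    "Highlights search terms in results": 0,
}
product_b = {
    "Indexes PDF documents": 0,
    "Supports AGLS metadata fields": 5,
    "Highlights search terms in results": 5,
}

print(score_product(product_a))  # 35
print(score_product(product_b))  # None
```

A score of this kind supports comparison between products but does not replace testing with real users and content.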
Evaluation and selection
The evaluation and selection process for a search engine would be expected to be similar to the process for the purchase of other strategic business systems. This may include conducting an RFI, RFQ or tender process carried out in accordance with the Commonwealth Procurement Guidelines.
Most agencies will have procurement units to provide advice and assistance with this process.
Explore available options
A large number of search tools are available, ranging from freeware to solutions costing hundreds of thousands of dollars. Agencies may wish to evaluate how well a range of products meet their specific business requirements.
Sources of information include:
- reviews by industry analysts
- reviews and evaluations undertaken by other organisations
- experiences of other government agencies
- vendor information
- communities of practice, mailing lists, etc.
Visit reference sites
Consider asking vendors to provide contact details for at least three similar organisations where their search solution has been installed. These sites may then be visited (without the vendor) to determine how well the product works in practice.
Focus on usability, simplicity, and effectiveness
While a number of products may match the identified functional requirements, it can be useful to consider the overall usability and simplicity of the system. Is it an effective tool for finding what users need? Complex interfaces, difficult administration and configuration options may impact upon training needs and ongoing support arrangements. Poor search effectiveness may result in increased volume of telephone or in-person requests to be handled individually by staff. It may also result in reduced benefit from an agency's initiatives and services because users are unable to find them.
Assess the vendor's implementation methodology
Agencies will typically look to the vendor (or implementation partner) for assistance in the initial installation and configuration of a search facility. Agencies may wish to assess the implementation methodology proposed by the vendor, paying particular attention to the technical aspects (such as installation, configuration management, scripting requirements, administration requirements, customisation options) and skills transfer.
Use scenarios during demonstrations
Scenarios are an effective way of ensuring that vendor demonstrations clearly show how the search facility will meet the specific business needs. They also allow demonstrations to be compared and rated. Further information about the use of scenarios is available in Better Practice Checklist 14, Designing and Managing an Intranet.
Consider the total cost of ownership
As well as initial implementation costs, agencies may wish to look at the total cost of ownership when evaluating search facilities. This can include:
- the amount of customisation required
- the technical skills and knowledge required by internal staff
- the degree of ongoing reliance on the vendor
- licensing models and fees
- the third-party products required
- the IT infrastructure (hardware and software) required.
It is also appropriate to consider the return on investment in an effective search facility. Sometimes returns are easy to quantify, such as reduced reliance on call-centres when clients can find answers to questions themselves. In other cases, the benefit may lie in increased staff productivity, higher quality of advice (because all available information has been found) or more effective delivery of services and information to users.
Consider product-pricing models
Different products are likely to have different pricing models. These models may be:
- document-based - that is, costs based on the number of documents indexed
- site-based - that is, costs based on the number of sites indexed
- feature-based - that is, additional costs for specific functionality
- infrastructure-based - that is, costs based on the number of CPUs or number of servers, or
- combinations of the above.
Consider specifying the preferred pricing model to enable vendors to offer the most attractive price to meet the agency's requirements.
Consider having the vendor set up a proof-of-concept search facility
A proof-of-concept search facility, based on the specific content holdings of the site, can help to identify issues relating to establishing and administering the product and evaluating search performance.
Note that search engines are highly configurable, and this may make it difficult to compare like with like in any demonstration, unless what is configured during any demonstrations or proof-of-concept trials is carefully specified and managed.
Consider how to measure the success of a search. Identify what users are likely to search for and what they may consider to be the best answers. One possible approach is to devise predefined query and result sets from content that is indexed for the evaluation. Document and note where the 'correct answer' is found, including order and page (for example, first hit - first page, thirteenth hit - second page, not found).
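One way to record the outcome of such predefined query and result sets is sketched below. The page size of ten results and the use of mean reciprocal rank as a single summary figure are assumptions added for illustration; the checklist itself only asks that the position of the 'correct answer' be noted.

```python
def rank_of_answer(results, correct_url):
    """Return the 1-based position of correct_url in results, or None if absent."""
    for position, url in enumerate(results, start=1):
        if url == correct_url:
            return position
    return None


def describe(rank, page_size=10):
    """Report a rank as hit number and results page (page_size is an assumption)."""
    if rank is None:
        return "not found"
    page = (rank - 1) // page_size + 1
    return f"hit {rank}, page {page}"


def mean_reciprocal_rank(ranks):
    """Average of 1/rank over all test queries; a missing answer counts as 0."""
    return sum(1 / r if r else 0 for r in ranks) / len(ranks)


# Hypothetical ranks observed for three test queries.
ranks = [1, 13, None]
for rank in ranks:
    print(describe(rank))
print(round(mean_reciprocal_rank(ranks), 3))
```

Running the same query set against each candidate product, with the same content indexed, gives figures that can be compared directly.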
Having access to administrator functions allows further investigation and evaluation of the ease of administration, including indexing and configuration management.
Assess other features
As part of the evaluation and selection of a search engine, agencies may wish to consider the following questions:
- Is the system able to run on the agency's IT infrastructure - that is, hardware, operating system, web server, etc?
- Are the support arrangements appropriate? Are they conveniently located?
- Is the system scalable, and is the architecture appropriate?
- Does the system address security considerations?
- Is support provided for flexible automated scheduling?
- Are the licensing terms appropriate for the agency?
- Are there additional costs for additional functionality?
- What functionality can be customised or configured, and how easy is this to do?
- Can the extent and frequency of content refreshes be easily changed?
- Does the product support partial reindexing?
- Is it possible to store cached copies of content?
- Is it possible to highlight search terms in results pages?
- Will the system handle the volume of content to be indexed?
- Will the system handle the anticipated search load?
- What are expected response times when searching and displaying results?
- How efficient is the index processing in terms of time and disk space?
- Does the system support the ability to index according to AGLS metadata and other tagged fields?
- How easy and flexible is it to modify the search query and the search results pages?
- How easy is it to control the underlying search query processing - for example, stemming, boolean logic, spell-checking, wildcards, case sensitivity, punctuation, stop words?
- Does the system include support for thesaurus, synonyms, controlled vocabularies, etc?
- How is relevancy of search results determined, and how can the system be used to boost the relevance of certain results?
- Is there support for automated search query analysis - for example, top search terms, no finds, successful finds?
- Does the system support advanced searching?
- Can logging and management reports be generated?
- Does the system support non-text indexing - for example, images, multimedia video and audio?
- Does the system provide context-sensitive help?
Designing the search interface
Focus on meeting general user needs
While specific users (such as library, website and intranet teams) have specialist skills in searching for information, these are not necessarily representative of the general user population, and agencies may decide that these groups will not be the primary focus of the design process.
Unless there is a specifically targeted group of users, consider designing the search interface to cater to the general audience of the site.
Keep the search form simple
Consider providing only two key elements for the main search interface: a field to enter the search terms, and a 'Search' button. While other fields or drop-down lists can be provided, research has found that these are not well understood or used by general users (see Jakob Nielsen's Alertbox, www.useit.com).
Ensure that the search facility is prominent on the home page
As many users may wish to use the search facility from the home page, consider making it prominent there.
Make the search facility available throughout the site
To promote the provision of a consistent user experience between government websites, AGIMO recommends that common navigation elements, including 'Home', 'About', 'Contact us' and 'Search', are displayed at the 'head' of each page. The search link either should be accompanied by a search box or should provide a link to the site's search page.
Further information about common design elements and branding for the Australian Government is available at http://webpublishing.agimo.gov.au/Visual_Design_and_Branding.
Consider whether advanced search options should be provided
Advanced search options enable the user to enter more complex searches with the objective of identifying more specific results. Advanced search options may include the use of boolean operators ('and', 'or', 'not', 'near', etc).
While an advanced search interface is useful for specialist audiences (such as librarians, researchers and other skilled users), consider whether it should be hidden, or downplayed.
If an advanced search is provided, agencies may decide that the majority of the resources involved in implementation should still be devoted to ensuring the effectiveness of the default 'simple' search.
Test search interfaces with users
The most effective way of ensuring that the search facility meets user needs is to test the interfaces with users. Further information on testing with users is available in Better Practice Checklist 3, Testing Websites with Users.
Designing the results pages
Users can easily be overwhelmed by the prospect of browsing through thousands of matches when looking for a single page. Search results presented in a clear, inviting and user-friendly format are likely to be well received by users.
Minimise the details provided for each result
Browsing search results can be made much easier by restricting the detail provided for each result. Agencies may consider providing the following in their search results:
- page URL
- page title (as link to the document)
- document summary or abstract.
While public search engine installations often provide other details in search results, such as size in bytes and links to other similar matches, this may not be helpful for most users. If any additional details are included, consider presenting them in smaller or lighter type, so as not to distract the user from the key information.
Ensure that descriptions are meaningful
Consider providing a meaningful abstract or summary for each search result. This can be sourced from metadata or automatically abstracted by the search software.
If automated abstracts are used, care needs to be taken to help ensure that the text does not include irrelevant text, such as the standard header and the navigation elements that appear on all pages. Most search engines provide 'start index' and 'stop index' markers that can be used to exclude these elements of the page (the syntax of these markers varies between search engines).
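The effect of such markers can be illustrated with a short Python sketch. The marker strings used here are hypothetical, since each search engine defines its own syntax, and the tag stripping is deliberately crude.

```python
import re

# Hypothetical marker syntax; real markers differ between search engines.
START = "<!-- startindex -->"
STOP = "<!-- stopindex -->"


def indexable_text(html):
    """Keep only the text between start and stop markers, with tags removed."""
    pattern = re.escape(START) + r"(.*?)" + re.escape(STOP)
    kept = " ".join(re.findall(pattern, html, flags=re.DOTALL))
    text = re.sub(r"<[^>]+>", " ", kept)  # crude tag removal, for illustration only
    return " ".join(text.split())


page = (
    "<div>Home | About | Contact us | Search</div>"
    "<!-- startindex --><h1>Applying for a permit</h1>"
    "<p>Forms are available online.</p><!-- stopindex -->"
    "<div>Footer links</div>"
)

print(indexable_text(page))  # Applying for a permit Forms are available online.
```

The header and footer text never reaches the index, so it cannot appear in an automatically generated abstract.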
Configuring the search engine
From the user's perspective, the search engine should work 'like magic', returning the desired information even when only a few (potentially mistyped) words are entered. This is achievable if the search engine configuration is fine-tuned.
Consider 'and-ing' search terms by default
Users may expect that if they type in more words, they will get fewer (and more specific) search results. They may also expect that the documents returned will contain all of the terms they entered. For this reason, it may be appropriate to have the search engine 'and' together the terms by default, particularly when searching across large information resources.
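The difference between 'and-ing' and 'or-ing' terms can be seen in a minimal Python sketch. The three documents and the function name are invented for the example; a real search engine would use an index rather than scanning text.

```python
# Hypothetical page contents, keyed by page id.
DOCS = {
    "page1": "passport application form",
    "page2": "passport renewal fees",
    "page3": "application for a fishing licence",
}


def search(query, match_all=True):
    """Return ids of documents containing all (AND) or any (OR) query terms."""
    terms = query.lower().split()
    combine = all if match_all else any
    return sorted(
        doc_id
        for doc_id, text in DOCS.items()
        if combine(term in text.split() for term in terms)
    )


print(search("passport application"))                   # ['page1']
print(search("passport application", match_all=False))  # ['page1', 'page2', 'page3']
```

With 'and' as the default, adding a second word narrows the results to one page, which matches the expectation described above.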
Consider the use of URLs, links and anchortext to 'bring forward' important pages
In any large site, common searches can return thousands of results. Textual similarity of the query to document content, title or metadata may not always be a reliable guide to which are the most important resources on the subject of the query. For example, in many universities, thousands of documents match the commonly submitted query 'library'. Each set of minutes of a library acquisition committee contains the word 'library' in its title, repeated many times in its content and prominently in its subject and description metadata. None of these minute pages is as good an answer as the library homepage, from where a visitor may readily find the catalogue, the opening hours, the location of the branches and links to the acquisition committees. But the library homepage is not readily distinguished from all the other minor documents by title, metadata or content. What tells a search engine that it is the best answer may be click-through data (searchers tend to click on this document when they type the query 'library'), URL structure (library.xxx.edu.au or www.xxx.edu.au/library/) and, most importantly, thousands of incoming links using the word 'library' in their anchor text.
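The library example can be sketched as follows. The pages, links and scoring rule are all hypothetical: pages whose title matches the query are ordered by the number of incoming links whose anchor text contains the query, with shorter URLs preferred when counts are tied. Real search engines combine many more signals than this.

```python
# Hypothetical pages (URL -> title) and incoming links as (target URL, anchor text).
PAGES = {
    "/library/committee/minutes-2004-03.html": "Library acquisition committee minutes",
    "/library/committee/minutes-2004-04.html": "Library acquisition committee minutes",
    "/library/": "Library",
}
LINKS = [
    ("/library/", "library"),
    ("/library/", "university library"),
    ("/library/", "library home"),
    ("/library/committee/minutes-2004-03.html", "March minutes"),
]


def rank(query):
    """Order title matches by anchor-text evidence, then by URL length."""
    q = query.lower()
    scores = {}
    for url, title in PAGES.items():
        if q not in title.lower():
            continue  # only pages that match the query textually are candidates
        scores[url] = sum(
            1 for target, text in LINKS if target == url and q in text.lower()
        )
    return sorted(scores, key=lambda url: (-scores[url], len(url)))


print(rank("library")[0])  # /library/
```

All three pages match the query equally well on title alone; only the link evidence separates the homepage from the minutes.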
Consider search engine 'best bets'
Analysis of search query logs may show commonly typed queries which are either misspelled or use different terminology to that of the agency and as a result do not return the most useful results. For example, users may type 'dole' where the agency uses 'unemployment benefit'. These problems can be partly overcome by spelling correction, thesaurus expansion and similar techniques, but such techniques may require a two-stage interaction. Including non-preferred vocabulary among metadata can help and so can 'best bets'. A 'best bets' facility establishes a separate database of 'key' pages for a given subject. When the user enters search terms, these are looked up in this database, and any matching entries are listed at the very beginning of the results page. For example, on a bank website, the home loan centre homepage can be presented as a best bet in response to any query which includes 'mortgage', 'home loan', 'property finance', 'housing' etc.
By clearly marking these results as the 'best bets', users may have more confidence in their quality. The introduction of a human element into the creation of the 'best bets' allows a high-quality collection of perhaps a few hundred pages to be identified out of the tens of thousands that may exist.
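A 'best bets' lookup can be as simple as the following Python sketch, which reuses the bank example above. The URLs, the table contents and the use of plain substring matching are assumptions for illustration; substring matching would, for instance, also trigger on 'warehousing', so a production facility would match whole words or phrases.

```python
# Hypothetical best bets table, maintained by site editors.
BEST_BETS = {
    "mortgage": "/home-loans/",
    "home loan": "/home-loans/",
    "property finance": "/home-loans/",
    "housing": "/home-loans/",
}


def best_bets_for(query):
    """Return best-bet pages whose trigger phrase appears in the query."""
    q = query.lower()
    found = []
    for phrase, url in BEST_BETS.items():
        if phrase in q and url not in found:
            found.append(url)
    return found


def results_page(query, organic_results):
    """List best bets first, then the engine's own results, without duplicates."""
    bets = best_bets_for(query)
    return bets + [url for url in organic_results if url not in bets]


print(results_page("cheap home loan rates", ["/rates/", "/home-loans/"]))
# ['/home-loans/', '/rates/']
```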
Implement search engine synonyms
There are a number of common reasons for users not finding information when searching. For example:
- Users may enter the incorrect terminology.
- Pages may use a range of words for the one subject, so any search using one term will miss the results from pages using other terms.
- Users may make spelling mistakes or typographical errors.
- Users for whom English is not their first language may misspell or enter foreign language terms into search facilities.
These can be resolved by implementing a list of synonyms in the search engine. When the user enters a term, it is looked up in the synonym list, and any equivalent words are also included in the search terms. In this way, the desired information can be found even if the search terms do not exactly match the content of the pages.
Synonyms can also be used to increase the extent to which users from culturally and linguistically diverse backgrounds can access the resources on the site. By providing, in the synonym list, terms in other languages spoken by target groups, foreign language terms entered can retrieve English language resources, and vice versa. Further information about making websites more accessible to people from diverse backgrounds is available in Better Practice Checklist 19, Access and Equity Issues for Websites.
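Synonym expansion of this kind might look like the sketch below. The synonym list is hypothetical; the 'dole' entry is borrowed from the best bets discussion in this checklist, and the expanded terms would normally be 'or-ed' together with the original term by the search engine.

```python
# Hypothetical synonym list, maintained by the agency.
SYNONYMS = {
    "dole": ["unemployment benefit"],
    "organisation": ["organization"],
}


def expand(query):
    """Return the query terms followed by any synonyms from the list."""
    terms = query.lower().split()
    expanded = list(terms)
    for term in terms:
        for synonym in SYNONYMS.get(term, []):
            if synonym not in expanded:
                expanded.append(synonym)
    return expanded


print(expand("dole payments"))  # ['dole', 'payments', 'unemployment benefit']
```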
Consider spell-checking
A number of search engines provide the option of spell-checking the terms entered by users, which can be useful in assisting users to find the desired documents regardless of typographical or spelling errors.
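A simple form of spelling suggestion can be sketched with the `difflib` module from the Python standard library. The vocabulary and the similarity cutoff of 0.8 are assumptions; in practice the vocabulary could be drawn from the indexed content, and commercial search engines use their own methods.

```python
import difflib

# Hypothetical vocabulary; in practice this could be built from indexed site content.
VOCABULARY = ["passport", "pension", "medicare", "licence", "immigration"]


def suggest(term, cutoff=0.8):
    """Return the closest vocabulary word, or None if there is no better spelling."""
    term = term.lower()
    matches = difflib.get_close_matches(term, VOCABULARY, n=1, cutoff=cutoff)
    if matches and matches[0] != term:
        return matches[0]
    return None


print(suggest("pasport"))   # passport
print(suggest("passport"))  # None
```

Presenting the suggestion as a 'Did you mean' prompt, rather than silently changing the query, leaves the user in control.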
Consider stemming and 'fuzzy matching'
Stemming and fuzzy matching are designed to provide matches even when the user does not enter exactly the same terms as those used in the pages. While they can be useful, they can also have the side effect of considerably increasing the number of search results returned and may reduce the overall quality of matches.
'Stemming' is the process whereby the search engine automatically adds different endings to the words entered. For example, 'walk' would also find 'walks', 'walked' and 'walking'. This can be useful for managing inconsistencies in the documents being searched, as well as addressing the differences between Australian/British spelling and North American spelling.
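The 'walk' example can be reproduced with a deliberately naive suffix-stripping sketch. The three-entry suffix list and the minimum stem length are assumptions; real stemmers, such as the Porter algorithm, apply many more rules.

```python
SUFFIXES = ("ing", "ed", "s")  # assumed, deliberately tiny suffix list


def stem(word):
    """Strip one common English ending, keeping at least three letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def matches(query_term, document_word):
    """Treat two words as equivalent when they share a stem."""
    return stem(query_term) == stem(document_word)


print([stem(w) for w in ["walk", "walks", "walked", "walking"]])
# ['walk', 'walk', 'walk', 'walk']
```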
'Fuzzy matching' is the name given to the general class of options that matches words that 'sound like' those entered by the user. The specific method of doing this varies between search tools and is referred to by different names ('soundex', 'phonetic matching', etc). Fuzzy matches are best presented as suggestions, rather than being automatically applied, to avoid topic drift. For example, a standard soundex algorithm rates lion, lean, lan, lawn and lane (among many others) as equivalent to 'loan'.
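The topic drift described above can be demonstrated with a sketch of the commonly documented American Soundex rules. Search products may implement different variants, so the codes below should be read as those of this particular sketch rather than of any specific search engine.

```python
# Letter-to-digit table for American Soundex; vowels, y, h and w have no code.
CODES = {}
for letters, digit in [("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")]:
    for letter in letters:
        CODES[letter] = digit


def soundex(word):
    """Return the four-character Soundex code for word ('' for empty input)."""
    word = "".join(ch for ch in word.lower() if ch.isalpha())
    if not word:
        return ""
    result = word[0].upper()
    previous = CODES.get(word[0], "")
    for ch in word[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters that share a code
        code = CODES.get(ch, "")
        if code and code != previous:
            result += code
        previous = code  # a vowel resets this, so a repeated code is kept
    return (result + "000")[:4]


for word in ["loan", "lion", "lean", "lan", "lawn", "lane"]:
    print(word, soundex(word))  # each line ends in L500
```

Under these rules 'logan' is coded L250 rather than L500, which shows how the exact variant in use affects which words are treated as equivalent.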
Consider 'federated' searching
'Federated' searching recognises that the information that the user is looking for may be drawn from a wide range of sources, including:
- web pages from other sites
- documents
- databases
- business systems
- discussion groups
- external feeds.
In this environment, 'federated' search queries each of these information sources in turn, using whatever form of searching is required to match the format and nature of the individual repository.
These results are then brought together and synthesised into a single meaningful list. Note that, as part of this, differences between the information sources are typically resolved in order to provide a consistently high-quality set of search results. Alternatively, the search results may clearly distinguish between information from the various sources.
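The query-and-merge pattern can be sketched as below. Both sources, their contents and their scoring scales are invented for the example; the sketch resolves one typical difference between sources (incompatible score ranges) by normalising each score before merging, and keeps a source label so that results can still be distinguished.

```python
def search_web_pages(query):
    """Hypothetical source returning (title, score) with scores from 0 to 1."""
    pages = {"Permit application guide": 0.9, "Permit fees": 0.6}
    return [(t, s) for t, s in pages.items() if query.lower() in t.lower()]


def search_database(query):
    """Hypothetical source returning (title, score) with scores from 0 to 100."""
    rows = {"Permit register": 80}
    return [(t, s) for t, s in rows.items() if query.lower() in t.lower()]


# (label, search function, maximum score the source can return)
SOURCES = [
    ("web", search_web_pages, 1.0),
    ("database", search_database, 100.0),
]


def federated_search(query):
    """Query every source, normalise scores to 0..1 and merge into one list."""
    merged = []
    for label, search, max_score in SOURCES:
        for title, score in search(query):
            merged.append((score / max_score, label, title))
    merged.sort(key=lambda item: -item[0])
    return [(label, title) for _, label, title in merged]


for label, title in federated_search("permit"):
    print(label, title)
```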
Consider 'faceted' searching
Different users are likely to search for items in many different ways. For example, a cooking recipe could be found according to ingredients, geographic region, style of cooking, cost, difficulty, equipment required, type of diet (vegetarian, etc) or other aspects of the recipe.
Each such aspect of an item is called a 'facet'. Once items are classified in this way, a 'faceted' search provides a mechanism for browsing through the classifications, with each step narrowing down the number of matching items.
This is a powerful method, which can be very relevant for large repositories of information or for sites, such as portals, that aggregate a diverse range of topics.
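Using the recipe example, faceted narrowing can be sketched in Python as follows. The records, facet names and values are hypothetical; the point is that each selection filters the collection, and the remaining items can be counted per facet to show the user what choices are left.

```python
from collections import Counter

# Hypothetical recipe records, each classified under several facets.
RECIPES = [
    {"name": "Lentil soup", "diet": "vegetarian", "difficulty": "easy"},
    {"name": "Beef stew", "diet": "any", "difficulty": "easy"},
    {"name": "Vegetable curry", "diet": "vegetarian", "difficulty": "medium"},
]


def narrow(items, **selected):
    """Keep items whose facets match every selected facet value."""
    return [
        item for item in items
        if all(item.get(facet) == value for facet, value in selected.items())
    ]


def facet_counts(items, facet):
    """Count how many items fall under each value of a facet."""
    return Counter(item[facet] for item in items)


vegetarian = narrow(RECIPES, diet="vegetarian")
print(facet_counts(vegetarian, "difficulty"))
print([r["name"] for r in narrow(vegetarian, difficulty="easy")])  # ['Lentil soup']
```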
More information on this can be found in the article at www.searchtools.com/info/faceted-metadata.html.
Consider the use of taxonomies
A taxonomy or thesaurus is used to classify items according to the subjects to which they belong. These are most often seen in library collections, records and document management systems, or even 'back of the book' indexes.
While taxonomies have traditionally been considered to be distinct from full-text search engines, there are now solutions that combine them to form an effective hybrid 'searching/browsing' interface. The best-known example of this is the Yahoo directory. Entering search terms brings up matching sites, as well as a listing of the categories to which the sites belong. This allows the search to be easily broadened to other sites in the same category.
Recent releases of enterprise search engines have started to provide similar capabilities for use on websites or intranets. Like the other advanced search methods outlined, these could be explored for larger and more complex sites. There may be considerable resource implications with this method.
Other Better Practice Checklists
- Providing Forms Online
- Website Navigation
- Testing Websites with Users
- Use of Cookies in Online Services
- Providing an Online Sales Facility
- Use of Metadata for Web Resources
- Archiving Web Resources
- Managing Online Content
- Selecting a Content Management System
- Implementing a Content Management System
- Website Usage Monitoring and Evaluation
- Online Policy Consultation
- Knowledge Management
- Designing and Managing an Intranet
- Information Architecture for Websites
- Implementing an Effective Website Search Facility
- Spatial Data on the Internet
- Digitisation of Records
- Access and Equity Issues for Websites
- Marketing E-government
- ICT Support for Telework
- Assistive Technology for Employees of the Australian Government
- Decommissioning Government Websites
- ICT Asset Management
- Managing the Environmental Impact of ICT
Contact for information on this page: AGIMO Better Practice Team

