The Motive Web Design Glossary
A search engine is a program that builds an index of website content, which a person can then search to find relevant webpages.
A search is typically carried out by entering a keyword or phrase (query) into a text field and then clicking a button, but may also be initiated by clicking a hyperlink.
Types of search
Although the text box and search button are fairly commonplace, the type of search (often described in terms of the scope of content the search engine has indexed) is not always evident.
- internal search
- An internal search can only be used to find content on a single website (or intranet or extranet). For example the Motive search, at the top-right of each page, can only be used to find pages on the Motive website.
- external or public search
- A public search can be used to find content on any website, anywhere on the web. For example Google (also see details below on search engine registration).
- meta search engine
- A meta search engine uses the indexes of other search engines to find content, anywhere on the web. For example Dogpile.
Search engine registration
Linking and indexing
Only webpages that are linked to from a URL submitted to a search engine are indexed.
To add a website to its search index, a search engine must first be told where to ‘find it’. Notifying a search engine of a new website is referred to as search engine registration.
The registration process involves submitting an entry-level webpage address (URL) to a search engine. This entry-level page is typically the address of the homepage or sitemap.
Add URL pages
Quick links to the website registration pages for the top search engines and directories.
In addition to a webpage address, a search engine may also require basic information about your site, such as a short description of the website, topics covered, and owner.
Most public search engines have an ‘Add URL’, ‘Submit URL’ or ‘Suggest a site’ link that links to information on how to register a website. This link is typically found in the list of links at the bottom of the search engine homepage.
Once the website has been registered, the search engine will access the website using an indexing program (spider). The indexing program follows all the links on the submitted webpage to other webpages under the same domain. It then follows the links it finds on those webpages, ‘crawling’ the entire website, to build an index of all the website content.
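The crawling process described above amounts to a breadth-first traversal that stays within the submitted domain. A minimal sketch in Python (the `fetch_links` callback is a placeholder for real HTTP fetching and HTML parsing; real spiders also honour robots.txt, rate limits and indexing-size limits):

```python
from collections import deque
from urllib.parse import urljoin, urlparse

def crawl(start_url, fetch_links):
    """Breadth-first crawl of a single domain.

    fetch_links(url) is assumed to return the list of href values
    found on the page at url.
    """
    domain = urlparse(start_url).netloc
    queue = deque([start_url])
    indexed = set()
    while queue:
        url = queue.popleft()
        if url in indexed:
            continue
        indexed.add(url)
        for link in fetch_links(url):
            absolute = urljoin(url, link)
            # Follow only links under the same domain.
            if urlparse(absolute).netloc == domain:
                queue.append(absolute)
    return indexed
```

Links pointing to other domains are discovered but not followed, which is why each website must be registered (or linked to) before its content appears in the index.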
Search engine results
A search engine results page (SERP) lists webpages in order of their relevance to the query entered. The webpage listed at the top of the results page has been selected by the search engine as the most likely to provide the content the user is seeking.
Each search result listing usually features the destination webpage meta title (as the link text), followed by a description and/or an excerpt showing the query highlighted in the context of the webpage content (concordance).
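Producing the excerpt described above means locating the query within the page text and returning its surrounding context. A minimal sketch (the function name and context window are illustrative, not any particular engine’s method):

```python
def concordance(text, query, context=30):
    """Return an excerpt showing the query in context, or None if absent."""
    pos = text.lower().find(query.lower())
    if pos == -1:
        return None
    start = max(0, pos - context)
    end = min(len(text), pos + len(query) + context)
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return prefix + text[start:end] + suffix
```

A real results page would additionally highlight the matched query (for example, by wrapping it in bold markup) and handle multiple matches.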
Search engine ranking algorithms
Each search engine has its own method for calculating relevance, usually based on an analysis of the content of the destination webpage, including:
- meta title (visible at the top of the web browser window);
- number of incoming links (commonly referred to as the page’s ‘popularity’). Popularity-based ranking assumes that the more incoming links a webpage has, the more likely it is to be a subject ‘authority’;
- incoming link text: a search engine may make assumptions about the content of a website based on how other people have described it through the text they have used to link to a site;
- use of appropriate semantic markup, for example, use of heading elements; and
- page text.
Each of these aspects of the webpage is scored and then weighted. For example, a search engine may assign a greater weighting to meta title text than other aspects of the webpage. In this case, a webpage that includes the query in its meta title text may then be ranked higher than a webpage where the meta title does not include the query.
The scores for each aspect of the webpage are combined to determine the overall relevance of the webpage.
The calculation (algorithm) each search engine uses to rank webpage relevance is often a closely-guarded (and patented) secret. This is both to prevent websites from artificially inflating their rankings, and because the quality of the search results translates directly into user loyalty, traffic and revenue-generating opportunities.
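The score-and-weight approach described above can be illustrated with a toy relevance calculation. The factors mirror the list above, but the weights are invented for illustration; as noted, real algorithms are far more complex and kept secret:

```python
# Hypothetical per-factor weights; real engines keep theirs secret.
WEIGHTS = {
    "meta_title": 3.0,      # weighted most heavily in this sketch
    "incoming_links": 2.0,
    "link_text": 1.5,
    "headings": 1.0,
    "page_text": 0.5,
}

def relevance(scores, weights=WEIGHTS):
    """Combine per-factor scores (each 0..1) into an overall relevance value."""
    return sum(weights[factor] * scores.get(factor, 0.0) for factor in weights)

def rank(pages):
    """Order pages (name -> factor scores) by descending relevance."""
    return sorted(pages, key=lambda name: relevance(pages[name]), reverse=True)
```

With these weights, a page that matches the query only in its meta title (score 3.0) outranks a page that matches only in its headings and body text (score 1.5), matching the example in the text.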
Related terms: crawler/robot/spider, directory, metadata, PageRank, reciprocal links, robots.txt, search engine optimisation.
References and further reading
- Bare Bones 101: A basic tutorial on searching the web (Ellen Chamberlain, University of South Carolina Beaufort)
Orientation to the essentials of searching the web; including the types of search engine and strategies for searching.
- Designing search checklist (Chiara Fox, Adaptive Path)
14 July 2008: A quick list of design elements and search features to consider when creating a search results page template.
- The Essentials of Google Search (Google)
Exploring how Google (and many other search engines) increase the effectiveness of searching: by enabling control over the types of match that are made for the word or phrase entered.
- Google Trends (Google)
Enter a keyword (or words) to see what the world is searching for.
- Mental models for search are getting firmer (Jakob Nielsen, useit.com)
Users now have precise expectations for the behavior of search. Designs that invoke this mental model but work differently are confusing.
- Search engine indexing limits: Where do the bots stop? (Serge Bondar, Sitepoint)
28 Apr 2006: The content-indexing programs for Yahoo, Google and MSN each collect, store and index only a portion of the text on each webpage (based on filesize). Content outside these filesize limits will not be used in SERPs.
- Producing great search results: Harder than it looks, Part 1 (Jared Spool, UIE)
9 July 2008: An overview of the design, decision-making and processes behind creating an effective search results page.
- Producing great search results: Harder than it looks, Part 2 (Jared Spool, UIE)
14 July 2008: Prevent Pogosticking. Most Relevant Links Should Be First. Eliminate the Wacko Results. Put More Results On Each Page. Handle ‘No Results’ Gracefully.
- for Google’s successor (Wired)
Considering the future of search.
- A spider’s view of Web 2.0 (Michael Wyszomierski and Greg Grothaus, Google)
6 Nov 2007: A spider is a program that builds an index of website content. This index is then used to generate search results. Certain techniques associated with Web 2.0 may have an impact on a spider’s ability to index your website.
- Targeting your site at users in a particular geographic location (Google)
Google figures out the country that a website relates to based on the top-level domain (for example, .com.au) and the IP address of the webserver hosting the website. If your domain name is generic, for example .org, then you can set a geographic target (country) once you have created a Google account.
- [PDF] Usability Testing of FirstGov Search Functionality (Campbell S, Wolfson C, UPA)
Evaluating search interface and search result formatting on the US government portal.
- Use old words when writing for findability (Jakob Nielsen, useit.com)
Copy that includes keywords likely to be known to the casual searcher can help to improve search engine ranking and the relevance of internal search results. [Perhaps there is unintended irony in Nielsen’s use of ‘findability’ in the title of his article?]
- Web Admin’s Guide to Site Search Tools
An introduction to choosing a search tool for your website.
- Webmaster Guidelines (Google)
Essential introduction to having your website indexed by Google.
- Yahoo Add [Website] FAQ (Yahoo!)