Search Engines: How They Work
Search engines operate in three parts: (1) a mechanism that identifies web pages
to be included in the database, (2) a mechanism that indexes the
sites, and (3) a searching mechanism with an interface, which
scans the index for keywords.
In a search, search terms are entered and the index is searched.
The index is a database that holds the information related to
the web documents. Documents in which the search terms occur are
presented as "hits."
Search tools retrieve "hits" or "matches" by seeking
occurrences of your search terms within their database and by attempting
to match the terms against their index.
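
The lookup described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three sample documents, their ids, and the function names build_index and search are hypothetical, and a real search tool holds far more information in its index.

```python
from collections import defaultdict

# Hypothetical mini-collection standing in for indexed web documents.
DOCUMENTS = {
    "page1": "search engines scan an index for keywords",
    "page2": "a spider follows links between web pages",
    "page3": "the index holds keywords from web pages",
}

def build_index(documents):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

def search(index, query):
    """Return the ids of documents containing every search term, sorted."""
    terms = query.lower().split()
    if not terms:
        return []
    hits = set(index.get(terms[0], set()))
    for term in terms[1:]:
        # Keep only documents that also contain this term.
        hits &= index.get(term, set())
    return sorted(hits)

index = build_index(DOCUMENTS)
print(search(index, "index keywords"))  # ['page1', 'page3']
```

Requiring every term to match is one possible design choice; a tool could just as well return documents that match any of the terms.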
A 'bot is an automated device (software) which may be programmed to search
for terms (data "strings") matching certain criteria.
'Bots are also known as intelligent agents, spiders, crawlers,
robots, or worms.
A 'bot identifies
and notes the URLs of web pages to be included in the database.
Then, another 'bot comes along and scans the interiors of web
documents and records occurrences of words and their position
within the text. This is the information used to create an index.
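
As a rough sketch of what recording "occurrences of words and their position" might look like, the Python below stores, for each word, the positions at which it appears in a document. The function name index_positions, the sample text, and the simple word-splitting rule are assumptions made for illustration, not a description of any particular engine.

```python
import re
from collections import defaultdict

def index_positions(doc_id, text, index):
    """Record, for each word, the positions at which it occurs in one document."""
    # Assumed tokenization: lowercase runs of letters, digits, and apostrophes.
    words = re.findall(r"[a-z0-9']+", text.lower())
    for position, word in enumerate(words):
        index[word][doc_id].append(position)

# word -> document id -> list of word positions (counted from 0)
index = defaultdict(lambda: defaultdict(list))
index_positions("page1", "The index is a database. The index holds words.", index)
print(dict(index["index"]))  # {'page1': [1, 6]}
```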
'Bots travel from one hypertext link to another.
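
Link-following can be illustrated without touching the network. The sketch below walks a hypothetical in-memory table of links (the example.com URLs and the crawl function are made up for this example) and notes each URL once, which is roughly what the first kind of 'bot is described as doing.

```python
from collections import deque

# Hypothetical link graph: each URL maps to the URLs it links to.
LINKS = {
    "http://example.com/": ["http://example.com/a", "http://example.com/b"],
    "http://example.com/a": ["http://example.com/b", "http://example.com/c"],
    "http://example.com/b": ["http://example.com/"],
    "http://example.com/c": [],
}

def crawl(start_url, links):
    """Visit pages breadth-first, following each link once, and return the URLs noted."""
    seen = [start_url]
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        for target in links.get(url, []):
            if target not in seen:  # skip pages already noted
                seen.append(target)
                queue.append(target)
    return seen

for url in crawl("http://example.com/", LINKS):
    print(url)
```

A real 'bot would fetch each page and extract its links rather than read them from a table; the breadth-first order here is only one possible traversal.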
Each search engine has its own method for calculating relevance. Relevance
is a rank assigned to the hits that your search term(s) have generated.
Some search engines assign a number to each hit. That number next
to the URL indicates its "relevance ranking". Relevance
is simply the probability that the "hit" or "match"
is on-target with your query.
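
Since actual relevance formulas are not published, the following Python is only a stand-in to show the idea of assigning a number to each hit and sorting by it. The scoring rule (the share of a document's words that match a search term), the sample documents, and the names relevance and rank are all assumptions for illustration.

```python
def relevance(query, text):
    """Fraction of the document's words that match a query term (0.0 to 1.0)."""
    terms = set(query.lower().split())
    words = text.lower().split()
    if not words:
        return 0.0
    matches = sum(1 for word in words if word in terms)
    return matches / len(words)

def rank(query, documents):
    """Return (doc_id, score) pairs for documents with a nonzero score, best first."""
    scored = [(doc_id, relevance(query, text)) for doc_id, text in documents.items()]
    hits = [(doc_id, score) for doc_id, score in scored if score > 0]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)

# Hypothetical documents used only to exercise the scoring rule.
DOCUMENTS = {
    "page1": "web search engines index the web",
    "page2": "cooking pasta at home",
    "page3": "search the library catalog by title",
}

for doc_id, score in rank("web search", DOCUMENTS):
    print(f"{score:.2f}  {doc_id}")
```

With the query "web search", page1 scores 0.50 and is listed first, page3 scores about 0.17, and page2 is not a hit at all.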
Search engine masters do not divulge their secrets for calculating relevance.
Appearing high in the major search engines' rankings on a topic
means big business.
Some search engines look only in certain fields to index documents, such as
the title field, the first paragraph, and something called "meta-tags."
Meta-tags allow the creator of a web site to add descriptive keywords
which are not displayed in the actual web documents; they exist
specifically to enhance retrieval of the document.
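
To make the idea of indexing only certain fields concrete, here is a small Python sketch that pulls the title and the keywords meta-tag out of a page using the standard library's HTMLParser. The class name FieldExtractor and the sample page are hypothetical, and how an engine actually weighs these fields is not something this sketch claims to show.

```python
from html.parser import HTMLParser

class FieldExtractor(HTMLParser):
    """Collect the title text and the keywords meta-tag from an HTML page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.keywords = []
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and (attributes.get("name") or "").lower() == "keywords":
            # Keywords are conventionally a comma-separated list.
            content = attributes.get("content") or ""
            self.keywords = [k.strip() for k in content.split(",") if k.strip()]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data

# Hypothetical page: the keywords never appear in the visible body text.
PAGE = """<html><head>
<title>Search Engines: How They Work</title>
<meta name="keywords" content="search engines, spiders, relevance">
</head><body><p>Visible text.</p></body></html>"""

extractor = FieldExtractor()
extractor.feed(PAGE)
print(extractor.title)     # Search Engines: How They Work
print(extractor.keywords)  # ['search engines', 'spiders', 'relevance']
```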