Introduction
Once a site starts growing, a search facility becomes a necessity. There are a number of
third-party search engines that can be installed on your site, but for many purposes the Index
Server that comes with IIS is more than adequate. What is the Index Server? The online docs
put it succinctly as follows:
Indexing Service is a Microsoft® Windows® 2000 service that indexes files on
your disks and their properties as well as your Internet Information Services
(IIS) Web files and properties. Indexing Service stores the resulting
information in catalogs that you can efficiently search using a variety of
queries.
The Index Server allows you to run queries against the service using ADO and OLE DB. This
provides ease of use and flexibility in providing a search facility.
The Index Server Object
The Index Server query object is created like any other COM object on your server:
dim ixQuery ' Index Server query object.
set ixQuery = Server.CreateObject("ixsso.Query")
The object has a number of properties that can be set before running the query. The
most useful are the Columns, SortBy, MaxRecords and Query properties.
Columns
The Columns property allows you to specify which fields are returned by
the query. For a full list you should consult the online docs, but for the current example
we will return the following:
- doctitle: The page title (as specified in the <TITLE>...</TITLE> element)
- vpath: The virtual path to the page
- size: The size of the page
- characterization: A description of the page
- rank: A value specifying how well the page matches the search criteria
ixQuery.Columns = "doctitle, vpath, size, characterization, rank"
SortBy
SortBy specifies how the matches will be sorted. List the fields in order of sort
priority, and append "[d]" to a field name to sort that field in descending order.
ixQuery.SortBy = "rank[d], doctitle"
MaxRecords
You should limit the number of matches the query returns. Chances are the user will only
browse the first couple of dozen in any case.
ixQuery.MaxRecords = 300
Catalog
A Catalog represents the indexing results for a particular directory (or directories). If
you don't specify a catalog then the Index Server will use the default 'web' catalog that
indexes /inetpub/wwwroot. Sometimes you may want to specify a catalog (e.g. if your
site is in a different directory or you want to have multiple catalogs for different search
pages).
Adding a catalog
To set up a catalog, go to the Indexing Service branch under Services and Applications
in the Computer Management console (under Start -> Programs -> Administrative Tools). Right-click
on Indexing Service and select New -> Catalog. Enter the name of your new catalog and
a location where the index files should be stored. Hit OK, then right-click on the newly
created catalog and select New -> Directory. Add a directory that you wish to have indexed,
and repeat as necessary. Subdirectories will automatically be indexed too. You can also
specify directories within the directory tree that should not be indexed. To do this, add
the directory that you wish to be ignored, and click No in the Index this resource box.
Typically you would add a directory tree to be indexed, and then specify
certain subdirectories under that directory's hierarchy that you don't want indexed. This
gives you some coarse-grained control over what gets indexed.
Ensuring that the Catalog generates abstracts for your searches
If you want your catalog to contain abstracts of the files indexed then you need to
right-click on the catalog and select Properties. Click on the Generation tab
and ensure that the Generate Abstracts checkbox is ticked. If it's disabled, then
uncheck the Inherit above settings from Service box. You can then set the size
of the abstract to be generated.
Ensuring that the search generates correct vpaths for your search
To ensure that the index search generates correct virtual paths (vpaths), you
should associate the catalog with the web server. In the Computer Management console, under
'Indexing Service', right-click on your catalog and select Properties. Click on the Tracking tab
and choose your server from the 'WWW Server' dropdown.
Specifying a Catalog to use in your search
Specify the catalog to use in your search by adding the following:
ixQuery.Catalog = "CodeProject"
You should also check that your site is actually being indexed. Fire up the Internet Service
Manager, open up the properties dialog for your site, select the 'Home Directory'
tab and ensure that the "Index this resource" check box is ticked.
You also need to ensure that the folder properties in Explorer are set to
allow the folder to be indexed. Navigate to the folder containing the folder
with your site's files, right-click on your site's folder and choose Properties,
click 'Advanced' and check the 'For fast searching, allow Indexing Service to
index this folder' box.
Thanks to Kurt and Izidor Gams for updates on this.
Query
The actual query. This is the guts of the entire operation. The Index Server supports
three query languages: Dialect 1 (Index Server 1.0), Dialect 2 (Index Server 3.0) and SQL
(Index Server 2.0 and above). See the topic "Query Languages for Indexing Service" in MSDN
for a full explanation of these different languages.
In our case we'll work with the simple Dialect 1, though it would be just as easy
to use the familiar SQL syntax if you wished.
At the simplest, you can set the Query property of your Index Server object
to the search target. For example, if you were looking for all pages with the word
"Apples", then use
ixQuery.Query = "Apples"
We can refine this somewhat by specifying which files will and will not be searched,
the way in which your target query is interpreted (as a phrase, as a free text search,
as an exact match etc.) and also the types of pages that will be searched (e.g. only pages
written after a certain date, or less than a certain size).
For example, to specify a free text search for the phrase "Apples are green", use
$contents Apples are green
To specify field restrictions, use the "@" prefix on a predefined field name, and
an expression. For instance:
@size < 1000000 ' size must be less than 1,000,000 bytes
@contents apple tree ' Contents must contain the phrase "apple tree"
@write > 70/10/24 ' Page must have been written after October 24, 1970
Filename restrictions can be specified by using the "#" prefix, which requests a
regular expression search, together with a wildcard pattern:
#filename *.asp ' search only ASP files
#vpath *\articles* ' search in the \articles subdirectory
All these expressions can be combined using the boolean operators AND, NOT, OR etc. Thus
if your search target expression is "Apples", you only want to search in ASP files, and
you want to ignore the \_vti directory, use the following:
ixQuery.Query = "(#filename *.asp) AND (NOT #vpath *\_vti*) AND (Apples)"
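When the restrictions vary from page to page, it can help to assemble the query string in one place. The following is a minimal sketch only: BuildQuery and its three parameters are hypothetical names introduced here for illustration, and it simply concatenates the Dialect 1 pieces shown above.

```vbscript
' Hypothetical helper (not part of Index Server): combines a search target
' with an optional filename pattern and an optional vpath pattern to exclude.
Function BuildQuery(target, filePattern, excludedPath)
    Dim q
    q = "(" & target & ")"
    If filePattern <> "" Then
        q = "(#filename " & filePattern & ") AND " & q
    End If
    If excludedPath <> "" Then
        q = q & " AND (NOT #vpath " & excludedPath & ")"
    End If
    BuildQuery = q
End Function

Dim searchQuery
searchQuery = BuildQuery("Apples", "*.asp", "*\_vti*")
' searchQuery is now: (#filename *.asp) AND (Apples) AND (NOT #vpath *\_vti*)
```

The resulting string can then be assigned to ixQuery.Query. Passing an empty string for either pattern leaves that restriction out.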
The Index Server Utility object
A related object to the Index Server object is the Index Server Utility object. This allows
you to specify the depth of the search: either "shallow" (for the named directory
only) or "deep" (for a recursive search through all sub-directories).
dim util
set util = Server.CreateObject("ixsso.Util")
util.AddScopeToQuery ixQuery, Server.MapPath("/"), "deep"
The first parameter specifies the Index server object to associate the utility with; the second specifies the physical path to start the search (in our case the root folder); and the third specifies the type of search.
Performing the search
To run the actual query, simply call the query object's CreateRecordSet method:
dim queryRS
set queryRS = ixQuery.CreateRecordSet("nonsequential")
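Putting the pieces so far together, a search page might look roughly like the sketch below. It only uses the properties and methods described above; the query-string parameter name "q" is an assumption made for illustration, and "CodeProject" is the example catalog name used earlier.

```vbscript
' Sketch only: "q" is a hypothetical query-string parameter holding the
' user's search text; "CodeProject" is the example catalog named earlier.
Dim searchText, ixQuery, util, queryRS
searchText = Trim(Request.QueryString("q") & "")

If searchText <> "" Then
    Set ixQuery = Server.CreateObject("ixsso.Query")
    ixQuery.Catalog    = "CodeProject"
    ixQuery.Columns    = "doctitle, vpath, size, characterization, rank"
    ixQuery.SortBy     = "rank[d], doctitle"
    ixQuery.MaxRecords = 300
    ixQuery.Query      = "(#filename *.asp) AND (" & searchText & ")"

    ' Search the site root and everything beneath it.
    Set util = Server.CreateObject("ixsso.Util")
    util.AddScopeToQuery ixQuery, Server.MapPath("/"), "deep"

    Set queryRS = ixQuery.CreateRecordSet("nonsequential")
    ' ... display the results here ...
End If
```

Note that the user's text is passed straight into the query here. A malformed expression may cause the query to fail with an error, so a real page would probably want to validate the input or handle that error.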
Displaying the results
To display the results, simply loop through the recordset:
dim recordNumber, docTitle
recordNumber = 1 ' number the results starting from 1
Response.Write "<table width='100%'>"
do while not queryRS.EOF
    ' Appending "" turns a missing (Null) title into an empty string
    docTitle = queryRS("doctitle") & ""
    if docTitle = "" then docTitle = "Untitled"
    Response.Write "<tr>"
    Response.Write "<td valign=top>"
    Response.Write recordNumber & ".</td>"
    Response.Write "<td valign=top>"
    Response.Write "<a href='" & queryRS("vpath")
    Response.Write "'>" & Server.HTMLEncode(docTitle) & "</a><br>"
    Response.Write "<b>URL: </b> http://"
    Response.Write Request.ServerVariables("server_name")
    Response.Write queryRS("vpath") & "<br>"
    Response.Write Server.HTMLEncode(queryRS("characterization"))
    Response.Write "</td>"
    Response.Write "</tr>"
    recordNumber = recordNumber + 1
    queryRS.MoveNext
loop
Response.Write "</table>"
The demonstration script
The sample script ties all this together and also demonstrates how to provide the user with a facility to view the results page by page. Feel free to use and customise this script on your own sites.
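As a rough idea of how page-by-page viewing can work, the sketch below uses the standard ADO recordset paging properties (PageSize, PageCount and AbsolutePage). This is one possible approach and not necessarily how the sample script does it; the "page" query-string parameter and the page size of 10 are assumptions made for illustration.

```vbscript
' Assumes queryRS was created with CreateRecordSet("nonsequential") as above.
Const PAGE_SIZE = 10    ' hypothetical number of results per page
Dim pageText, pageNumber, shown

' "page" is a hypothetical query-string parameter name.
pageNumber = 1
pageText = Trim(Request.QueryString("page") & "")
If pageText <> "" And IsNumeric(pageText) Then pageNumber = CLng(pageText)

If Not queryRS.EOF Then
    queryRS.PageSize = PAGE_SIZE

    ' Keep the requested page within the available range.
    If pageNumber < 1 Then pageNumber = 1
    If pageNumber > queryRS.PageCount Then pageNumber = queryRS.PageCount
    queryRS.AbsolutePage = pageNumber

    ' Show at most one page of results.
    shown = 0
    Do While Not queryRS.EOF And shown < PAGE_SIZE
        Response.Write Server.HTMLEncode(queryRS("vpath")) & "<br>"
        shown = shown + 1
        queryRS.MoveNext
    Loop

    Response.Write "Page " & pageNumber & " of " & queryRS.PageCount
End If
```

Links to the previous and next pages can then be generated by re-requesting the same page with the search text and an adjusted page number.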
History
16 Jun 2000 - posted
23 Apr 2001 - update to fix paging problem (thanks to Khaled)
29 Jul 2001 - update to include information on generating abstracts
31 Oct 2001 - update to include information on generating vpaths and ensuring indexing is working