Search engine for your website
When starting a website, you can focus on the authoring. The number of pages is small, they are easy to categorize, and links to all the articles fit on a single page. But as time passes, your website grows and visitors need a way to find older articles. In other words, you need to implement search functionality for your website. But how?
Search functionality can be implemented in several different ways, depending on how the data is stored. If all text is stored in a database, the search code can query the database directly to match query strings, without any need for a new database structure. This, however, has several drawbacks: it is difficult to implement a ranking system, it is inefficient, and it may put unnecessary load on the database server.
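To make the direct-query approach and its drawbacks concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `articles` table, its columns, and the sample rows are hypothetical and exist only for illustration; a real website would have its own schema and database server.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with a hypothetical 'articles' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, body TEXT)"
    )
    conn.executemany(
        "INSERT INTO articles (id, title, body) VALUES (?, ?, ?)",
        [
            (1, "Installing IIS", "How to install the web server on Windows Server 2003."),
            (2, "Indexing Service basics", "Create a catalog and query it from a search form."),
            (3, "Backup tips", "Remember to back up the index catalog as well."),
        ],
    )
    return conn


def search_articles(conn, query):
    """Return (id, title) rows whose title or body contains the query string."""
    # Escape the LIKE wildcards so that user input is matched literally.
    escaped = query.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    pattern = f"%{escaped}%"
    cur = conn.execute(
        "SELECT id, title FROM articles "
        "WHERE title LIKE ? ESCAPE '\\' OR body LIKE ? ESCAPE '\\' "
        "ORDER BY id",
        (pattern, pattern),
    )
    return cur.fetchall()
```

Calling `search_articles(make_demo_db(), "catalog")` returns the two articles that mention the word. Note that the results come back in id order, not by relevance, and that a pattern starting with a wildcard generally forces the database to scan every row. Those are the ranking and efficiency problems described above.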
A better way is to develop (or purchase) an indexing application. It works more or less like a search engine such as google.com: it crawls your website (or reads the file system) and indexes the result. One such application is Microsoft Indexing Service, which can be installed from Add/Remove Windows Components (it is part of Windows Server 2003 and therefore "free"). The Indexing Service is able to "catalog" (as Microsoft calls it) the website, but you need to create the search form yourself (or use a pre-built form).
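The general idea behind an indexing application can be sketched with a small inverted index: each word maps to the pages that contain it, so a search only looks up the query words instead of scanning all text. This is an illustration of the technique, not a description of how Microsoft Indexing Service works internally. The crawling step is replaced by a plain dictionary of page texts, and the function names and the simple word-count ranking are assumptions made for this example.

```python
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Split text into lowercase words made of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Build an inverted index: word -> {url: number of occurrences}.

    'pages' maps a URL to the page text; it stands in for the output
    of a crawler or a walk over the file system.
    """
    index = defaultdict(dict)
    for url, text in pages.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][url] = count
    return index


def search(index, query):
    """Return URLs that contain every query word, best match first."""
    terms = tokenize(query)
    if not terms:
        return []
    postings = [index.get(term, {}) for term in terms]
    # Keep only the pages that contain all of the query words.
    candidates = set(postings[0])
    for posting in postings[1:]:
        candidates &= set(posting)
    # Naive ranking: total number of occurrences of the query words.
    scored = [(sum(p[url] for p in postings), url) for url in candidates]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [url for _, url in scored]
```

With an index built once (and refreshed when pages change), each search is a handful of dictionary lookups, and the stored counts give a starting point for ranking. Real search engines use far more elaborate scoring than this word count.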
There are several other free, third-party search engines, for example Lucene.Net (a port of Apache Lucene to .NET) together with Seekafile Server (for indexing). The Search Engine Project (TSEP) is another open-source search engine (and indexer); it is, however, PHP based.
There is, however, no plethora of good open-source (or free) search engines. Remember that search engines are very sophisticated, and a lot of effort is needed to build a good one. Look at google.com, Windows Live or Yahoo; if it were easy to build a good search engine, there would surely be many more alternatives. So do not expect too much of a free (or commercial) search engine. It may, however, be good enough for your requirements.
Sometimes the easiest (and thus also the cheapest) solution is to use a service from a company that crawls your website from the outside, without requiring you to install anything on your server. Google offers this service, as do other companies.
Published 2007-01-31 00:00 GMT+0100 by Kristofer Gafvert