Counting the cost of a search engine

Businesses are spending heavily on search engines, both for their workers on corporate networks and for clients on high-end Web sites.

Just how much is astonishing, as Forrester Research outlines in a new report advising on how to choose from a variety of options. Companies are spending anywhere from $50,000 to $385,000 to implement a corporate search engine.

Forrester estimates that companies also spend additional amounts on hardware and maintenance.

"Hardware costs vary greatly depending on the efficiency of the search algorithm," Forrester says. "When one large engineering firm went to a new search engine, it bought high-end Dell servers at $7,000 each, eliminating the leased, high-end servers required by the old engine - each of which cost $9,000 per month.

"Maintenance costs usually run 15% to 20% of the original licence.

"Add to this salaries of knowledge engineers to groom ontologies or managers to update synonym lists, plus IT staff to test retrieval of new content."

With those kinds of costs, businesses face a major hurdle in determining which search engine suits their workers' and customers' needs. Search engine vendors confuse the issue by using what Forrester labels "technobabble", offering a "dizzying array of capabilities based on arcane technology".
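To put the quoted figures in perspective, here is a rough back-of-the-envelope sketch in Python. It uses the numbers Forrester cites ($50,000 to $385,000, servers at $7,000 to buy or $9,000 a month to lease, maintenance at 15% to 20%). Two things are assumptions made purely for illustration: the implementation figure is treated as the licence fee, and the server count of two is invented. The function names are hypothetical as well.

```python
def first_year_cost(licence, servers, server_price, maintenance_pct):
    """Licence plus purchased hardware plus one year of maintenance.

    maintenance_pct is a whole-number percentage of the licence fee.
    """
    hardware = servers * server_price
    maintenance = licence * maintenance_pct / 100
    return licence + hardware + maintenance


def annual_lease_cost(servers, monthly_lease):
    """Cost of leasing the same number of servers for twelve months."""
    return servers * monthly_lease * 12


# Assumption: two servers. The article does not say how many were bought.
SERVERS = 2

low = first_year_cost(50_000, SERVERS, 7_000, 15)
high = first_year_cost(385_000, SERVERS, 7_000, 20)
lease = annual_lease_cost(SERVERS, 9_000)

print(f"Low-end first year:  ${low:,.0f}")
print(f"High-end first year: ${high:,.0f}")
print(f"Old leased servers, per year: ${lease:,.0f}")
```

Under these assumptions the lease bill alone can exceed the cost of buying the replacement hardware many times over, which is the point of the engineering firm example.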

Complicating matters further, there are nearly 100 types of search engines available from competitors in three tiers. Companies that sell databases - like IBM, Oracle, and Microsoft - sell search by default because each repository has to provide a way for users to find information stored in it.

The engines range from Oracle's term retrieval to Microsoft's "highly evolved" probability matching. Others like AltaVista, Autonomy, Inktomi, Verity and Google, the hot new upstart, have encroached on those markets by sheer popularity with thousands of ordinary Internet users.

"There's a simple path through this complex landscape," Forrester says. "To cut through the clutter, buyers should focus on user needs and content attributes. User needs affect the type of queries the engine must support - just keywords or something more sophisticated like guided navigation.

"Content attributes - like data type, metadata and breadth of subject matter - boost search performance when matched with the right kind of engine."

Because search engines use such varied methods, figuring out what's right looks like a lot of work. For example, most search engines combine the basic Boolean search capability with one or several of seven other types of search methods.

Boolean searches retrieve documents according to which keywords appear in the text, with many engines then ranking the matches by the number of times those keywords occur. Simple operators like "AND", "OR", and "NOT" make queries more specific.
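For readers who want to see the idea in miniature, the following Python sketch shows how "AND", "OR", and "NOT" can narrow or widen a result set. It is a toy illustration, not any vendor's implementation: the sample documents and function names are invented, words are split on whitespace only, and ranking by keyword counts is left out.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lowercase word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search_and(index, *terms):
    """Documents containing every term."""
    sets = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*sets) if sets else set()


def search_or(index, *terms):
    """Documents containing at least one term."""
    return set().union(*(index.get(t.lower(), set()) for t in terms))


def search_not(index, all_ids, term):
    """Documents that do not contain the term."""
    return set(all_ids) - index.get(term.lower(), set())


# Invented sample documents, keyed by id.
docs = {
    1: "red wine from france",
    2: "white wine from chile",
    3: "medical conditions and wine",
}
index = build_index(docs)

print(search_and(index, "wine", "france"))
print(search_or(index, "france", "chile"))
print(search_and(index, "wine") & search_not(index, docs, "from"))
```

Each operator is just a set operation on the lists of matching documents, which is why Boolean search is the baseline most engines start from.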

Meanwhile, a clustering search method groups documents by similarity, usually determined by a statistical analysis of the contents and structure of the document text. Search engines like Autonomy, GammaSite, and Vivisimo use the cluster method.
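A very small Python sketch gives a feel for grouping by similarity. It is an assumption-laden toy, not the method any of the named vendors use: similarity here is simply the share of distinct words two documents have in common (Jaccard similarity), the 0.3 threshold is arbitrary, and the sample texts are invented.

```python
def jaccard(a, b):
    """Share of distinct words two texts have in common (0.0 to 1.0)."""
    words_a = set(a.lower().split())
    words_b = set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def cluster(docs, threshold=0.3):
    """Greedy grouping: add each document to the first cluster whose
    first member is similar enough, otherwise start a new cluster."""
    clusters = []
    for doc in docs:
        for group in clusters:
            if jaccard(doc, group[0]) >= threshold:
                group.append(doc)
                break
        else:
            # No existing cluster was close enough.
            clusters.append([doc])
    return clusters


# Invented sample documents.
docs = [
    "red wine tasting notes",
    "white wine tasting notes",
    "server lease costs",
    "server hardware costs",
]

for group in cluster(docs):
    print(group)
```

Commercial engines rely on far richer statistics, but the principle is the same: documents that share enough features end up in the same group without anyone labelling them by hand.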

Search engines like Albert, Inxight Software, and InQuira use natural language processing to try to produce better results. This method uses grammatical rules to find and understand words in a particular category, or classifies words by parts of speech in an attempt to understand their meaning.

Other search methods you are likely to hear about from vendors are described as ontological, probabilistic, taxonomic, and vector-based.

Wading through this thicket of capabilities is daunting. Forrester advises companies to consider whether their most important content is stored in a relational database, in which information is linked according to the manner in which it is input. Some search engines are specifically designed to mine such databases.

Another consideration is the skill of the users who will be looking for information in the corporate database.

"The vast majority of consumers and employees have poor search skills and limited subject matter knowledge," Forrester warns.

"As a result, search engines must compensate for typical usage patterns, such as queries of two words or less that include unpredictable terms."

Companies also need to determine the breadth of the content available on their networks. Some search solutions use a formal description of key terms and relationships as a cost-effective method when content is limited to a single subject, like wine or medical conditions.

"Where the subject matter is effectively unlimited - like all the repositories of an entire multinational company - the cost of building and maintaining this formal description is too high to be practical for most initiatives," Forrester said.

As a further piece of advice, Forrester noted a common trick search engine sellers use to get a company executive to buy their product, which it labels the vanity method of marketing.

"Based on its experience with clients, Inktomi has uncovered the secret to getting an executive's approval of a new search engine: deliberately tune the results for a search on his or her name," Forrester said.

Tech Tattle deals with issues in technology. Contact Ahmed at editoroffshoreon.com.