
Technology Trends in Search

IWR Labs:

We spotlight several key technology trends in the field of online search

By Davey Winder, Information World Review 02 May 2005

This article is a background briefing to the "Desktop Detectives" lab-test feature in the May 2005 issue, in which we look at the relative strengths and weaknesses of six leading desktop search offerings: Blinkx, Copernic, Google, Grokker, MSN and Yahoo!

Intelligent Clustering
New technologies must integrate with keyword search-driven interfaces to provide additional efficiency, rather than try to bulldoze a new search paradigm onto an unwilling market. Vivisimo realises this, and its Clustering Engine - which is used in Clusty - is a fine example. Working from natural language rather than embedded metadata, this technology applies a clustering algorithm to group results together through textual and linguistic similarities. It augments this with heuristic filtering based upon historical content choice, which is the rather complicated key to its success. And succeed it does, enabling the efficient discovery of data that would otherwise often lie buried and undiscovered 100 or 500 layers deep in a typical linear results listing.
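To make the idea of grouping results by textual similarity concrete, here is a minimal Python sketch. It is not Vivisimo's algorithm, which is proprietary and also applies the heuristic filtering described above; it simply assumes a greedy, single-pass clustering over word overlap (Jaccard similarity). The stopword list, the 0.2 threshold, the function names and the sample results are all illustrative assumptions.

```python
import re

# Illustrative stopword list; a real engine would use a far larger one.
STOPWORDS = {"a", "an", "and", "the", "of", "for", "in", "to", "on", "with"}

def tokens(text):
    """Return the set of lowercase words in text, minus stopwords."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}

def jaccard(a, b):
    """Overlap between two word sets, from 0.0 (disjoint) to 1.0 (identical)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def cluster_results(snippets, threshold=0.2):
    """Greedy single-pass clustering.

    Each snippet joins the first cluster whose seed snippet it resembles
    closely enough; otherwise it becomes the seed of a new cluster.
    """
    clusters = []  # list of (seed_tokens, member_snippets)
    for snippet in snippets:
        toks = tokens(snippet)
        for seed, members in clusters:
            if jaccard(seed, toks) >= threshold:
                members.append(snippet)
                break
        else:
            clusters.append((toks, [snippet]))
    return [members for _, members in clusters]

# Hypothetical results for the ambiguous query "jaguar".
results = [
    "Jaguar car dealers and prices",
    "Jaguar big cat habitat in South America",
    "Used Jaguar car prices compared",
    "Big cat conservation: the jaguar habitat",
]
groups = cluster_results(results)
for group in groups:
    print(group)
```

With these sample results the car pages and the big cat pages land in separate groups, so a relevant result sitting far down a linear listing would surface alongside its near neighbours.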

The Deep Web
What exactly is the "deep web", you may be wondering? Essentially, it is any data that is hidden from search spiders, such as content that is created dynamically or documents in password-protected directories. Turbo10 is a UK-based search engine that collects and collates deep web content, and can boast in excess of 1,750 data sources ranging from educational facilities to government departments. By enabling the user to create "research groups" comprising different deep web databases, it makes it relatively easy to build a library of highly personal, subject-specific search sources. Google is also treading into deep web territory with the Google Scholar initiative, which provides access to peer-reviewed academic content and technical reports.

Syndication and Personalisation
The recent announcement by Amazon's A9.com of a syndicated search facility is thrilling stuff. It brings content suppliers such as the British Library, PubMed and the New York Times together as sources for a personalised federated search environment. OpenSearch marks a move into RSS-based search, and many pundits are already suggesting that this will be perhaps the most important of all new search trends over the next 18 months.
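As a rough illustration of what RSS-based federated search involves, the Python sketch below merges result feeds from two sources into one list. It is a sketch under stated assumptions, not A9's implementation: it reads only the plain RSS 2.0 `item`, `title` and `link` elements, ignores any OpenSearch-specific extensions, and interleaves results by each source's own ranking. The feed contents, source names and function names are hypothetical.

```python
import xml.etree.ElementTree as ET

def parse_rss_results(source_name, rss_text):
    """Extract (source, rank, title, link) records from one RSS 2.0 result feed."""
    root = ET.fromstring(rss_text)
    results = []
    for rank, item in enumerate(root.iter("item"), start=1):
        results.append({
            "source": source_name,
            "rank": rank,  # position within this source's own feed
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
        })
    return results

def federate(feeds):
    """Merge several feeds, interleaving by per-source rank.

    The sort is stable, so sources with equal rank keep the order in
    which they were supplied.
    """
    merged = []
    for name, text in feeds.items():
        merged.extend(parse_rss_results(name, text))
    merged.sort(key=lambda r: r["rank"])
    return merged

# Hypothetical feeds standing in for two content suppliers.
FEEDS = {
    "library": (
        '<rss version="2.0"><channel><title>Library results</title>'
        "<item><title>Catalogue record A</title><link>http://example.org/a</link></item>"
        "<item><title>Catalogue record B</title><link>http://example.org/b</link></item>"
        "</channel></rss>"
    ),
    "newspaper": (
        '<rss version="2.0"><channel><title>News results</title>'
        "<item><title>News story C</title><link>http://example.org/c</link></item>"
        "</channel></rss>"
    ),
}

merged = federate(FEEDS)
for r in merged:
    print(r["source"], r["rank"], r["title"])
```

Because every supplier returns the same simple feed format, adding a new source is a matter of adding one more entry to the dictionary, which is much of the appeal of a syndicated approach.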

Microsoft Research Labs
AskMSR returns plain English answers to plain English questions. Many have tried this before, and nobody has come close to perfecting the holy grail of search technologies. That is, until Microsoft Research Labs tackled it. Using a base of linguistic rules that have been learnt from a vast database of example sentences, AskMSR can rewrite a search string instantly so that it resembles any number of potential "answers". These newly created phrase fragments are then used as the new and "real" search criteria. Usually they will point to myriad candidate answers, within which AskMSR then searches for frequently appearing keywords. If the keyword frequency is high enough, it is reasonable to conclude that the answer is valid. The clever thing is the way that this technology uses redundancy to increase its accuracy, so the bigger the web becomes, the more accurate AskMSR is. Unfortunately, Microsoft is giving no clues as to when or where we'll see this technology in action.
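The two stages described above, rewriting the question and then voting on frequently appearing keywords, can be sketched in a few lines of Python. This is a toy illustration, not Microsoft's code: the single hand-written rewrite rule stands in for AskMSR's learnt linguistic rules, the snippets are hard-coded rather than fetched from a search engine, and the function names, stopword list and `min_votes` threshold are assumptions.

```python
import re
from collections import Counter

# Illustrative stopword list used when counting candidate keywords.
STOPWORDS = {"a", "an", "and", "the", "by", "of", "is", "was", "in"}

def rewrite_question(question):
    """Turn 'Who/What/Where is X?' into fragments an answer might contain."""
    q = question.strip().rstrip("?")
    m = re.match(r"(?i)^(who|what|where)\s+(is|was)\s+(.+)$", q)
    if not m:
        return [q]  # no rule applies, so search on the question as given
    _, verb, rest = m.groups()
    return [f"{rest} {verb}", f"{verb} {rest}"]

def vote_for_answer(question, snippets, min_votes=2):
    """Pick the keyword that appears in the most snippets.

    Words already in the question are ignored. The winner is accepted
    only if at least min_votes snippets contain it, which is where
    redundancy helps: more matching pages mean more votes.
    """
    question_words = set(re.findall(r"[a-z]+", question.lower()))
    counts = Counter()
    for snippet in snippets:
        words = set(re.findall(r"[a-z]+", snippet.lower()))
        counts.update(words - question_words - STOPWORDS)  # one vote per snippet
    if not counts:
        return None
    word, votes = counts.most_common(1)[0]
    return word if votes >= min_votes else None

question = "Who is the author of Hamlet?"
fragments = rewrite_question(question)

# Hypothetical snippets that a search on the fragments might return.
snippets = [
    "The author of Hamlet is William Shakespeare.",
    "Shakespeare is the author of Hamlet and Macbeth.",
    "Hamlet, written by Shakespeare around 1600.",
]
answer = vote_for_answer(question, snippets)
print(fragments)
print(answer)
```

With a single snippet no keyword reaches the two-vote threshold and the function returns nothing, which mirrors the point about redundancy: the more pages that repeat an answer, the more confidently it can be accepted.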

