SearchInform Technologies, a vendor of information search, storage and processing software, has announced SearchInform version 3.0. The new functionality should lead to faster search times, with 50,000 to 100,000 queries processed against half a terabyte of data in under an hour. The application is designed for searching large volumes of data, for example in archival work, and will compete with dtSearch, iSYS and Google Desktop.
The document search functionality is based on the SoftInform search technology, in which the programme's algorithms analyse a document's structure and then select similar words or combinations of words, including text arrays. Based on these results, similar documents are then cross-referenced. The tool recognises that only around 20% of search queries are unique, so only that percentage needs to be fully processed, which increases the speed of returned results. Even as the number of documents in an index grows, the caching system spares them from being collated numerous times.
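The effect of caching repeated queries can be illustrated with a minimal sketch. This is not SearchInform's implementation, which the announcement does not describe; the class name QueryCache, the search_fn callback and the least-recently-used eviction policy are all assumptions made for illustration.

```python
from collections import OrderedDict


class QueryCache:
    """Hypothetical LRU cache in front of an expensive search function."""

    def __init__(self, search_fn, capacity=1024):
        self.search_fn = search_fn    # assumed: the slow, full search
        self.capacity = capacity
        self._store = OrderedDict()   # normalised query -> cached results
        self.hits = 0
        self.misses = 0

    @staticmethod
    def normalise(query):
        # Treat queries differing only in case or spacing as the same query.
        return " ".join(query.lower().split())

    def search(self, query):
        key = self.normalise(query)
        if key in self._store:
            # Repeated query: return the stored results without searching.
            self.hits += 1
            self._store.move_to_end(key)
            return self._store[key]
        # Unique query: this is the only case that is fully processed.
        self.misses += 1
        results = self.search_fn(key)
        self._store[key] = results
        if len(self._store) > self.capacity:
            self._store.popitem(last=False)  # evict least recently used
        return results
```

If roughly one query in five is unique, as the vendor suggests, a cache of this kind would hand the remaining four straight back from memory, provided the cached entries are still valid for the current index.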
The improvements to indexing and searching are based on new algorithms for caching search requests, so only new information is added to a modified index and the system spends less time re-coding the same queries and results multiple times. SearchInform claims that this improves not only the speed but also the quality of the data returned, making the application three times faster than other existing search systems.
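The idea of adding only new information to an existing index can be sketched as follows. The IncrementalIndex class and its methods are hypothetical names for a simple inverted index and are not taken from the product; the sketch only shows that documents already indexed are skipped rather than processed again.

```python
from collections import defaultdict


class IncrementalIndex:
    """Hypothetical inverted index that only processes unseen documents."""

    def __init__(self):
        self.postings = defaultdict(set)  # term -> ids of documents containing it
        self.indexed = set()              # ids of documents already processed

    def add_documents(self, docs):
        """Index a dict of doc_id -> text; return how many were newly added."""
        added = 0
        for doc_id, text in docs.items():
            if doc_id in self.indexed:
                continue  # already in the index, so no work is repeated
            for term in set(text.lower().split()):
                self.postings[term].add(doc_id)
            self.indexed.add(doc_id)
            added += 1
        return added

    def lookup(self, term):
        # Return a copy so callers cannot modify the index by accident.
        return set(self.postings.get(term.lower(), set()))
```

A real system would also have to handle documents that change or are deleted, which this sketch leaves out.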
The programme is also able to search a variety of data sources, such as text, Word documents, PDF, HTM and HTML files, as well as any SQL-supported database. It can also search MP3 and AVI tags and the logs of the MSN and ICQ instant messaging programs. There is also the option to refine a search with an “important words” function, which the application prioritises whilst performing a similarity analysis.
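One way to picture how "important words" might influence a similarity analysis is a weighted word-overlap score. The function below is an assumption made purely for illustration: the announcement does not say how SearchInform weights such words, and the name weighted_similarity, the boost parameter and the Jaccard-style formula are all hypothetical.

```python
def weighted_similarity(query, document, important_words=(), boost=3.0):
    """Weighted Jaccard overlap of word sets; important words count more."""
    q = set(query.lower().split())
    d = set(document.lower().split())
    important = {w.lower() for w in important_words}

    def weight(word):
        # Assumed scheme: important words weigh `boost`, all others 1.0.
        return boost if word in important else 1.0

    union = q | d
    if not union:
        return 0.0
    shared = sum(weight(w) for w in q & d)
    total = sum(weight(w) for w in union)
    return shared / total


query = "merger contract draft"
doc_a = "merger press release"
doc_b = "contract draft notes"

# Without important words, doc_b shares more words with the query.
plain = (weighted_similarity(query, doc_a), weighted_similarity(query, doc_b))

# Marking "merger" as important lifts doc_a above doc_b.
boosted = (
    weighted_similarity(query, doc_a, important_words=["merger"]),
    weighted_similarity(query, doc_b, important_words=["merger"]),
)
```

In this toy example the ranking of the two documents flips once "merger" is marked as important, which is the kind of prioritisation the feature appears to offer.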