Building Search Systems for the Enterprise
Shivakumar Vaithyanathan, IBM Research
In contrast to the radical advances in Web search over the last several years, search over enterprise intranets has remained a difficult and largely unsolved problem. One reason is that content creators lack the economic incentive to make the relevance of their pages easily discoverable. Towards a solution, we developed the notion of a search database system, which is designed around two core necessities. First, auxiliary databases compiled from corpus-text analysis are crucial to understanding the intent of a search query and the degree to which a given document meets this intent. Second, administrators should be able to hook into the ranking logic in order to specify critical up-to-date knowledge, and thereby ensure high quality even as content, trends, and users change over time. In a search database system, such knowledge is specified by means of a simple yet expressive rule language for query rewriting. I will describe our efforts to develop a robust implementation of a search database system for the IBM intranet, which will go live in early August 2011. I will also discuss our theoretical modeling of search database systems and some of the research directions worth pursuing.