Building Search Systems for the Enterprise

Shivakumar Vaithyanathan, IBM Research

Abstract

In contrast to the radical advances in Web search over the last several years, search over enterprise intranets has remained a difficult and largely unsolved problem. One reason is that content creators lack the economic incentive to make the relevance of their pages easily discoverable. Towards a solution, we developed the notion of a search database system, which is designed around two core necessities. First, auxiliary databases compiled from corpus-text analysis are crucial to understanding the intent of a search query and the degree to which a given document meets that intent. Second, administrators should be able to hook into the ranking logic to specify critical, up-to-date knowledge, and thereby ensure high quality even as content, trends, and users change over time. In a search database system, such knowledge is specified by means of a simple yet expressive rule language for query rewriting. I will describe our efforts to develop a robust implementation of a search database system for the IBM intranet, which will go live in early August 2011. I will also discuss our theoretical modeling of search database systems and some of the research directions worth pursuing.

Bio

Shivakumar Vaithyanathan