Machine Learning for IR (Afternoon): Emerging Learning Technologies for Information Retrieval

In recent years, machine learning techniques have been applied successfully to a wide range of information retrieval problems, including Web search, recommendation systems, and online advertising. It is therefore critical for researchers in the information retrieval community to understand the core machine learning techniques. To accommodate audiences with different levels of machine learning background, we divide this tutorial into two sessions: the first session focuses on basic machine learning concepts and tools; the second session introduces more advanced topics and presents recent developments in machine learning and their application to information retrieval. Each session is self-contained, and attendees can register for either session or for both.

1. Machine Learning for IR: Core Learning Technologies for Information Retrieval

This session of the tutorial will cover core machine learning techniques, basic optimization techniques, and key IR applications. In particular, it includes: 1) core concepts in machine learning, such as supervised and unsupervised learning, the bias-variance trade-off, and probabilistic models; 2) useful concepts and algorithms in optimization, including first-order and second-order gradient methods and Expectation-Maximization (EM); 3) the application of machine learning methods to key information retrieval problems, including text classification, collaborative filtering, clustering, and learning to rank.

2. Machine Learning for IR: Emerging Learning Technologies for Information Retrieval

The second session of the tutorial will cover more advanced machine learning techniques that have started to be used in information retrieval applications. In particular, it will cover: 1) advanced optimization techniques, including stochastic optimization and smooth minimization; 2) emerging learning techniques such as multiple-instance learning, active learning, and semi-supervised learning.

Presenters:

Dr. Si is an assistant professor in the Computer Science Department and, by courtesy, the Statistics Department at Purdue University. His research interests include information retrieval, machine learning techniques and applications, and text mining. Dr. Si is an associate editor of ACM Transactions on Information Systems and an editorial board member of Information Processing and Management. He received an NSF CAREER Award in 2008 and obtained his Ph.D. from Carnegie Mellon University in 2006. Dr. Si has collaborated with companies including Yahoo! and Google.

Dr. Jin is an associate professor in the Computer Science and Engineering Department at Michigan State University. He works in the area of statistical machine learning and its application to information retrieval. Dr. Jin has extensive research experience with a variety of machine learning algorithms, such as conditional exponential models, support vector machines, boosting, and optimization, for applications including information retrieval. Dr. Jin is an associate editor of ACM Transactions on Knowledge Discovery from Data. He received an NSF CAREER Award in 2006 and obtained his Ph.D. from Carnegie Mellon University in 2003. Dr. Jin has collaborated with companies including Yahoo!, Intel, and NEC.