A large-scale machine learning system for recommending heterogeneous content in social networks
Yanxin Shi, Facebook Inc.
The goal of Facebook’s recommendation engine is to compare and rank heterogeneous types of content in order to find the most relevant recommendations based on user preference and page context. Such a recommendation engine faces several challenges: 1) the online queries being processed are at very large scale; 2) with new content types and new user-generated content constantly added to the system, the candidate object set and the underlying data distribution change rapidly; 3) different types of content usually have very distinct characteristics, which makes generic feature engineering difficult; and 4) unlike a search engine, which can capture users’ intent from their search queries, our recommendation engine must rely more on users’ profiles and interests, past behaviors, and current actions in order to infer their cognitive states. In this presentation, we introduce an effective, scalable, online machine learning framework that we developed to address these challenges. We also discuss the insights, approaches, and experience we have accumulated during our research and development process.
Yanxin Shi joined Facebook in June 2008 as a machine learning engineer on the recommendation engine team. His main contributions include designing the recommendation engine architecture, implementing the machine learning system, and experimenting with a wide range of features. He received his Bachelor’s degree from Tsinghua University, Beijing, in 2005 and his Master’s degree in Computer Science from Carnegie Mellon University in 2007. He is currently on leave from the Computer Science PhD program at Stanford University. His research interests focus on machine learning with applications to natural language processing, systems biology, and information retrieval. He has published papers in, given presentations at, and served as a reviewer for a number of top journals and conferences.