Source author record

Suju Rajan

Suju Rajan appears in the imported research catalog. Authorship, coauthor and topic links are available while profile ownership is still unclaimed.

Researcher | Unclaimed source record

Information Retrieval, Machine Learning

Catalog footprint

What is connected

4 works

2 topics

4 close collaborators

Preprint, 2016, arXiv

Geometry Aware Mappings for High Dimensional Sparse Factors

While matrix factorisation models are ubiquitous in large scale recommendation and search, real time application of such models requires inner product computations over an intractably large set of item factors. In this manuscript we present a novel framework that uses the inverted index representation to exploit structural properties of sparse vectors to significantly reduce the run time computational cost of factorisation models. We develop techniques that use geometry aware permutation maps on a tessellated unit sphere to obtain high dimensional sparse embeddings for latent factors with sparsity patterns related to angular closeness of the original latent factors. We also design several efficient and deterministic realisations within this framework and demonstrate with experiments that our techniques lead to faster run time operation with minimal loss of accuracy.
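As a rough illustration of why an inverted index helps with sparse factors, the sketch below scores items against a sparse query by visiting only the dimensions the query actually uses. It shows generic inverted-index scoring only, not the geometry aware permutation maps described in the abstract; the function names and toy data are hypothetical.

```python
from collections import defaultdict


def build_inverted_index(item_factors):
    """Map each dimension to the (item_id, value) pairs that are nonzero there."""
    index = defaultdict(list)
    for item_id, factor in item_factors.items():
        for dim, value in factor.items():
            if value != 0.0:
                index[dim].append((item_id, value))
    return index


def sparse_scores(query, index):
    """Accumulate inner products, touching only dimensions present in the query."""
    scores = defaultdict(float)
    for dim, q_value in query.items():
        for item_id, value in index.get(dim, ()):
            scores[item_id] += q_value * value
    return dict(scores)


# Toy sparse item factors stored as {dimension: value} (illustrative data only).
items = {
    "a": {0: 1.0, 3: 2.0},
    "b": {1: 0.5},
    "c": {3: 1.0, 4: 4.0},
}
index = build_inverted_index(items)
scores = sparse_scores({3: 2.0, 4: 1.0}, index)
print(scores)
```

Item "b" shares no nonzero dimension with the query, so it is never visited. The cost therefore scales with the overlap in sparsity patterns rather than with the total number of items.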

Preprint, 2016, arXiv

Learning Optimal Card Ranking from Query Reformulation

Mobile search has recently been shown to be the major contributor to the growing search market. The key difference between mobile search and desktop search is that information presentation is limited to the screen space of the mobile device. Thus, major search engines have adopted a new type of search result presentation, known as information cards, in which each card presents summarized results from one domain/vertical, for a given query, to augment the standard blue-links search results. While it has been widely acknowledged that information cards are particularly suited to mobile user experience, it is also challenging to optimize such result sets. Typically, user engagement metrics like query reformulation are based on the whole ranked list of cards for each query, whereas most traditional learning-to-rank algorithms require per-item relevance labels. In this paper, we investigate the possibility of interpreting query reformulation into effective relevance labels for query-card pairs. We inherit the concept of conventional learning-to-rank, and propose pointwise, pairwise and listwise interpretations for query reformulation. In addition, we propose a learning-to-label strategy that learns the contribution of each card, with respect to a query, where such contributions can be used as labels for training card ranking models. We utilize a state-of-the-art ranking model and demonstrate the effectiveness of the proposed mechanisms on large-scale mobile data from a major search engine, showing that models trained from labels derived from user engagement can significantly outperform ones trained from human judgment labels.

Preprint, 2016, arXiv

The Apps You Use Bring The Blogs to Follow

We tackle the blog recommendation problem in Tumblr for mobile users in this paper. Blog recommendation is challenging since most mobile users suffer from the cold-start problem when the user follows only a limited number of blogs. To address this problem specifically in the mobile domain, we take into account mobile apps, which typically provide rich information about the users. Based on the assumption that user interests are reflected in app usage patterns, we propose to exploit app usage data for improving blog recommendation. Building on the state-of-the-art recommendation framework, Factorization Machines (FM), we implement app-based FM that integrates app usage data with the user-blog follow relations. In this approach the blog recommendation is generated based not only on the blogs that the user followed before, but also on the apps that the user has often used. We demonstrate in a series of experiments that app-based FM can outperform alternative approaches to a significant extent. Our experimental results also show that exploiting app usage information is particularly effective for improving blog recommendation quality for cold-start users.
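For readers unfamiliar with Factorization Machines, here is a minimal sketch of the standard second-order FM score, in which every pair of active features interacts through the inner product of their latent vectors. The feature layout (user, blog, then app indicators) and all numbers are assumptions made for illustration; the abstract does not specify how app usage is encoded.

```python
def fm_predict(x, w0, w, V):
    """Second-order Factorization Machine score for one feature vector.

    x: feature values; w0: global bias; w: linear weights;
    V: one latent vector (list of k floats) per feature.
    """
    linear = sum(w_i * x_i for w_i, x_i in zip(w, x))
    k = len(V[0])
    pairwise = 0.0
    for f in range(k):
        # Identity: sum_{i<j} v_if v_jf x_i x_j
        #   = 0.5 * ((sum_i v_if x_i)^2 - sum_i (v_if x_i)^2)
        s = sum(V[i][f] * x[i] for i in range(len(x)))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(len(x)))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise


# Hypothetical layout: [user, blog, app_1, app_2]; app_2 is not used by this user.
x = [1.0, 1.0, 1.0, 0.0]
w0 = 0.0
w = [0.0, 0.0, 0.0, 0.0]
V = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 2.0]]
score = fm_predict(x, w0, w, V)
print(score)
```

Because the app indicator sits in the same feature vector as the user and blog, a blog can receive a nonzero interaction score through the app's latent vector even when the user-blog interaction itself is zero, which is the intuition behind using side information for cold-start users.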

Preprint, 2015, arXiv

Large-Scale Distributed Bayesian Matrix Factorization using Stochastic Gradient MCMC

Despite having various attractive qualities such as high prediction accuracy and the ability to quantify uncertainty and avoid over-fitting, Bayesian Matrix Factorization has not been widely adopted because of the prohibitive cost of inference. In this paper, we propose a scalable distributed Bayesian matrix factorization algorithm using stochastic gradient MCMC. Our algorithm, based on Distributed Stochastic Gradient Langevin Dynamics, can not only match the prediction accuracy of standard MCMC methods like Gibbs sampling, but at the same time is as fast and simple as stochastic gradient descent. In our experiments, we show that our algorithm can achieve the same level of prediction accuracy as Gibbs sampling an order of magnitude faster. We also show that our method reduces the prediction error as fast as distributed stochastic gradient descent, achieving a 4.1% improvement in RMSE for the Netflix dataset and a 1.8% improvement for the Yahoo music dataset.
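To make the "as simple as stochastic gradient descent" claim concrete, the following single-machine sketch applies one Stochastic Gradient Langevin Dynamics update to matrix factorization factors: a half-step along the rescaled minibatch gradient of the log posterior plus Gaussian noise whose variance equals the step size. It is not the distributed algorithm from the paper. The Gaussian likelihood precision `tau`, the Gaussian prior precision `lam`, and the choice to update only rows touched by the minibatch are simplifying assumptions.

```python
import math
import random


def sgld_step(U, V, batch, n_total, eps, tau=1.0, lam=1.0, rng=random):
    """One SGLD update of user factors U and item factors V, in place.

    U, V: dicts mapping ids to lists of floats.
    batch: list of (user, item, rating); n_total: number of observed ratings.
    """
    scale = n_total / len(batch)  # rescale the minibatch gradient to the full data
    grad_U, grad_V = {}, {}
    for u, i, r in batch:
        err = r - sum(a * b for a, b in zip(U[u], V[i]))
        gu = grad_U.setdefault(u, [0.0] * len(U[u]))
        gv = grad_V.setdefault(i, [0.0] * len(V[i]))
        for f in range(len(U[u])):
            gu[f] += scale * tau * err * V[i][f]
            gv[f] += scale * tau * err * U[u][f]
    # Simplification: only rows seen in this minibatch are updated.
    for params, grads in ((U, grad_U), (V, grad_V)):
        for key, g in grads.items():
            for f in range(len(g)):
                drift = g[f] - lam * params[key][f]  # likelihood plus prior gradient
                noise = rng.gauss(0.0, math.sqrt(eps))  # injected Langevin noise
                params[key][f] += 0.5 * eps * drift + noise


# Toy usage with hypothetical values.
U = {0: [0.1, 0.2]}
V = {0: [0.3, 0.4]}
sgld_step(U, V, [(0, 0, 1.0)], n_total=1, eps=0.01, rng=random.Random(0))
print(U, V)
```

Without the noise term the update reduces to a plain stochastic gradient step on the log posterior. With it, the iterates can be treated as approximate posterior samples once the step size is small enough.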
