I am a product data scientist at Twitter, the service that enables users to inform and stay informed about what's happening.
Formerly, I was a data scientist at Paytm Labs, where I built data products for Paytm, a mobile recharge and payments platform. Before the Labs, I was a data scientist and early engineer at Rubikloud Technologies, a retail analytics company.
Prior to Rubikloud, I was a graduate student in Applied Mathematics at Ryerson University. At Ryerson, I worked with Prof. Anthony Bonato on the design of methods to assess the fit of random graph models to online social networks. Before Ryerson, I studied Artificial Intelligence at Amirkabir University of Technology and defended my thesis in July 2012. At Amirkabir, I was a member of the Image Processing and Pattern Recognition Lab led by Prof. Mohammad Rahmati. From 2011 to 2012, I had the great opportunity to work with Prof. Bob L. Sturm (then at Aalborg University Copenhagen) on a number of machine learning projects in feature learning and audio classification. I studied Software Engineering at the University of Tehran and graduated in June 2009. [CV]
Abstract. Master's thesis defense slides.
Abstract. We propose sparse approximation weighted regression (SPARROW), a nonparametric method of regression that takes advantage of the sparse linear approximation of a query point. SPARROW employs weights based on sparse approximation in the context of locally constant, locally linear, and locally quadratic regression to generate better estimates than, for example, k-nearest neighbor regression and, more generally, kernel-weighted local polynomial regression. Our experimental results show that SPARROW performs competitively.
Abstract. Sparse coding is achieved by solving an underdetermined system of linear equations under sparsity constraints. We briefly look at several algorithms that solve the resulting optimization problem (exactly or approximately). We then see how this optimization principle can be applied in both supervised and unsupervised contexts: multiclass classification and feature learning, respectively. Next, we talk about dictionary learning and some of its well-known instances. Applications of dictionary learning include image denoising and inpainting.
Abstract. An image can be represented at different levels, starting from pixels, going on to edges, to parts, to objects, and beyond. Over the years, many attempts have been made at engineering useful descriptors that are able to extract low- to high-level features from images. But what if we could make this process automatic? What if we could "learn" to detect layer after layer of features of increasing abstraction and complexity? After all, it would be impossible for us to foresee and hard-code all the kinds of invariances necessary to build features for our ever more complicated tasks. In this talk, we go over several unsupervised feature learning methods that have been in the making since 2006.
Details. This is a brief tutorial on learning theory for a machine learning class.
Details. This is a brief tutorial on applying the EM algorithm for estimating the parameters of a Gaussian mixture model.
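As a rough illustration of the technique this tutorial covers, the sketch below runs EM on a one-dimensional Gaussian mixture using only the Python standard library. It is not taken from the tutorial itself: the function names `normal_pdf` and `em_gmm_1d`, the quantile-based initialization, the fixed iteration count, and the `min_var` floor are all assumptions made for this example.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, k=2, n_iter=100, min_var=1e-6):
    """Fit a k-component 1-D Gaussian mixture with EM (illustrative sketch).

    Returns (weights, means, variances), each a list of length k.
    """
    n = len(data)
    ordered = sorted(data)
    # Assumed initialization: means at evenly spaced quantiles of the data,
    # a shared variance equal to the overall variance, and uniform weights.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    grand_mean = sum(data) / n
    grand_var = sum((x - grand_mean) ** 2 for x in data) / n
    variances = [max(grand_var, min_var)] * k
    weights = [1.0 / k] * k

    for _ in range(n_iter):
        # E-step: responsibility of each component for each data point.
        resp = []
        for x in data:
            joint = [weights[j] * normal_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(joint)
            resp.append([p / total for p in joint])

        # M-step: re-estimate weights, means, and variances from responsibilities.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj == 0.0:
                continue  # leave an empty component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, min_var)  # floor avoids a collapsed component

    return weights, means, variances


if __name__ == "__main__":
    rng = random.Random(0)
    sample = [rng.gauss(-4.0, 1.0) for _ in range(300)]
    sample += [rng.gauss(4.0, 1.0) for _ in range(300)]
    print(em_gmm_1d(sample, k=2))
```

A practical implementation would typically monitor the log-likelihood for convergence and work in log space for numerical stability; both are omitted here to keep the E-step and M-step easy to read.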
Abstract. Apart from raising computational costs, high-dimensional data behave in counterintuitive ways. In this seminar, we talk about why, in some situations, more features fail to increase accuracy in clustering and classification tasks. To deal with the "curses of dimensionality", many dimensionality reduction (DR) methods have been proposed. These methods map the data points to a lower-dimensional space while preserving the important properties of the data in its original space. We go over one linear and two nonlinear DR methods. Then, through some examples, we see how the prior assumptions and computational complexity of each method affect its application in reducing the dimensionality of certain datasets.
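One commonly cited example of this counterintuitive behavior is distance concentration: for points drawn uniformly from a unit hypercube, the nearest and farthest neighbors of a query point become almost equally far away as the dimension grows. The sketch below measures that effect. It is not necessarily an example used in the seminar, and the function name `relative_contrast`, the uniform sampling, and the sample size are assumptions made for illustration.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """Return (d_max - d_min) / d_min for one random query point.

    The query and the n_points sample points are drawn uniformly from the
    unit hypercube [0, 1]^dim; distances are Euclidean.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.sqrt(sum((q - p) ** 2 for q, p in zip(query, point))))
    return (max(dists) - min(dists)) / min(dists)


if __name__ == "__main__":
    # The contrast shrinks as the dimension grows, so "nearest" becomes
    # less meaningful for distance-based clustering and classification.
    for dim in (2, 10, 100, 1000):
        print(dim, round(relative_contrast(dim), 3))
```

When the relative contrast is close to zero, nearest-neighbor queries can no longer separate near points from far ones, which is one reason distance-based methods may stop improving as features are added.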