### ML pubs

• Efficient classification based on sparse regression [thesis, translation, slides]
MSc Thesis, Department of CEIT, Amirkabir University of Technology, July 2012.

• Regression with sparse approximations of data [paper, poster]
with B. L. Sturm
European Signal Processing Conference (EUSIPCO), 2012.

• On automatic music genre recognition by sparse representation classification using auditory temporal modulations [paper, discussion]
with B. L. Sturm
Computer Music Modeling and Retrieval, Lecture Notes in Computer Science series. Springer, 2012.

### ML talks

• Efficient classification based on sparse regression
AUT, July 17, 2012. [slides]

Abstract. Master's thesis defense slides.

• SPARROW: SPARse appROximation Weighted regression
UdeM, March 12, 2012 and SUT, February 22, 2012. [slides]

Abstract. We propose sparse approximation weighted regression (SPARROW), a nonparametric regression method that takes advantage of the sparse linear approximation of a query point. SPARROW employs weights based on sparse approximation in the context of locally constant, locally linear, and locally quadratic regression to generate better estimates than, e.g., k-nearest neighbor regression and, more generally, kernel-weighted local polynomial regression. Our experimental results show that SPARROW performs competitively.
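
For orientation, here is a minimal sketch of the baseline named in the abstract: kernel-weighted locally constant regression (the degree-zero case of local polynomial regression). It is not SPARROW itself. According to the abstract, SPARROW would replace the kernel weights with weights derived from a sparse approximation of the query point. The Gaussian kernel, the `bandwidth` parameter, and the function name are illustrative assumptions, not taken from the talk.

```python
import math


def kernel_regression(x_train, y_train, x_query, bandwidth=0.5):
    """Locally constant (weighted-average) estimate at x_query.

    Assumption: a Gaussian kernel on 1-D inputs; any nonnegative
    weighting scheme could be substituted here.
    """
    weights = [
        math.exp(-0.5 * ((x - x_query) / bandwidth) ** 2) for x in x_train
    ]
    total = sum(weights)
    # Weighted average of the training targets.
    return sum(w * y for w, y in zip(weights, y_train)) / total


# Toy usage: noiseless samples of y = x^2, queried between grid points.
xs = [i / 10.0 for i in range(21)]
ys = [x * x for x in xs]
print(kernel_regression(xs, ys, 1.05, bandwidth=0.1))
```

The only part a different weighting scheme would change is the computation of `weights`; the weighted average stays the same.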

• Sparse coding and dictionary learning
SUT, October 5, 2011. [slides]

Abstract. Sparse coding is achieved by solving an under-determined system of linear equations under sparsity constraints. We briefly look at several algorithms that solve the resulting optimization problem (exactly or approximately). We then see how this optimization principle can be applied in both supervised and unsupervised contexts: multiclass classification and feature learning, respectively. Next, we talk about dictionary learning and some of its well-known instances. Applications of dictionary learning include image denoising and inpainting.
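
As a rough illustration of sparse coding on an under-determined system, the sketch below uses iterative soft-thresholding (ISTA) to approximately solve the $\ell_1$-penalized problem $\min_x \tfrac12\|y - Dx\|_2^2 + \lambda\|x\|_1$. ISTA is one common choice and is an assumption here; the abstract does not say which algorithms the talk covers. The dictionary `D`, signal `y`, and penalty `lam` are toy values.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |v|."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def objective(D, y, x, lam):
    """0.5 * ||y - D x||^2 + lam * ||x||_1 for a row-major matrix D."""
    residual = [yi - sum(d * xj for d, xj in zip(row, x)) for row, yi in zip(D, y)]
    return 0.5 * sum(r * r for r in residual) + lam * sum(abs(xj) for xj in x)


def ista(D, y, lam, n_iter=500):
    m, n = len(D), len(D[0])
    # The squared Frobenius norm upper-bounds the largest eigenvalue of
    # D^T D, so its reciprocal is a safe (conservative) step size.
    step = 1.0 / sum(d * d for row in D for d in row)
    x = [0.0] * n
    for _ in range(n_iter):
        residual = [sum(D[i][j] * x[j] for j in range(n)) - y[i] for i in range(m)]
        grad = [sum(D[i][j] * residual[i] for i in range(m)) for j in range(n)]
        x = [soft_threshold(x[j] - step * grad[j], step * lam) for j in range(n)]
    return x


# Two equations, three unknowns: y equals the first dictionary atom.
D = [[1.0, 0.0, 0.6],
     [0.0, 1.0, 0.8]]
y = [1.0, 0.0]
x = ista(D, y, lam=0.1)
print(x, objective(D, y, x, 0.1))
```

The system has infinitely many exact solutions; the $\ell_1$ penalty selects one that uses few atoms, at the cost of slightly shrinking the active coefficient.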

• Feature learning with deep networks for image classification
SUT, May 18, 2011. [slides]

Abstract. An image can be represented at different levels, starting from pixels, going on to edges, to parts, to objects, and beyond. Over the years, many attempts have been made at engineering useful descriptors that are able to extract low- to high-level features from images. But what if we could make this process automatic? What if we could "learn" to detect layer after layer of features of increasing abstraction and complexity? After all, it would be impossible for us to foresee and hard-code all the kinds of invariances necessary to build features for our ever more complicated tasks. In this talk, we go over several unsupervised feature learning methods that have been in the making since 2006.

• Computational learning theory
AUT, April 26, 2011. [slides]

Details. This is a brief tutorial on learning theory for a machine learning class.

• Parametric density estimation using GMMs
AUT, February 1, 2011. [slides]

Details. This is a brief tutorial on applying the EM algorithm for estimating the parameters of a Gaussian mixture model.
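
A minimal sketch of EM for a one-dimensional Gaussian mixture, to make the two alternating steps concrete. The data, initial values, iteration count, and the small variance floor are illustrative assumptions and are not taken from the slides.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, means, variances, weights, n_iter=100):
    """Fit a 1-D Gaussian mixture by EM, starting from the given parameters."""
    means, variances, weights = list(means), list(variances), list(weights)
    k, n = len(means), len(data)
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            p = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(p)
            resp.append([pj / total for pj in p])
        # M-step: re-estimate parameters from responsibility-weighted data.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            variances[j] = (
                sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
                + 1e-6  # small floor (an assumption) to avoid collapse
            )
            weights[j] = nj / n
    return means, variances, weights


# Toy data: two well-separated clusters.
rng = random.Random(0)
data = [rng.gauss(-2.0, 0.5) for _ in range(300)] + [rng.gauss(3.0, 1.0) for _ in range(300)]
print(em_gmm_1d(data, [-1.0, 1.0], [1.0, 1.0], [0.5, 0.5]))
```

EM only finds a local optimum of the likelihood, so results depend on the initial parameters; in practice several restarts are common.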

• High dimensional data and dimensionality reduction
IPM, November 4, 2010. [slides]

Abstract. Apart from raising computational costs, high-dimensional data behave in counterintuitive ways. In this seminar, we talk about why, in some situations, more features fail to result in increased accuracy in clustering and classification tasks. To deal with the "curses of dimensionality", many dimensionality reduction (DR) methods have been proposed. These methods map the data points to a lower-dimensional space while preserving the important properties of the data in its original space. We go over one linear and two nonlinear DR methods. Then, through some examples, we see how the prior assumptions and computational complexity of each method affect its application in reducing the dimensionality of certain datasets.
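
One commonly cited counterintuitive effect is distance concentration: for points drawn uniformly at random, the nearest and farthest neighbors of a query become almost equally far away as the dimension grows. The sketch below measures this with a relative contrast, (max − min) / min. Whether the seminar used this particular example is an assumption; the uniform data, sample size, and function name are illustrative.

```python
import math
import random


def relative_contrast(dim, n_points=200, seed=0):
    """(max - min) / min of Euclidean distances from a random query point
    to n_points uniform samples in the unit cube of the given dimension."""
    rng = random.Random(seed)
    query = [rng.random() for _ in range(dim)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(dim)]
        dists.append(math.sqrt(sum((a - b) ** 2 for a, b in zip(query, point))))
    return (max(dists) - min(dists)) / min(dists)


for dim in (2, 10, 100, 500):
    print(dim, round(relative_contrast(dim), 3))
```

A small contrast means "nearest" carries little information, which is one reason distance-based clustering and classification can degrade as features are added.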

• The split Bregman method for total variation denoising
AUT, May 30, 2010. [slides]

Details. This is an overview of the split Bregman method for solving an $\ell_1$-regularized problem arising from TV denoising.
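
A minimal 1-D sketch of the idea, assuming the anisotropic formulation $\min_u \tfrac{\mu}{2}\|u - f\|_2^2 + \|Du\|_1$ with $D$ the forward-difference operator. The method introduces a split variable $d \approx Du$ and a Bregman variable $b$, then alternates a linear solve for $u$, a shrinkage step for $d$, and an update of $b$. The parameter values, the 1-D setting, and the tridiagonal solver are illustrative assumptions; the slides may treat the 2-D image case.

```python
import random


def shrink(v, t):
    """Soft-thresholding: closed-form minimizer of the d-subproblem."""
    return max(abs(v) - t, 0.0) * (1.0 if v > 0 else -1.0)


def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm; lower/upper have length n-1, diag/rhs length n."""
    n = len(diag)
    c, r = [0.0] * n, [0.0] * n
    c[0], r[0] = upper[0] / diag[0], rhs[0] / diag[0]
    for i in range(1, n):
        m = diag[i] - lower[i - 1] * c[i - 1]
        c[i] = upper[i] / m if i < n - 1 else 0.0
        r[i] = (rhs[i] - lower[i - 1] * r[i - 1]) / m
    x = [0.0] * n
    x[-1] = r[-1]
    for i in range(n - 2, -1, -1):
        x[i] = r[i] - c[i] * x[i + 1]
    return x


def tv_denoise_split_bregman(f, mu=10.0, lam=1.0, n_iter=100):
    n = len(f)
    u, d, b = list(f), [0.0] * (n - 1), [0.0] * (n - 1)
    # System matrix mu*I + lam*D^T D is tridiagonal.
    diag = [mu + lam * (1.0 if i in (0, n - 1) else 2.0) for i in range(n)]
    off = [-lam] * (n - 1)
    for _ in range(n_iter):
        v = [d[i] - b[i] for i in range(n - 1)]
        # Right-hand side: mu*f + lam*D^T (d - b).
        rhs = [
            mu * f[j]
            + lam * ((v[j - 1] if j > 0 else 0.0) - (v[j] if j < n - 1 else 0.0))
            for j in range(n)
        ]
        u = solve_tridiagonal(off, diag, off, rhs)
        du = [u[i + 1] - u[i] for i in range(n - 1)]
        d = [shrink(du[i] + b[i], 1.0 / lam) for i in range(n - 1)]
        b = [b[i] + du[i] - d[i] for i in range(n - 1)]
    return u


def total_variation(u):
    return sum(abs(u[i + 1] - u[i]) for i in range(len(u) - 1))


# Toy usage: a noisy step signal.
rng = random.Random(0)
noisy = [(0.0 if i < 50 else 1.0) + rng.gauss(0.0, 0.1) for i in range(100)]
clean = tv_denoise_split_bregman(noisy)
print(total_variation(noisy), total_variation(clean))
```

The appeal of the splitting is that each subproblem is easy: the $u$-update is a linear system and the $d$-update is elementwise shrinkage, so the non-smooth $\ell_1$ term never has to be differentiated.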