
Essays

Follow my blog or subscribe via RSS or email for updates.

Talks

  • General Folders: The first AI-powered data logistics company
    Demo Day: Techstars San Diego Powered by SDSU, December 7, 2023.
    📰 coverage 1, 2

    Abstract. Join us at Snapdragon Stadium for the first-ever Techstars San Diego powered by San Diego State University Demo Day. Meet the incredible cohort of companies as they showcase their progress.

  • Cross-company data exchange for the cloud
    Scale By the Bay: Code and Data in the Age of AI, November 15, 2023.
    📹 video, 📰 coverage 1, 2

    Abstract. Data exchange is integral to business collaboration. However, data exchange pipelines are time-consuming to build, prone to leaks, difficult to monitor, and costly to audit. In this talk, we present an overview of the methods companies use to exchange data. We then discuss solutions that better match the efficiency and security standards of today.

  • Rethinking B2B data exchange and collaboration
    Crunch Conference Budapest, October 6, 2023.
    📹 video, 📰 coverage

    Abstract. Data exchange is integral to business collaboration. However, data exchange pipelines are time-consuming to build, prone to leaks, difficult to monitor, and costly to audit. In this talk, we present an overview of the methods companies use to exchange data. We then discuss solutions that better match the efficiency and security standards of today.

  • The state of cross-company data exchange
    Data Council Austin, March 30, 2023.
    📹 video, 📽 slides, 📃 blog post

    Abstract. Data exchange is integral to every business relationship. Yet data exchange practices are highly manual, prone to leaks, difficult to validate, impossible to monitor, and costly to audit. In this talk, we present an overview of the methods enterprises use to exchange data and the outstanding challenges. We conclude by enumerating the properties of a good solution.

  • Making an impact with data
    with Gorkem Yurtseven and Britt Allen, moderated by Elizabeth Dlha
    Data Mash #2, June 2, 2022.
    📽 slides

    Abstract. After introducing General Folders, we'll review three impactful data projects. First, the design of OKRs to encourage collaboration among product teams at Twitter; second, the feature creation pipeline for fraud detection at Paytm; and finally, sales enablement at Carbon Health via risk quantification.

  • Data transfer challenges in evaluating AI platforms
    apply(meetup), February 10, 2022.
    📹 video, 📃 blog post

    Abstract. Not so long ago, I met with over 30 AI companies to learn about their workflows at the very first step in the evaluation process: data collection and transfer. I had a hunch this part of the pipeline posed challenges. In this talk, I review the myriad roadblocks companies face in providing access to their data. Then I discuss potential solutions.

  • Data Science for tech-enabled healthcare
    with Rebekkah Ismakov
    The AI Summit, October 1, 2020.
    📹 video, 📃 blog post, 📊 data, 🎙 discussion

    Abstract. The first part of the talk is an overview of the Data Science team roadmap and infrastructure decisions, with a tour of the clinical decision support system and covidclinicaldata.org. The second part is a review of our efforts for the COVID-Ready program. We report on recommendations that can be made to employers, based on simulations surfacing how testing cadence and other policies affect outbreaks in the workplace.

  • DJing and the art of audio signal processing
    Twitter HQ, September 6, 2017.

    Abstract. In this talk, we review concepts from the audio signal processing field. We then show how familiarity with these concepts allows for a better understanding of DJing tools and techniques, and vice versa.

Panels

  • Building teams and culture that support ML innovation
    with Ziad Asghar and Ameen Kazerouni, moderated by Sam Charrington
    TWIMLcon, January 22, 2021.
    📹 video

    Abstract. Traditional approaches to managing technical projects can be at odds with achieving success with machine learning. In this session, we discuss how ML and AI executives can build effective teams, support them with the right processes and tools, and shift the broader organizational culture in ways that reinforce innovation in machine learning.

  • Making an impact in data science: when traditional methods fail
    with Eric Glover, Halim Abbas, Kevin Stumpf, and Sean McPherson
    Branch HQ, February 27, 2020.
    📹 video

    Abstract. In this meetup, we hear about data science projects that succeeded in spite of the limitations of existing methodology.

  • Culture & organization for effective ML at scale
    with Eric Colson and Jennifer Prendki, moderated by Maribel Lopez
    TWIMLcon, September 27, 2019.

    Abstract. Hear people who have worked at startups and large corporations across a range of industries share tips for working faster and more efficiently, and for creating an org-wide culture that supports effective ML.

  • Women in Data Science meetup: Growing a career in data science
    with Laura Pruitt, Alexandra Johnson, and Kasia Rachuta, moderated by Chloe Tseng
    Airbnb HQ, March 8, 2018.

    Abstract. Meet women in data science from all over the Bay Area at this WiDS post-conference screening. The event will be an opportunity to meet like-minded women as well as listen to the great lineup of panelists.

Podcasts

  • Making Cross-Company Data Exchange Easy
    with Kostas Pardalis and Eric Dodds
    The Data Stack Show, September 6, 2023.
    🎙 podcast episode

    Abstract. The conversation covers the importance of data collaboration and sharing, the challenges and complexities of data sharing in various industries, the need for efficient and secure solutions, and the underlying definitions and dimensions of the data exchange problem, including infrastructure, security, economics, and user needs.

  • Head of Data Science at Healthcare Tech #93
    with Grant Ingersoll
    Develomentor, June 29, 2020.
    🎙 podcast episode

    Abstract. Thanks to Grant, the episode has turned into a good review of my work history.

Academic talks

  • Modeling the Facebook social network: The memoryless GEO-P graph model
    SOGMSC, May 21, 2014.
    📽 slides

    Abstract. Online social networks are ubiquitous graphs. To test algorithms that scale with the size and order of these networks, we require synthetic samples. In this talk, we go over several methods for generating random graphs representative of online social networks. We are especially interested in the M-GEOP model (Bonato et al., 2014), and in assessing the fit of these models to the Facebook dataset.

  • Efficient classification based on sparse regression
    AUT, July 17, 2012.
    📽 slides

    Abstract. Master's thesis defense slides.

  • SPARROW: SPARse appROximation Weighted regression
    UdeM, March 12, 2012 and SUT, February 22, 2012.
    📽 slides

    Abstract. We propose sparse approximation weighted regression (SPARROW), a nonparametric method of regression that takes advantage of the sparse linear approximation of a query point. SPARROW employs weights based on sparse approximation in the context of locally constant, locally linear, and locally quadratic regression to generate better estimates than, e.g., k-nearest neighbor regression and, more generally, kernel-weighted local polynomial regression. Our experimental results show that SPARROW performs competitively.

  • Sparse coding and dictionary learning
    SUT, October 5, 2011.
    📽 slides

    Abstract. Sparse coding is achieved by solving an under-determined system of linear equations under sparsity constraints. We briefly look at several algorithms that solve the resulting optimization problem (exactly or approximately). We then see how this optimization principle can be applied in both a supervised and unsupervised context: multiclass classification and feature learning, respectively. Next, we talk about dictionary learning and some of its well-known instances. Applications of dictionary learning include image denoising and inpainting.

  • Feature learning with deep networks for image classification
    SUT, May 18, 2011.
    📽 slides

    Abstract. An image can be represented at different levels, starting from pixels, going on to edges, to parts, to objects, and beyond. Over the years, many attempts have been made at engineering useful descriptors that are able to extract low-to-high level features from images. But what if we could make this process automatic? What if we could "learn" to detect layer after layer of features of increasing abstraction and complexity? After all, it would be impossible for us to foresee and hard-code all the kinds of invariances necessary to build features for our ever more complicated tasks. In this talk, we go over several unsupervised feature learning methods that have been in the making since 2006.

  • Computational learning theory
    AUT, April 26, 2011.
    📽 slides

    Details. This is a brief tutorial on learning theory for a machine learning class.

  • Parametric density estimation using GMMs
    AUT, February 1, 2011.
    📽 slides

    Details. This is a brief tutorial on applying the EM algorithm for estimating the parameters of a Gaussian mixture model.

  • High dimensional data and dimensionality reduction
    IPM, November 4, 2010.
    📽 slides

    Abstract. Apart from raising computational costs, high-dimensional data behave in counterintuitive ways. In this seminar, we talk about why, in some situations, more features fail to result in increased accuracy in clustering and classification tasks. To deal with the "curses of dimensionality", many dimensionality reduction (DR) methods have been proposed. These methods map the data points to a lower-dimensional space, while preserving the important properties of the data in its original space. We go over one linear and two nonlinear DR methods. Then, through some examples, we see how the prior assumptions and computational complexity of each method affect its application in reducing the dimensionality of certain datasets.

  • The split Bregman method for total variation denoising
    AUT, May 30, 2010.
    📽 slides

    Details. This is an overview of the split Bregman method for solving an $\ell_1$-regularized problem arising from TV denoising.
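
    For readers unfamiliar with the technique in the last talk above, here is a minimal sketch of split Bregman applied to 1-D total variation denoising. It is not taken from the slides: the function names, the parameters mu and lam, the 1-D anisotropic formulation, and the dense linear solve are all assumptions made for illustration. The problem solved is min_u (mu/2)||u - f||^2 + ||Du||_1, where D is the forward-difference operator, split via the auxiliary variable d = Du.

```python
import numpy as np


def shrink(x, t):
    """Soft-thresholding: the closed-form minimizer of the l1 subproblem."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def tv_denoise_split_bregman(f, mu=5.0, lam=1.0, n_iter=200):
    """Illustrative 1-D TV denoising via split Bregman (hypothetical helper).

    mu weights data fidelity, lam is the penalty on the constraint d = Du.
    Both defaults are arbitrary choices for this sketch.
    """
    f = np.asarray(f, dtype=float)
    n = f.size
    # Forward-difference operator: (D u)[i] = u[i + 1] - u[i], shape (n - 1, n).
    D = np.diff(np.eye(n), axis=0)
    # System matrix of the quadratic u-subproblem (fixed across iterations).
    A = mu * np.eye(n) + lam * D.T @ D

    u = f.copy()
    d = np.zeros(n - 1)  # auxiliary variable standing in for D u
    b = np.zeros(n - 1)  # Bregman (scaled dual) variable
    for _ in range(n_iter):
        # u-step: solve (mu I + lam D^T D) u = mu f + lam D^T (d - b).
        u = np.linalg.solve(A, mu * f + lam * D.T @ (d - b))
        Du = D @ u
        # d-step: soft-threshold with threshold 1 / lam.
        d = shrink(Du + b, 1.0 / lam)
        # Bregman update: accumulate the constraint residual.
        b = b + Du - d
    return u
```

    Calling tv_denoise_split_bregman on a noisy piecewise-constant signal should return a signal of the same length with smaller total variation; larger mu keeps the result closer to the input. A practical implementation would exploit the tridiagonal structure of the system matrix rather than using a dense solve.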

Publications

  • Efficient classification based on sparse regression
    MSc Thesis, Amirkabir University of Technology, July 2012.
    📔 thesis, 📕 translation, 📽 slides

  • Regression with sparse approximations of data
    with Bob L. Sturm
    European Signal Processing Conference (EUSIPCO), 2012.
    📃 paper, 📰 poster

  • On automatic music genre recognition by sparse representation classification using auditory temporal modulations
    with Bob L. Sturm
    Computer Music Modeling and Retrieval: Lecture Notes in Computer Science (LNCS). Springer, 2012.
    📃 paper