CSE Dissertation Proposal Defense: Gustavo Malkomes

Apr 16, 2018
10 a.m.
Jolley Hall, Room 309

"Automated Active Learning for Gaussian Process"

Gustavo Malkomes
Adviser: Roman Garnett

In many scenarios, unlabeled data is abundant but labeled observations are expensive to acquire, since labeling might require slow and costly experimentation or human intervention. Active learning is a machine learning paradigm in which a policy is designed to intelligently select data to achieve a goal. Examples include learning a predictor from fewer examples, hyperparameter optimization, and fast identification of elements of a given class. Active learning is a standard concept in machine learning but is less widely used in other scientific communities. We believe that active learning tools in their current form require too much machine learning expertise to use.

In this work, we propose solutions that further automate active learning. Our core contributions are active learning tools that are easy for non-experts to use yet deliver results competitive with or better than human-expert solutions. We first introduce a novel model selection algorithm for fixed-size datasets, called Bayesian optimization for model selection (BOMS). The search method is based on Bayesian optimization in model space, where we reason about the model evidence as a function to be maximized. We then extend BOMS to active learning, creating a fully automatic active learning framework. Our approach works with any probabilistic model space, but here we focus on a class of Gaussian process models. We apply our framework to several applications, developing completely automated tools for regression inference, Bayesian optimization, model selection, and active search.