
AutoML: What & Comparison & Concerns

What is AutoML?

Automated Machine Learning provides methods and processes to make Machine Learning available to non-Machine Learning experts, to improve the efficiency of Machine Learning, and to accelerate research on Machine Learning.

Machine learning (ML) has achieved considerable successes in recent years and an ever-growing number of disciplines rely on it. However, this success crucially relies on human machine learning experts to perform the following tasks:

  • Preprocess and clean the data.
  • Select and construct appropriate features.
  • Select an appropriate model family.
  • Optimize model hyperparameters.
  • Postprocess machine learning models.
  • Critically analyze the results obtained.

As the complexity of these tasks is often beyond non-ML-experts, the rapid growth of machine learning applications has created a demand for off-the-shelf machine learning methods that can be used easily and without expert knowledge. We call the resulting research area that targets progressive automation of machine learning AutoML.
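For a sense of what these tasks look like in practice, here is a minimal scikit-learn sketch of the pipeline an expert would otherwise assemble by hand; the dataset, feature selector, and parameter grid are illustrative choices only:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Each step is one of the manual tasks AutoML aims to automate:
pipe = Pipeline([
    ("impute", SimpleImputer()),                   # preprocess / clean the data
    ("scale", StandardScaler()),
    ("select", SelectKBest(k=10)),                 # select appropriate features
    ("model", LogisticRegression(max_iter=1000)),  # choose a model family
])

# ... plus hyperparameter optimization over the whole pipeline:
search = GridSearchCV(pipe, {"select__k": [5, 10, 20],
                             "model__C": [0.1, 1.0, 10.0]}, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```

An autoML system searches over choices like these (and many more) automatically, leaving mainly the critical analysis of results to the user.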

Reference: http://www.ml4aad.org/automl/

AutoML comparison


Automatic Machine Learning (autoML) is the process of building Machine Learning models algorithmically, with no human intervention. Several autoML packages are available for building predictive models.

Datasets

In this post we compare three autoML packages (auto-sklearn, h2o, and mljar). The comparison is performed on a binary classification task on 28 datasets from OpenML. The datasets are described below.

Datasets used in the comparison. All datasets are accessible from openml.org by the id provided in the table.

Methodology

  1. Each dataset was divided into train and test sets (70% of samples for training and 30% for testing). All packages were tested on the same data splits.
  2. Each autoML model was trained on the train set with a 1-hour limit on training time.
  3. The final autoML model was used to compute predictions on the test set (samples not used for training).
  4. Logloss was used to assess model performance (the lower the logloss, the better the model). Logloss was selected because it is a more informative measure of predictive quality than plain accuracy.
  5. The process was repeated 10 times (with a different seed for each split). Final results are averages over the 10 repetitions; a sketch of this protocol is given below.
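A minimal sketch of the protocol in Python, assuming a placeholder run_automl callable that wraps whichever package is being benchmarked (the callable and its time-limit argument are hypothetical, since each package exposes its own API):

```python
import numpy as np
from sklearn.metrics import log_loss
from sklearn.model_selection import train_test_split

def evaluate_package(run_automl, X, y, n_repeats=10, time_limit_s=3600):
    """Average test logloss over repeated 70/30 splits.

    run_automl is a placeholder: it should fit an autoML model on the
    training data within time_limit_s seconds and return an object with
    a predict_proba(X) method.
    """
    losses = []
    for seed in range(n_repeats):
        # Fixing the seed lets every package see the same splits.
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.3, random_state=seed)
        model = run_automl(X_tr, y_tr, time_limit_s)
        losses.append(log_loss(y_te, model.predict_proba(X_te)))
    return float(np.mean(losses))
```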

Results

The results are presented in the table and chart below. The best approach for each dataset is bolded.

The average logloss for each method on the test subset, computed over the 10 repetitions.
AutoML packages comparison (the lower the logloss, the better the algorithm).

Discussion

The poor performance of the auto-sklearn algorithm can be explained by the 1-hour limit on training time. Auto-sklearn uses Bayesian optimization for hyperparameter tuning, which is sequential in nature and requires many iterations to find a good solution. The 1-hour training limit was selected from a business perspective: in my opinion, a user of an autoML package would rather wait 1 hour than 72 hours for a result. The h2o results are better than auto-sklearn's on almost all datasets.
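For reference, the 1-hour budget is expressed through each package's time-limit parameter. A sketch using auto-sklearn's time_left_for_this_task and h2o's max_runtime_secs (both documented parameters, though names may change across versions; X_train and y_train are assumed to be the numpy arrays from the split described above):

```python
import pandas as pd

# auto-sklearn: total search budget in seconds.
from autosklearn.classification import AutoSklearnClassifier
askl = AutoSklearnClassifier(time_left_for_this_task=3600)
askl.fit(X_train, y_train)

# h2o AutoML: total budget via max_runtime_secs.
import h2o
from h2o.automl import H2OAutoML

h2o.init()
df = pd.DataFrame(X_train)
df["target"] = y_train
frame = h2o.H2OFrame(df)
frame["target"] = frame["target"].asfactor()  # mark the target as categorical
aml = H2OAutoML(max_runtime_secs=3600, seed=1)
aml.train(y="target", training_frame=frame)
print(aml.leaderboard)
```

Bayesian optimization spends this budget sequentially (each configuration informs the next), while h2o mixes random search with stacked ensembles that parallelize more easily, which may help explain the gap under a tight budget.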

The best results were obtained by the mljar package: it was the best algorithm on 26 of the 28 datasets. On average, it was 47.15% better than auto-sklearn and 13.31% better than the h2o autoML solution.

A useful feature of mljar is its user interface: all models from the optimization are available through a web browser (mljar saves every model obtained during the optimization).

The view with all models trained during the optimization.
The details of a selected model: the hyperparameters used and the learning curves for the train and test folds.

The code used for the comparison is available on GitHub.

The mljar package can be used from Python, from R, or through a web browser.
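As a rough sketch of programmatic use, the open-source mljar-supervised package exposes a scikit-learn-style interface along these lines (the hosted mljar service benchmarked in this post may differ; total_time_limit is that package's documented budget parameter):

```python
# Sketch using the open-source mljar-supervised package; the hosted
# mljar service used in the comparison may expose a different API.
from supervised.automl import AutoML

automl = AutoML(total_time_limit=3600)  # 1-hour budget, as in the benchmark
automl.fit(X_train, y_train)            # numpy arrays or pandas objects
predictions = automl.predict(X_test)
```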


Reference: https://medium.com/@MLJARofficial/automl-comparison-4b01229fae5e

Reference: http://docs.h2o.ai/h2o/latest-stable/h2o-docs/welcome.html

AutoML Concerns

The rise of generally intelligent AI

Much of the AI in the world today was made to accomplish a single, narrow use case, like the translation of a sentence from one language to another, but Dean said he wants Google to create more AI models that can achieve multiple tasks and a kind of "common sense reasoning about the world."

“I think in the future you’re going to see us move more towards models that can do many, many things and then build on that experience of doing those many, so that when we want to train a model to do something else, it can build on that set of skills and expertise that it already has,” he said.

For example, if a robot is asked to pick something up, it will understand things like how a hand works, how gravity works, and other understandings about the world.

“I think that’s going to be an important trend that you’ll see in the next few years,” he said.

AutoML’s bias and opacity challenges

Depending on whom you ask, AutoML, Google's AI that can create other AI models, is either exciting or terrifying.

Machines that train machines surely frighten AI naysayers. But AutoML, said Google Cloud chief scientist Fei-Fei Li, lowers barriers to creating custom AI models for everyone from high-end developers to a ramen shop owner in Tokyo.

Dean finds it exciting because it’s helping Google “automatically solve problems,” but the use of AutoML also presents unique issues.

“Because we’re using more learned systems than traditional sort of hand-coded software, I think that raises a lot of challenges for us that we’re tackling,” he said. “So one is if you learn from data and that data has biased decisions in it already, then the machine learning models who learn can themselves perpetuate those biases. And so there’s a lot of work that we’re doing, and others in the machine learning community, to figure out how we can train machine learning models that don’t have forms of bias.”

Another challenge: how to properly design safety-critical systems with AutoML to create AI for industries like health care. Decades of computer science best practices have been established for hand-coding such systems, and the same must be done for machines making machines.

It’s one thing to get something wrong when you’re classifying the species of a dog, Dean said; it’s another thing entirely to make mistakes in safety-critical systems.

“I think that’s a really interesting and important direction for us to apply, particularly as we start to get machine learning in more safety-critical kinds of systems, things that are making decisions about your health care or an autonomous car,” he said.

Safety-critical AI needs more transparency

Alongside news that Google Assistant will soon make phone calls for you and the release of the Android P beta, on Tuesday CEO Sundar Pichai talked about how Google is applying AI to health care to predict patient readmission based on information drawn from electronic health records.

An article by Google researchers, published Tuesday in the Nature partner journal npj Digital Medicine, explains with examples why its AI made certain decisions about a patient, so that doctors can see the reasoning behind a recommendation in medical records. In the future, Dean hopes a developer or doctor who wants to know why an AI made a specific decision will be able to simply ask the AI model and get a response.

Today, the implementation of AI in Google products goes through an internal review process, Dean said. Google is currently developing a set of guidelines for how to assess whether or not an AI model contains bias.

“What you want is essentially, just like security review or privacy review for new features in products, you want an ML fairness review that’s part of integrating machine learning into our products,” he said.

Humans should also be part of the decision-making process, Dean said, when it comes to AI implemented by developers through tools like ML Kit or TensorFlow, which has been downloaded more than 13 million times.

Drawing the line at AI weaponry

In response to a question, Dean said he does not believe Google should be in the business of making autonomous weaponry.

In March, news broke that Google was working with the Department of Defense to improve its analysis of footage gathered by drones.

“I think there are a number of interesting ethical questions about machine learning and AI as we as a society start to develop more powerful techniques,” he said. “I personally have signed a letter, an open letter about six or nine months ago — don’t know exactly when — saying that I was opposed to using machine learning for autonomous weapons. I think obviously there’s a continuum of what decisions we want to make as a company, so should we offer Gmail to military services that want to use it? That seems fine to me. I think most people have qualms about using autonomous weapons systems.”

Thousands of Google employees, according to the New York Times, have signed a letter stating that Google should stay out of the creation of "warfare technology," which could cause irreparable damage to Google's brand and to trust between the company and the public. Dean did not specify whether he signed the letter referenced in the New York Times reporting.

AI drives new projects and products

Alongside the patient-readmission AI and a Gboard designed to understand Morse code, Pichai also highlighted a previously released study of AI that detected diabetic retinopathy and predicted problems as accurately as highly trained ophthalmologists did.

AI models with that level of intelligence are beginning to do more than imitate human activity. They're helping Google discover new products and services.

“By training these models on large amounts of data, we can actually make systems that can do things that we didn’t know we could do, and that’s a really fundamental advance,” Dean said. “We’re now creating entirely new kinds of tests and products proven by AI, rather than using AI to do things we think we want to be able to do but just need the training system.”

Reference: https://venturebeat.com/2018/05/09/googles-ai-chief-on-automl-autonomous-weapons-and-the-future/


Fast Bayesian Optimization of Machine Learning Hyperparameters on Large Datasets

Abstract

Bayesian optimization has become a successful tool for hyperparameter optimization of machine learning algorithms, such as support vector machines or deep neural networks. Despite its success, for large datasets, training and validating a single configuration often takes hours, days, or even weeks, which limits the achievable performance. To accelerate hyperparameter optimization, we propose a generative model for the validation error as a function of training set size, which is learned during the optimization process and allows exploration of preliminary configurations on small subsets, by extrapolating to the full dataset. We construct a Bayesian optimization procedure, dubbed FABOLAS, which models loss and training time as a function of dataset size and automatically trades off high information gain about the global optimum against computational cost. Experiments optimizing support vector machines and deep neural networks show that FABOLAS often finds high-quality solutions 10 to 100 times faster than other state-of-the-art Bayesian optimization methods or the recently proposed bandit strategy Hyperband.

http://proceedings.mlr.press/v54/klein17a/klein17a.pdf

Aaron Klein, Stefan Falkner, Simon Bartels, Philipp Hennig, Frank Hutter
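FABOLAS itself places a Bayesian model over both the hyperparameter configuration and the training-set size, but the core intuition, evaluating candidates cheaply on subsets and extrapolating the learning curve to the full dataset, can be illustrated with a simple power-law fit. The snippet below is a sketch of that intuition only, not the paper's algorithm; the dataset and SVM configuration are arbitrary:

```python
import numpy as np
from scipy.optimize import curve_fit
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

# Validation error of one configuration on growing training subsets.
sizes = np.array([100, 200, 400, 800], dtype=float)
errs = []
for s in sizes:
    clf = SVC(C=1.0, gamma=0.001).fit(X_tr[:int(s)], y_tr[:int(s)])
    errs.append(1.0 - clf.score(X_val, y_val))

# Fit err(s) = a * s**(-b) + c and extrapolate to the full training set.
def power_law(s, a, b, c):
    return a * s ** (-b) + c

params, _ = curve_fit(power_law, sizes, np.array(errs),
                      p0=[1.0, 0.5, 0.05], maxfev=10000)
print("extrapolated full-data error:", power_law(len(X_tr), *params))
```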


Large-Scale Evolution of Image Classifiers

Abstract

Neural networks have proven effective at solving difficult problems but designing their architectures can be challenging, even for image classification problems alone. Our goal is to minimize human participation, so we employ evolutionary algorithms to discover such networks automatically. Despite significant computational requirements, we show that it is now possible to evolve models with accuracies within the range of those published in the last year. Specifically, we employ simple evolutionary techniques at unprecedented scales to discover models for the CIFAR-10 and CIFAR-100 datasets, starting from trivial initial conditions and reaching accuracies of 94.6% (95.6% for ensemble) and 77.0%, respectively. To do this, we use novel and intuitive mutation operators that navigate large search spaces; we stress that no human participation is required once evolution starts and that the output is a fully-trained model. Throughout this work, we place special emphasis on the repeatability of results, the variability in the outcomes and the computational requirements.

https://arxiv.org/pdf/1703.01041.pdf

Key points:

  1. Many controllers, the so-called workers in the paper, are needed to guide the evolution process (e.g., selection, mutation); a toy sketch follows this list.
  2. The controllers work in a distributed way.
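A toy, single-process sketch of the tournament-style loop the workers run in parallel: sample two individuals, discard the worse, and replace it with a mutated copy of the better. Here an "architecture" is just a list of layer widths and fitness is a stand-in function; in the paper each fitness evaluation is a full training run:

```python
import copy
import random

def fitness(arch):
    # Stand-in for "train the network and return validation accuracy".
    # This toy score prefers widths summing close to 256 and few layers.
    return -abs(sum(arch) - 256) - 2 * len(arch)

def mutate(arch):
    # Simple structural mutations, loosely inspired by the paper's operators.
    child = copy.deepcopy(arch)
    op = random.choice(["widen", "narrow", "add_layer", "remove_layer"])
    if op == "widen":
        i = random.randrange(len(child))
        child[i] *= 2
    elif op == "narrow":
        i = random.randrange(len(child))
        child[i] = max(1, child[i] // 2)
    elif op == "add_layer":
        child.insert(random.randrange(len(child) + 1), 16)
    elif op == "remove_layer" and len(child) > 1:
        child.pop(random.randrange(len(child)))
    return child

random.seed(0)
population = [[16] for _ in range(50)]  # trivial initial conditions
for _ in range(2000):
    a, b = random.sample(range(len(population)), 2)   # tournament of two
    worse, better = sorted((a, b), key=lambda i: fitness(population[i]))
    population[worse] = mutate(population[better])    # loser -> mutated child

print("best architecture (layer widths):", max(population, key=fitness))
```

In the distributed setting, many such workers operate on a shared population concurrently, which is why the method needs large-scale infrastructure.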

Benefits: The proposed method is simple, and it is able to generate a fully trained network that requires no post-processing.

Concerns:

The paper is interesting in that it helps to discover a fully automated DNN architecture for solving complex tasks without human participation. Although the authors claim that their method is scalable, only companies owning large-scale compute platforms can employ it, and as long as no more economical implementation exists, it is hard to see it as a broadly accessible solution. However, it is a good starting point for automating the architectural design of DNNs, alongside other approaches such as reinforcement learning.