Bayesian Optimization

From Cornell University Computational Optimization Open Textbook - Optimization Wiki
Revision as of 23:03, 27 November 2021 by Deepakorani (talk | contribs)
Jump to navigation Jump to search

Author : By Deepa Korani (dmk333@cornell.edu)

Steward : Fenqgi You

Introduction

Bayesian Optimization is a sequential model-based approach to solving problems. In particular, it prescribes a prior belief over the possible objective functions, and then sequentially refine the model as data are observed via Bayesian posterior updating. [1]

Bayesian Optimization is useful in machine learning. Since Machine Learning consists of black box optimization problem where the objective function is a black box function[2], and the analytical expression for the function is unknown. Bayesian optimization can be useful here. They attempt to find the global optimum in a minimum number of steps.

Bayesian Optimization has shown tremendous solutions for a wide variety of design problems. Certain application include; robotics, environmental monitoring, combinatorial optimization, adaptive Monte Carlo, reinforcement learning. [3]

Theory, Methodology and Algorithm

Methodology

Bayesian Optimization Algorithm has two main components:

Probabilistic Model:

The other word for probabilistic model is called as the surrogate function or the posterior distribution. The posterior captures the updated belief about the unknown objective function. A better interpretation of this is to estimate the objective the function with a surrogate function. Common Surrogate function used to model is the Gaussian Process (GP)

Acquisition Function:

The role of the acquisition function is to guide the search for the optimum. Typically, acquisition functions are defined such that high acquisition corresponds to potentially high values of the objective function, whether because the prediction is high, the uncertainty is great, or both. Maximizing the acquisition function is used to select the next point at which to evaluate the function. That is, we wish to sample ƒ at argmaxxu(x|D), where u(·) is the generic symbol for an acquisition function.

A good acquisition function should trade off exploration and exploitation. A quick section of common acquisition functions are defined below:

Error creating thumbnail: File with dimensions greater than 12.5 MP
Four different Acquisition functions [4]
Upper Bound Confidence:

This acquisition function is defined as:

UCB[x*] = ℳ[x*] + β1/2σ[x*]

Probability of Improvement:

This acquisition function computes the likelihood that the function at x∗ will return a result higher than the current maximum f[^x]. For each point x∗, we integrate the part of the associated normal distribution that is above the current maximum (figure 4b) so that:

PI[x*] = Norm[ℳ[x*] ,σ[x*]]df[x*]





Algorithm [5]

Here follows a longer example of mathematical-style pseudocode, for the Bayesian Optimization:

1: Algorithm' Bayesian Optimization ' is
2:    for t == 1,2,.....do  
3:         Find xt by optimizing the the acquistion function over the GP: xt = argmaxxu(x|D1:t-1)
4:         Sample the objective function yt = f(xt) + εt
5:         Augment the data D1:t = {D1:t-1,(xt,yt)} and update the GP
6:    end for

Applications

Bayesian Optimization has shown a wide variety of interest in areas of data science and Chemical Engineering, Material Science domain. Some of the applications are stated below:

  • A/B Testing - Though the idea of A/B testing dates back to the early days of advertising in the form of so-called focus groups, the advent of the internet and smartphones has given web and app developers a new forum for implementing these tests at unprecedented scales. By redirecting small fractions of user traffic to experimental designs of an ad, app, game, or website, the developers can utilize noisy feedback to optimize any observable metric with respect to the product’s configuration. In fact, depending on the particular phase of a product’s life, new subscriptions may be more valuable than revenue or user retention, or vice versa; the click-through rate might be the relevant objective to optimize for an ad, whereas for a game it may be some measure of user engagement. The crucial problem is how to optimally query these subsets of users in order to find the best product with high probability within a predetermined query budget, or how to redirect traffic sequentially in order to optimize a cumulative metric while incurring the least opportunity cost.
  • Recommender Systems - Generally Recommender systems are used by online content providers to recommend products to their subscribers. Recommending the right product to subscribers is important in order to improvise the revenue in case for e-commerce sits, or consumption sites.
  • Natural Language Processing and Text- Bayesian Optimization has been applied to improve text extraction. [6]
  • Environmental Monitoring and Sensor Networks - Sensor Networks allow to monitor important environment conditions; such as temperature, concentration of pollutants in the atmosphere, soil and ocean. These networks make noise measurements that are interpolated to produce a global model of interest. In some cases, these sensors are expensive to activate but one can answer important questions like what is the hottest or coldest spot in a building by activating a relatively small number of sensors. Bayesian optimization was used for this task and the similar one of finding the location of greatest highway traffic congestion [7]
  • Materials - Bayesian Optimization is applied to design experiment of previous experiments to suggest features of the new materials whose performance is evaluated by simulation. Final goal being to improvise the design while reducing the number of experimentation. Bayesian Optimization are used to predict compounds with low thermal conductivity [8], optimal melting temperature [9], elastic properties[10]

Example

Here using the Hyperopt package; a simple optimization problem will be solved: The example is broken down into four steps:

Objective Function

The objective function for this example is a simple function that is defined that returns the value that needs to be minimized.

https://pyro.ai/examples/bo.html



Bayesian Optimization Packages

Here are a list of packages in Python, Java, C++that utelize Bayesian Optimization

Table - Packages Utilizing Bayesian Optimization [11]
Package License URL Language Model
SMAC Academic http://www.cs.ubc.ca/labs/beta/Projects/SMAC Java Random Forest
Hyperopt BSD https://github.com/hyperopt/hyperopt Python Tree Parzen Estimator
Spearmint Academic https://github.com/HIPS/Spearmint Python Gaussian Process
Bayesopt GPL https://github.com/rmcantin/bayesopt C++ Gaussian Process
PyBO BSD https://github.com/mwhoffman/pybo Python Gaussian Process
MOE Apache 2.0 https://github.com/Yelp/MOE Python/ C++ Gaussian Process

Conclusion

References