Reference Manual

The library is intended to be both fast and clear for development and research. At the same time, it does everything the "right way". For example:

- Latin hypercube sampling is used for the preliminary design step,
- extensive use of Cholesky decomposition and related techniques to improve numerical stability and reduce computational cost,
- kernels, criteria and parametric functions can be combined to produce more advanced functions,
- etc.
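As an illustration of the first point, the stratified structure behind Latin hypercube sampling can be sketched in a few lines of Python. This is a toy illustration of the idea, not the library's implementation:

```python
import random

def latin_hypercube(n_samples, n_dims, seed=0):
    """Draw n_samples points in [0, 1)^n_dims with exactly one point per
    stratum along every axis (each axis is split into n_samples slices)."""
    rng = random.Random(seed)
    # Independently shuffle the stratum indices for each dimension.
    columns = []
    for _ in range(n_dims):
        strata = list(range(n_samples))
        rng.shuffle(strata)
        columns.append(strata)
    # Jitter each point uniformly inside its assigned stratum.
    return [[(columns[d][i] + rng.random()) / n_samples
             for d in range(n_dims)]
            for i in range(n_samples)]
```

Compared with uniform sampling, this guarantees that every slice of every axis receives exactly one sample, which gives better coverage for small initial designs.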

The library includes several test programs that can be found in the *bin* folder. Furthermore, there are additional test programs under the *python* and *matlab* folders. They provide examples of the different interfaces that BayesOpt provides.

First of all, make sure your system finds the correct libraries. On Windows, you can copy the DLLs to your working folder or add the lib folder to the path.

On some *nix systems, including Ubuntu, Debian and Mac OS, the library is installed by default in /usr/local/lib/. However, this folder is not included by default in the linker, Python or Matlab paths. This is especially critical when building shared libraries (mandatory for Python usage). The script *exportlocalpaths.sh* makes sure that the folder with the libraries is included in all the necessary paths.

After that, there are three steps to follow:

- Define the function to optimize.
- Set or modify the parameters of the optimization process. The set of parameters and the default set can be found in parameters.h.
- Run the optimizer.

BayesOpt relies on a complex and highly configurable mathematical model. In theory, it should work reasonably well for many problems in its default configuration. However, Bayesian optimization shines when we can include as much knowledge as possible about the target function or about the problem; if that knowledge is not available, the model should be kept as general as possible (to avoid bias). For this part, some familiarity with Gaussian processes or nonparametric models in general may be useful.

For example, with the parameters we can select the kind of kernel, mean function or surrogate model that we want to use. With the kernel we can control the smoothness of the function and its derivatives. The mean function can be used to model the overall trend (flat, linear, etc.). If we know the overall signal variance, we should use a Gaussian process; if we do not, we should use a Student's t process instead.

For that reason, the parameters are bundled in a structure (C/C++/Matlab/Octave) or dictionary (Python), depending on the API that we use. This is a brief explanation of every parameter:

- **n_iterations:** Maximum number of iterations of BayesOpt. Each iteration corresponds to a target function evaluation. This is related to the budget of the application. [Default 300]
- **n_inner_iterations:** Maximum number of iterations of the inner optimization process. Each iteration corresponds to a criterion evaluation. The inner optimization selects the "most interesting" point at which to evaluate the target function. In order to scale the process to higher-dimensional spaces, the actual number of iterations is this number times the number of dimensions. [Default 500]
- **n_init_samples:** BayesOpt requires an initial set of samples to learn a preliminary model of the target function. Each sample is also a target function evaluation. [Default 30]
- **n_iter_relearn:** Although most of the parameters of the model are updated after every iteration, the kernel parameters cannot be updated continuously, as this has a very large computational overhead and might introduce bias in the result. This is the number of iterations between recomputations of the kernel parameters. If it is set to 0, they are only learned after the initial set of samples. [Default 0]

- **init_method:** (unsigned integer value) For continuous optimization, we can choose among different strategies for the initial design: 1 - Latin Hypercube Sampling (LHS), 2 - Sobol sequences (if available, see Minimal installation (fast compilation)), other - uniform sampling. [Default 1, LHS]
- **use_random_seed:** 0 - fixed seed, 1 - time-based (variable) seed. For debugging purposes, it might be useful to freeze the random seed. [Default 1, variable seed]

- **verbose_level:** (unsigned integer value) Verbosity: 0, 3 -> warnings; 1, 4 -> general information; 2, 5 -> debug information; any other value -> only errors. Levels 0, 1, 2 send messages to stdout; levels 3, 4, 5 send messages to a log file. [Default 1, general info -> stdout]
- **log_filename:** Name of the log file (if applicable, verbose_level = 3, 4 or 5). [Default "bayesopt.log"]

- **surr_name:** Name of the hierarchical surrogate function (nonparametric process and the hyperpriors on sigma and w). See Section Surrogate models for a detailed description. [Default "sGaussianProcess"]
- **sigma_s:** Signal variance (if known). [Default 1.0]
- **noise:** Observation noise/signal ratio. For computer simulations or deterministic functions, it should be close to 0. However, to avoid numerical instability due to model inaccuracies, it should always be strictly greater than 0. [Default 0.0001]
- **alpha**, **beta:** Inverse-Gamma prior hyperparameters (if applicable). [Default 1.0, 1.0]

Although BayesOpt tries to build a full analytic Bayesian model for the surrogate function, some hyperparameters cannot be estimated in closed form. Currently, the only parameters of BayesOpt models that require special treatment are the kernel hyperparameters. See Section Methods for learning the kernel parameters for a detailed description.

- **l_type:** Learning method for the kernel hyperparameters. Currently, only L_FIXED and L_EMPIRICAL are implemented. [Default L_EMPIRICAL]
- **sc_type:** Score function for the learning method. [Default SC_MAP]

- **epsilon:** According to some authors, it is advisable to include an epsilon-greedy strategy to achieve near-optimal convergence rates. Epsilon is the probability of performing a random (blind) evaluation of the target function. Higher values force more exploration, while lower values rely more on the exploration/exploitation policy of the criterion. [Default 0.0 (disabled)]
- **crit_name:** Name of the sample selection criterion, or a combination of them. It is used to select which point to evaluate at each iteration of the optimization process. It can be a combination of functions, like "cHedge(cEI,cLCB,cPOI,cThompsonSampling)". See section critmod for the different possibilities. [Default "cEI"]
- **crit_params**, **n_crit_params:** Array with the set of parameters for the selected criteria. If there is more than one criterion, the parameters are split among them according to the number of parameters required by each one. If n_crit_params is 0, the default parameters are used for each criterion. [Default n_crit_params = 0]
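The epsilon-greedy idea behind the **epsilon** parameter is simple to state. A minimal Python sketch follows; the candidate set and criterion function are placeholders for illustration, not part of the BayesOpt API:

```python
import random

def choose_next(candidates, criterion, epsilon, rng=None):
    """With probability epsilon pick a random candidate (blind exploration);
    otherwise pick the candidate that maximizes the acquisition criterion."""
    rng = rng or random.Random()
    if rng.random() < epsilon:
        return rng.choice(candidates)
    return max(candidates, key=criterion)
```

With epsilon = 0 the choice is always driven by the criterion; with epsilon = 1 every evaluation is blind, regardless of the model.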

- **kernel.name:** Name of the kernel function. It can be a combination of functions, like "kSum(kSEARD,kMaternARD3)". See section kermod for the different possibilities. [Default "kMaternISO3"]
- **kernel.hp_mean**, **kernel.hp_std**, **kernel.n_hp:** Prior on the kernel hyperparameters in log space; that is, if the hyperparameters are θ, the prior is a normal distribution on log(θ) with the given mean and standard deviation. Any "illegal" standard deviation (std <= 0) results in a maximum likelihood estimate. The number of hyperparameters depends on the kernel selected. If there is more than one kernel, the parameters are split among them according to the number of parameters required by each one. [Default "1.0, 10.0, 1"]

- **mean.name:** Name of the mean function. It can be a combination of functions, like "mSum(mOne,mLinear)". See Section parmod for the different possibilities. [Default "mOne"]
- **mean.coef_mean**, **mean.coef_std**, **mean.n_coef:** Mean function coefficients. The standard deviation is only used if the surrogate model assumes a normal prior. If there is more than one mean function, the parameters are split among them according to the number of parameters required by each one. [Default "1.0, 30.0, 1"]
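As an illustration, in the Python interface (where, as noted above, the parameters form a dictionary) a configuration touching several of the fields above could look like the following. The key names here mirror the field names in the list, but they are illustrative; verify the exact keys against your installed version:

```python
# Illustrative parameter dictionary; key names mirror the fields listed
# above, but check them against your BayesOpt version before use.
params = {
    'n_iterations':   200,       # budget: number of target evaluations
    'n_init_samples': 30,        # size of the initial design
    'n_iter_relearn': 20,        # relearn kernel hyperparameters every 20 iterations
    'noise':          1e-4,      # keep strictly positive for numerical stability
    'surr_name':      "sGaussianProcess",
    'crit_name':      "cEI",     # expected improvement
    'verbose_level':  1,         # general information to stdout
}
```

Any field left out of the dictionary falls back to its default value, so in practice only the fields you want to change need to appear.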

Here we show a brief summary of the different ways to use the library:

This interface is the most standard approach. Since C is widely compatible with other languages, this interface can also be used from languages such as Fortran, Ada, etc.

The function to optimize must agree with the template provided in bayesopt.h:

double my_function (unsigned int n, const double *x, double *gradient, void *func_data);

Note that the gradient has been included for future compatibility, although in the current implementation, it is not used. You can just ignore it or send a NULL pointer.

The parameters are defined in the bopt_params struct. The easiest way to set the parameters is to use

bopt_params initialize_parameters_to_default(void);

and then modify the necessary fields.

Once we have set the parameters and the function, we can call the optimizer:

int bayes_optimization(int nDim, /* number of dimensions */
        eval_func f, /* function to optimize */
        void* f_data, /* extra data that is transferred directly to f */
        const double *lb, const double *ub, /* bounds */
        double *x, /* out: minimizer */
        double *minf, /* out: minimum */
        bopt_params parameters);

This is the most straightforward and complete way to use the library. The object to be optimized must inherit from the BayesOptContinuous or BayesOptDiscrete classes, which can be found in bayesoptcont.hpp and bayesoptdisc.hpp respectively.

Then, we just need to override one of the virtual functions, called evaluateSample, which can be called with C arrays or uBlas vectors. Since there are no pure virtual functions, you can just redefine your preferred interface.

**Experimental:** You can also override the checkReachability function to include nonlinear restrictions.

class MyOptimization: public BayesOptContinuous
{
 public:
  MyOptimization(bopt_params param):
    BayesOptContinuous(param) {}

  double evaluateSample( const boost::numeric::ublas::vector<double> &query )
  {
    // My function here
  }

  bool checkReachability( const boost::numeric::ublas::vector<double> &query )
  {
    // My restrictions here
  }
};

As with the callback interface, the parameters are defined in the bopt_params struct. The easiest way to set the parameters is to use

bopt_params initialize_parameters_to_default(void);

and then modify the necessary fields.

The file python/demo_quad.py provides examples of the two Python interfaces.

**Parameters:** For both interfaces, the parameters are defined as a Python dictionary with the same structure as the bopt_params struct in the C/C++ interface. The enumerated values are replaced by strings without the prefix. For example, the C_EI criterion is replaced by the string "EI" and the M_ZERO mean function is replaced by the string "ZERO".

The parameter dictionary can be initialized using the provided initialization function; however, this is not necessary in general. If any parameter is not included in the dictionary, the default value is used instead.

**Callback:** The callback interface is just a wrapper of the C interface. In this case, the callback function takes a single argument, *query* (a numpy array), and returns a double scalar.
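For example, a callback of this shape for a simple quadratic objective might look as follows. The objective itself is made up for illustration; any sequence of floats works in place of the numpy array here:

```python
def my_function(query):
    """Toy target: squared distance to the point (0.5, 0.5, ..., 0.5).
    `query` is the candidate point (a numpy array in the real interface);
    the return value must be a plain float."""
    return float(sum((x - 0.5) ** 2 for x in query))
```

BayesOpt minimizes this function, so the optimizer should steer the queries toward the center of the box.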

The optimization process returns a tuple with the minimum as a numpy array (x_out), the value of the function at the minimum (y_out) and the error code.

**Inheritance:** The object-oriented construction is similar to the C++ interface. The BayesOptModule class includes attributes for the parameters (*params*), the number of dimensions (*n*) and the bounds (*lb* and *ub*).

The optimization process then returns a tuple with the minimum as a numpy array (x_out), the value of the function at the minimum (y_out) and the error code.

The file matlab/runtest.m provides an example of the Matlab/Octave interface.

**Parameters:** The parameters are defined as a Matlab struct equivalent to the bopt_params struct in the C/C++ interface, except for the *theta* and *mu* arrays, which are replaced by Matlab vectors. Thus, the number-of-elements fields (*n_theta* and *n_mu*) are not needed. The enumerated values are replaced by strings without the prefix. For example, the C_EI criterion is replaced by the string "EI" and the M_ZERO mean function is replaced by the string "ZERO".

If any parameter is not included in the Matlab struct, the default value is automatically used instead.

**Callback:** The callback interface is just a wrapper of the C interface. In this case, the callback function should have the form

function y = my_function(query)

where *query* is a Matlab vector and the function returns a scalar.

The optimization process can be called (both in Matlab and Octave) as

[x_out, y_out] = bayesopt('my_function', n_dimensions, parameters, lower_bound, upper_bound)

where the result is the minimum as a vector (x_out) and the value of the function at the minimum (y_out).

In Matlab, but not in Octave, the optimization can also be called with function handles:

[x_out, y_out] = bayesopt(@my_function, n_dimensions, parameters, lower_bound, upper_bound)

Generated on Tue Mar 25 2014 23:31:59 for BayesOpt by 1.8.4