# Data Transformations

## Transforms

class gpflowopt.transforms.LinearTransform(A, b)

A simple linear transform of the form

$\mathbf Y = (\mathbf A \mathbf X^{T})^{T} + \mathbf b \otimes \mathbf 1_{N}^{T}$
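
As an illustration of the formula above, the following NumPy sketch applies the transform to an N x D array, reading the last term as adding b to each of the N rows. The function names `linear_forward` and `linear_backward` are hypothetical and not part of gpflowopt. The inverse uses a linear solve rather than an explicit matrix inverse, in the spirit of the `backward` method documented for this class, but it is not the library's implementation.

```python
import numpy as np


def linear_forward(X, A, b):
    """Compute Y = (A X^T)^T + b, with b added to every row of the result."""
    return (A @ X.T).T + b


def linear_backward(Y, A, b):
    """Invert the transform by solving A x = (y - b) for every row y of Y.

    np.linalg.solve avoids forming the inverse of A explicitly.
    """
    return np.linalg.solve(A, (Y - b).T).T


# Hypothetical example values: a 2-D transform applied to three points.
A = np.array([[2.0, 0.0],
              [1.0, 3.0]])
b = np.array([1.0, -1.0])
X = np.array([[0.0, 0.0],
              [1.0, 2.0],
              [-1.0, 0.5]])

Y = linear_forward(X, A, b)
X_back = linear_backward(Y, A, b)  # recovers X up to floating-point error
```

The round trip only works when A is square and non-singular; a singular A makes `np.linalg.solve` raise `LinAlgError`.
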
Attributes:
data_holders

Return a list of all the child DataHolders

fixed

A boolean attribute to determine if all the child parameters of this node are fixed

highest_parent

A reference to the top of the tree, usually a Model instance

long_name

This is a unique identifier for a param object within a structure, made by concatenating the names through the tree.

name

An automatically generated name, given by the reference of the _parent to this instance.

sorted_params

Return a list of all the child parameters, sorted by id.

Methods

| Method | Description |
| --- | --- |
| assign(other) | Assign the parameters of another LinearTransform. |
| backward(Y) | Overwrites the default backward approach, to avoid an explicit matrix inversion. |
| build_backward(Y) | TensorFlow implementation of the inverse mapping |
| build_backward_variance(Yvar) | Additional method for scaling variance backward (used in Normalizer). |
| build_forward(X) | TensorFlow graph for the transformation of U -> V |
| build_prior() | Build a tf expression for the prior by summing all child-parameter priors. |
| forward(X) | Performs the transformation of U -> V |
| get_feed_dict_keys() | Recursively generate a dictionary of {object: _tf_array} pairs that can be used in update_feed_dict |
| get_free_state() | Recurse get_free_state on all child parameters, and hstack them. |
| get_param_index(param_to_index) | Given a parameter, compute the position of that parameter on the free-state vector. |
| get_samples_df(samples) | Given a numpy array where each row is a valid free-state vector, return a pandas.DataFrame which contains the parameter name and associated samples in the correct form (e.g. |
| make_tf_array(X) | Distribute a flat tensorflow array amongst all the child parameter of this instance. |
| randomize([distributions, skipfixed]) | Calls randomize on all parameters in model hierarchy. |
| set_state(x) | Set the values of all the parameters by recursion |
| tf_mode() | A context for building models. |
| get_parameter_dict | |
| set_parameter_dict | |
| update_feed_dict | |

## Normalizer

class gpflowopt.scaling.DataScaler(model, domain=None, normalize_Y=False)

Model-wrapping class, primarily intended to ensure that the data in GPflow models is scaled.

One DataScaler wraps one GPflow model and can scale both the input and the output data. Any attribute that is not found on the DataScaler object itself is looked up on the wrapped model.

Input and output scaling are both supported, but they are set up differently:

• For input, the transform is not generated automatically. By default it is the identity transform. It can be set through the input_transform setter property or by specifying a domain in the constructor; in the latter case, the input transform is initialized as the transform from the specified domain to the unit cube. Updating X does not change the transform.
• For output, if normalization is enabled, the data is always scaled to zero mean and unit variance. Whenever the Y property is set, the output transform is recalculated first and the data is then scaled.
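
The two scalings in this list can be sketched in plain NumPy. This is only an illustration under stated assumptions: the helper names `unit_cube_transform` and `standardize` are hypothetical, the domain is taken to be an axis-aligned box given by lower and upper bounds, and the population standard deviation is used. The DataScaler internals may differ.

```python
import numpy as np


def unit_cube_transform(lower, upper):
    """Return (A, b) such that x -> A x + b maps the box [lower, upper] onto [0, 1]^D."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    scale = 1.0 / (upper - lower)
    return np.diag(scale), -lower * scale


def standardize(Y):
    """Scale each output column to zero mean and unit variance.

    Returns the scaled data together with the mean and standard deviation
    needed to undo the scaling.
    """
    mean = Y.mean(axis=0)
    std = Y.std(axis=0)
    return (Y - mean) / std, mean, std


# Input scaling: fixed by the domain, independent of the data in X.
A, b = unit_cube_transform([-5.0, 0.0], [5.0, 10.0])
X = np.array([[-5.0, 0.0], [0.0, 5.0], [5.0, 10.0]])
X_scaled = X @ A.T + b

# Output scaling: recomputed from the data every time Y changes.
Y = np.array([[1.0], [2.0], [4.0]])
Y_scaled, y_mean, y_std = standardize(Y)
Y_restored = Y_scaled * y_std + y_mean
```

The asymmetry mirrors the description: the input transform depends only on the domain, while the output statistics have to be recomputed whenever new observations arrive.
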

By default, Acquisition objects wrap every model they receive. However, the input and output transforms are then the identity transforms, and output normalization is switched off. It is up to the user (or specialized classes such as the BayesianOptimizer) to configure the DataScalers involved correctly.

Because the scaling is carried out at such a deep level in the framework, it stays hidden throughout the rest of GPflowOpt. When implementing acquisition functions, it is therefore safe to assume that the data is not scaled and lies within the configured optimization domain. There is one exception: the hyperparameters are determined on the scaled data and are NOT automatically unscaled by this class, because the DataScaler does not know which model is wrapped or which kernels are used. If the hyperparameters of the model are required, the implementation is responsible for rescaling them. Likewise, any hyperpriors should be specified with the scaled data in mind.
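
To make the hyperparameter caveat concrete, here is a minimal sketch of how such a rescaling could look in one simple case. It assumes a stationary kernel with per-dimension lengthscales and a signal variance, inputs scaled from a box domain to the unit cube, and outputs divided by a standard deviation `y_std`. The function `unscale_hyperparameters` and all of its arguments are hypothetical; nothing like it is provided by DataScaler, and other kernels need a different treatment.

```python
import numpy as np


def unscale_hyperparameters(lengthscales_scaled, variance_scaled, lower, upper, y_std):
    """Map hyperparameters fitted on scaled data back to the original units.

    Assumes x_scaled = (x - lower) / (upper - lower) and
    y_scaled = (y - y_mean) / y_std, so distances in x shrink by the domain
    width and variances in y shrink by y_std ** 2.
    """
    width = np.asarray(upper, dtype=float) - np.asarray(lower, dtype=float)
    lengthscales = np.asarray(lengthscales_scaled, dtype=float) * width
    variance = variance_scaled * y_std ** 2
    return lengthscales, variance


# Hypothetical values: lengthscales and variance learned on the scaled data.
lengthscales, variance = unscale_hyperparameters(
    lengthscales_scaled=[0.2, 0.5],
    variance_scaled=1.5,
    lower=[-5.0, 0.0],
    upper=[5.0, 10.0],
    y_std=2.0,
)
```

The same reasoning applies in reverse to hyperpriors: a prior expressed in original units has to be converted to the scaled units before it is attached to the wrapped model.
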

Attributes:
X

Returns the input data of the model, unscaled.

Y

Returns the output data of the wrapped model, unscaled.

data_holders

Return a list of all the child DataHolders

fixed

A boolean attribute to determine if all the child parameters of this node are fixed

highest_parent

Returns an instance of the ParentHook instead of the usual reference to a Parentable.

input_transform

Get the current input transform

long_name

This is a unique identifier for a param object within a structure, made by concatenating the names through the tree.

name

An automatically generated name, given by the reference of the _parent to this instance.

normalize_output

Boolean indicating whether the output is automatically scaled to zero mean and unit variance.

output_transform

Get the current output transform

sorted_params

Return a list of all the child parameters, sorted by id.

Methods

| Method | Description |
| --- | --- |
| build_predict(Xnew[, full_cov]) | build_predict builds the TensorFlow graph for prediction. |
| build_prior() | Build a tf expression for the prior by summing all child-parameter priors. |
| get_feed_dict_keys() | Recursively generate a dictionary of {object: _tf_array} pairs that can be used in update_feed_dict |
| get_free_state() | Recurse get_free_state on all child parameters, and hstack them. |
| get_param_index(param_to_index) | Given a parameter, compute the position of that parameter on the free-state vector. |
| get_samples_df(samples) | Given a numpy array where each row is a valid free-state vector, return a pandas.DataFrame which contains the parameter name and associated samples in the correct form (e.g. |
| make_tf_array(X) | Distribute a flat tensorflow array amongst all the child parameter of this instance. |
| predict_density(Xnew, Ynew) | Compute the (log) density of the data Ynew at the points Xnew |
| predict_f(Xnew) | Compute the mean and variance of held-out data at the points Xnew |
| predict_f_full_cov(Xnew) | Compute the mean and variance of held-out data at the points Xnew |
| predict_y(Xnew) | Compute the mean and variance of held-out data at the points Xnew |
| randomize([distributions, skipfixed]) | Calls randomize on all parameters in model hierarchy. |
| set_state(x) | Set the values of all the parameters by recursion |
| tf_mode() | A context for building models. |
| get_parameter_dict | |
| set_parameter_dict | |
| update_feed_dict | |