# How to perform optimization

By default, all optimizers assume a centered and reduced prior at the beginning of the optimization (i.e. zero mean and unit standard deviation). They are however able to find solutions far from this initial prior.

## Basic example

Minimizing a function using an optimizer (here `NGOpt`, our adaptive optimization algorithm) can be easily run with:

```python
import nevergrad as ng

def square(x, y=12):
    return sum((x - 0.5) ** 2) + abs(y)

# optimization on x as an array of shape (2,)
optimizer = ng.optimizers.NGOpt(parametrization=2, budget=100)
recommendation = optimizer.minimize(square)  # best value
print(recommendation.value)
# >>> [0.49971112 0.5002944 ]
```

`parametrization=n` is a shortcut to state that the function has only one variable, continuous, of dimension `n`: `ng.p.Array(shape=(n,))`.

Important: Make sure to check the Parametrization section for more complex parametrization examples, and the Parametrization API section for the full list of options. Below are a few more advanced cases.

Defining the parametrization (`instrum`) as follows in the code sample will instead optimize on both `x` (continuous, dimension 2, bounded between -12 and 12) and `y` (continuous, dimension 1).

```python
instrum = ng.p.Instrumentation(
    ng.p.Array(shape=(2,)).set_bounds(lower=-12, upper=12),
    y=ng.p.Scalar()
)
optimizer = ng.optimizers.NGOpt(parametrization=instrum, budget=100)
recommendation = optimizer.minimize(square)
print(recommendation.value)
# >>> ((array([0.52213095, 0.45030925]),), {'y': -0.0003603100877068604})
```

We can work in the discrete case as well, e.g. with the one-max function applied on `{0,1,2,3,4,5,6}^10`:

```python
import nevergrad as ng

def onemax(x):
    return len(x) - x.count(1)

# Discrete, ordered
param = ng.p.TransitionChoice(range(7), repetitions=10)
optimizer = ng.optimizers.DiscreteOnePlusOne(parametrization=param, budget=100, num_workers=1)

for _ in range(optimizer.budget):
    x = optimizer.ask()  # candidate to evaluate
    # loss = onemax(*x.args, **x.kwargs)  # equivalent to x.value if not using Instrumentation
    loss = onemax(x.value)
    optimizer.tell(x, loss)

recommendation = optimizer.provide_recommendation()
print(recommendation.value)
# >>> (1, 1, 0, 1, 1, 4, 1, 1, 1, 1)
```

## Using several workers

Running the function evaluation in parallel with several workers is as easy as providing an executor:

```python
from concurrent import futures

optimizer = ng.optimizers.NGOpt(parametrization=instrum, budget=10, num_workers=2)
# ProcessPoolExecutor suits CPU-bound functions;
# a ThreadPoolExecutor is enough when the function is I/O bound
with futures.ProcessPoolExecutor(max_workers=optimizer.num_workers) as executor:
    recommendation = optimizer.minimize(square, executor=executor, batch_mode=False)
```

With `batch_mode=True`, the optimizer is asked for `num_workers` points to evaluate, the evaluations are run, the optimizer is updated with the `num_workers` function outputs, and this repeats until the budget is spent. `batch_mode=False` (steady state mode) instead asks for a new evaluation whenever a worker is ready. If no executor is provided, the evaluations are sequential: `num_workers > 1` with no executor is therefore suboptimal but nonetheless useful for evaluation purposes (i.e. it simulates parallelism without any actual parallelism).
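The difference between the two scheduling modes can be sketched in plain Python. The code below is an illustration only, not Nevergrad's implementation: `ToyRandomSearch`, `minimize_batch` and `minimize_steady_state` are hypothetical names, and the optimizer is a trivial random search standing in for a real one.

```python
import random
from concurrent import futures


class ToyRandomSearch:
    """Toy stand-in exposing ask/tell/provide_recommendation (not Nevergrad code)."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self.best = None  # (loss, point) of the best told point so far
        self.num_tell = 0

    def ask(self):
        return [self._rng.uniform(-1.0, 1.0) for _ in range(2)]

    def tell(self, point, loss):
        self.num_tell += 1
        if self.best is None or loss < self.best[0]:
            self.best = (loss, point)

    def provide_recommendation(self):
        return self.best[1]


def square(x):
    return sum((xi - 0.5) ** 2 for xi in x)


def minimize_batch(opt, func, budget, num_workers, executor):
    """Batch mode: ask num_workers points, wait for all of them, tell, repeat."""
    remaining = budget
    while remaining > 0:
        points = [opt.ask() for _ in range(min(num_workers, remaining))]
        losses = list(executor.map(func, points))  # waits for the whole batch
        for point, loss in zip(points, losses):
            opt.tell(point, loss)
        remaining -= len(points)
    return opt.provide_recommendation()


def minimize_steady_state(opt, func, budget, num_workers, executor):
    """Steady state: as soon as one evaluation finishes, tell it and ask again."""
    running = {}  # maps each pending future to the point it evaluates
    asked = 0
    while asked < budget or running:
        while asked < budget and len(running) < num_workers:
            point = opt.ask()
            running[executor.submit(func, point)] = point
            asked += 1
        done, _ = futures.wait(list(running), return_when=futures.FIRST_COMPLETED)
        for job in done:
            opt.tell(running.pop(job), job.result())
    return opt.provide_recommendation()


with futures.ThreadPoolExecutor(max_workers=2) as executor:
    batch_opt, steady_opt = ToyRandomSearch(seed=1), ToyRandomSearch(seed=1)
    print(minimize_batch(batch_opt, square, 10, 2, executor))
    print(minimize_steady_state(steady_opt, square, 10, 2, executor))
```

In batch mode a slow evaluation stalls the whole batch, whereas in steady state the other worker keeps receiving new points, which is why steady state tends to make better use of workers when evaluation times vary.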

## Ask and tell interface

An ask and tell interface is also available. The 3 key methods for this interface are:

• `ask`: suggests a candidate on which to evaluate the function to optimize.

• `tell`: updates the optimizer with the value of the function for a candidate.

• `provide_recommendation`: returns the candidate the algorithm considers the best.

For most optimization algorithms in the platform, these methods can be called in arbitrary order: asynchronous optimization is OK. Some algorithms (with class attribute `no_parallelization=True`), however, do not support this.

The `Parameter` class holds an attribute `value` which contains the actual value to evaluate through the function.

Here is a simple example in the sequential case (this is what happens in the `minimize` method for `num_workers=1`):

```python
import nevergrad as ng

def square(x, y=12):
    return sum((x - 0.5) ** 2) + abs(y)

instrum = ng.p.Instrumentation(ng.p.Array(shape=(2,)), y=ng.p.Scalar())  # We are working on R^2 x R.
optimizer = ng.optimizers.NGOpt(parametrization=instrum, budget=100, num_workers=1)

for _ in range(optimizer.budget):
    x = optimizer.ask()  # candidate to evaluate
    loss = square(*x.args, **x.kwargs)
    optimizer.tell(x, loss)

recommendation = optimizer.provide_recommendation()
print(recommendation.value)
```

Please make sure that your function returns a float, and that you indeed want to perform minimization and not maximization ;)

## Choosing an optimizer

`ng.optimizers.registry` is a `dict` of all optimizers, so `ng.optimizers.NGOpt` is equivalent to `ng.optimizers.registry["NGOpt"]`. You can print the full list of optimizers with:

```python
import nevergrad as ng

print(sorted(ng.optimizers.registry.keys()))
```

All algorithms have strengths and weaknesses. Questionable rules of thumb could be:

• `NGOpt` is a “meta”-optimizer which adapts to the provided settings (budget, number of workers, parametrization) and should therefore be a good default.

• `TwoPointsDE` is excellent in many cases, including very high `num_workers`.

• `PortfolioDiscreteOnePlusOne` is excellent in discrete or mixed settings when high precision on parameters is not relevant; it’s possibly a good choice for hyperparameter selection.

• `OnePlusOne` is a simple robust method for continuous parameters with `num_workers` < 8.

• `CMA` is excellent for control (e.g. neurocontrol) when the environment is not very noisy (`num_workers` ~50 ok) and when the budget is large (e.g. 1000 x the dimension).

• `TBPSA` is excellent for problems corrupted by noise, in particular overparameterized (neural) ones; very high `num_workers` ok.

• `PSO` is excellent in terms of robustness, high `num_workers` ok.

• `ScrHammersleySearchPlusMiddlePoint` is excellent for super parallel cases (fully one-shot, i.e. `num_workers` = budget included) or for very multimodal cases (such as some of our MLDA problems); don’t use softmax with this optimizer.

• `RandomSearch` is the classical random search baseline; don’t use softmax with this optimizer.

## Telling non-asked points, or suggesting points

There are two ways to inoculate information you already have about some points:

• `optimizer.suggest(*args, **kwargs)`: after suggesting a point, the next `ask` will be a point with the provided inputs. Make sure you call `optimizer.suggest` the same way (= with the same arguments) that you would call your function to optimize.

• `candidate = optimizer.parametrization.spawn_child(new_value=your_value)` which you can then use to `tell` the optimizer with the corresponding loss.

Examples:

• parametrized with an `ng.p.Instrumentation`:

```python
param = ng.p.Instrumentation(ng.p.Choice(["a", "b", "c"]), lr=ng.p.Log(lower=0.001, upper=1.0))
optim = ng.optimizers.NGOpt(parametrization=param, budget=100)
optim.suggest("c", lr=0.02)
# equivalent to:
candidate = optim.parametrization.spawn_child(new_value=(("c",), {"lr": 0.02}))
# you can then use to tell the loss
optim.tell(candidate, 2.0)
```

• parametrized with an `Array`:

```python
optim = ng.optimizers.NGOpt(parametrization=2, budget=100)
optim.suggest([12, 12])
# equivalent to:
candidate = optim.parametrization.spawn_child(new_value=[12, 12])
# you can then use to tell the loss
optim.tell(candidate, 2.0)
```

Note: some optimizers do not support such inoculation. Those will raise a `TellNotAskedNotSupportedError`.

## Adding callbacks

You can add callbacks to the `ask` and `tell` methods through the `register_callback` method. Callbacks registered on `ask` must have signature `callback(optimizer)` and callbacks registered on `tell` must have signature `callback(optimizer, candidate, value)`.

The example below shows a callback which prints `candidate` and `value` on `tell`:

```python
import nevergrad as ng

def my_function(x):
    return abs(sum(x - 1))

def print_candidate_and_value(optimizer, candidate, value):
    print(candidate, value)

optimizer = ng.optimizers.NGOpt(parametrization=2, budget=4)
optimizer.register_callback("tell", print_candidate_and_value)
optimizer.minimize(my_function)  # triggers a print at each tell within minimize
```

Two callbacks are available through `ng.callbacks`, see the callbacks module documentation.

## Optimization with constraints

Nevergrad has a mechanism for cheap constraints. “Cheap” means that we do not try to reduce the number of calls to such constraints. We basically repeat mutations until we get a satisfiable point.

Let us say that we want to minimize `(x[0]-.5)**2 + (x[1]-.5)**2` under the constraint `x[0] >= 1`.

```python
import nevergrad as ng

def square(x):
    return sum((x - 0.5) ** 2)

optimizer = ng.optimizers.NGOpt(parametrization=2, budget=100)
# define a constraint on first variable of x:
optimizer.parametrization.register_cheap_constraint(lambda x: x[0] >= 1)

recommendation = optimizer.minimize(square, verbosity=2)
print(recommendation.value)
# >>> [1.00037625, 0.50683314]
```

Note that we can provide richer information by using float-valued constraints (the constraint is satisfied when the value is >= 0):

```python
import nevergrad as ng

def square(x):
    return sum((x - 0.5) ** 2)

optimizer = ng.optimizers.NGOpt(parametrization=2, budget=100)
# define a constraint on first variable of x:
optimizer.parametrization.register_cheap_constraint(lambda x: x[0] - 1)

recommendation = optimizer.minimize(square, verbosity=2)
print(recommendation.value)
# >>> [1.00037625, 0.50683314]
```
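The “repeat until we get a satisfiable point” idea behind cheap constraints can be sketched in plain Python. This is a toy illustration under simplifying assumptions, not Nevergrad's actual mechanism: `sample_satisfying` and `constraint_ok` are hypothetical helpers, and candidates are drawn from a standard Gaussian rather than produced by an optimizer's mutations.

```python
import random


def constraint_ok(value):
    """Boolean results pass when True; float results pass when >= 0."""
    if isinstance(value, bool):
        # handled separately because `False >= 0` is True in Python
        return value
    return value >= 0


def sample_satisfying(propose, constraint, max_trials=1000):
    """Call `propose` until `constraint` accepts the point (hypothetical helper)."""
    for trial in range(1, max_trials + 1):
        point = propose()
        if constraint_ok(constraint(point)):
            return point, trial
    raise RuntimeError("no satisfiable point found")


rng = random.Random(0)

def propose():
    # stand-in for a mutation: a fresh 2-dimensional Gaussian sample
    return [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]

# boolean constraint on the first variable
point, trials = sample_satisfying(propose, lambda x: x[0] >= 1)
print(point, "found after", trials, "trials")

# float-valued version of the same constraint
point2, trials2 = sample_satisfying(propose, lambda x: x[0] - 1)
print(point2, "found after", trials2, "trials")
```

Each rejected proposal costs one constraint call and no function evaluation, which is why this approach only makes sense when the constraint is cheap to compute.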

## Optimizing machine learning hyperparameters

When optimizing hyperparameters, e.g. in machine learning, if you don’t know which variables (see Parametrization) to use:

• use `Choice` for discrete variables

• use `TwoPointsDE` with `num_workers` equal to the number of workers available to you. See the machine learning examples for more.

Or if you want something more aimed at robustly outperforming random search in highly parallel settings (one-shot):

• use `TransitionChoice` for discrete variables, taking care that the default value is in the middle.

• use `ScrHammersleySearchPlusMiddlePoint` (`PlusMiddlePoint` only if you have continuous parameters or good default values for discrete parameters).

## Example with permutation

SimpleTSP and ComplexTSP are two examples of optimization on a domain of permutations. This is relevant when you optimize a single big permutation; cases with many small permutations are covered as well.

## Example of chaining, or inoculation, or initialization of an evolutionary algorithm

Chaining consists in running several algorithms in turn, information being forwarded from the first to the second and so on. More precisely, the budget is distributed over several algorithms, and when an objective function value is computed, all algorithms are informed.

Here is how to create such optimizers:

```python
import nevergrad as ng

# the optimizers used below are available in ng.optimizers
Chaining = ng.optimizers.Chaining
LHSSearch, DE, CMA = ng.optimizers.LHSSearch, ng.optimizers.DE, ng.optimizers.CMA

# Running LHSSearch with budget num_workers and then DE:
DEwithLHS = Chaining([LHSSearch, DE], ["num_workers"])

# Running LHSSearch with budget the dimension and then DE:
DEwithLHSdim = Chaining([LHSSearch, DE], ["dimension"])

# Running LHSSearch with budget 30 and then DE:
DEwithLHS30 = Chaining([LHSSearch, DE], [30])

# Running LHS for 100 iterations, then DE for 60, then CMA:
LHSthenDEthenCMA = Chaining([LHSSearch, DE, CMA], [100, 60])
```

We can then minimize as usual:

```python
import nevergrad as ng

def square(x):
    return sum((x - .5)**2)

optimizer = DEwithLHS30(parametrization=2, budget=300)
recommendation = optimizer.minimize(square)
print(recommendation.value)
# >>> [0.50843113, 0.5104554]
```
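To see how a total budget might be divided among chained optimizers, here is a small plain-Python sketch. It assumes that every optimizer except the last receives the budget listed for it and that the last one receives whatever remains; `split_budget` is a hypothetical helper written for illustration, and the library's exact rules may differ.

```python
def split_budget(budgets, total_budget, num_workers, dimension):
    """Resolve per-optimizer budgets; the last optimizer gets what is left.

    Simplified assumption for illustration, not the library's exact logic.
    """
    named = {"num_workers": num_workers, "dimension": dimension}
    resolved = []
    remaining = total_budget
    for budget in budgets:
        value = named[budget] if isinstance(budget, str) else budget
        value = min(value, remaining)  # never spend more than what is left
        resolved.append(value)
        remaining -= value
    resolved.append(remaining)  # budget of the last optimizer in the chain
    return resolved


# LHSSearch for 30 evaluations, then DE for the rest of a budget of 300
print(split_budget([30], total_budget=300, num_workers=1, dimension=2))
# LHS for 100, DE for 60, then CMA for the rest
print(split_budget([100, 60], total_budget=300, num_workers=1, dimension=2))
# first budget given by the dimension
print(split_budget(["dimension"], total_budget=300, num_workers=1, dimension=2))
```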

## Multiobjective minimization with Nevergrad

Multiobjective minimization is a work in progress in `nevergrad`. It is:

• not stable: the API may be updated at any time, hopefully to make it simpler and more intuitive.

• not robust: there are probably corner cases we have not investigated yet.

• not scalable: it is not yet clear how the current version will work with large number of losses, or large budget. For now the features have been implemented without time complexity considerations.

• not optimal: this currently transforms multiobjective functions into monoobjective functions, hence losing some structure and making the function dynamic, which some optimizers are not designed to work on.

In other words, use it at your own risk ;) and provide feedback (both positive and negative) if you have any!

To perform multiobjective optimization, you can just provide `tell` with the results as an array or list of floats:

```python
import nevergrad as ng
import numpy as np

def multiobjective(x):
    return [np.sum(x**2), np.sum((x - 1) ** 2)]

print("Example: ", multiobjective(np.array([1.0, 2.0, 0])))
# >>> Example: [5.0, 2.0]

optimizer = ng.optimizers.CMA(parametrization=3, budget=100)

# for all but DE optimizers, deriving a volume out of the losses,
# it's not strictly necessary but highly advised to provide an
# upper bound reference for the losses (if not provided, such upper
# bound is automatically inferred with the first few "tell")
optimizer.tell(ng.p.MultiobjectiveReference(), [5, 5])
# note that you can provide a Parameter to MultiobjectiveReference,
# which will be passed to the optimizer

optimizer.minimize(multiobjective, verbosity=2)

# The function embeds its Pareto-front:
print("Pareto front:")
for param in sorted(optimizer.pareto_front(), key=lambda p: p.losses):
    print(f"{param} with losses {param.losses}")

# >>> Array{(3,)}:[0. 0. 0.] with losses [0. 3.]
#     Array{(3,)}:[0.39480968 0.98105712 0.55785803] with losses [1.42955333 0.56210368]
#     Array{(3,)}:[1.09901515 0.97673712 0.97153943] with losses [3.10573857 0.01115516]

# It can also provide subsets:
print("Random subset:", optimizer.pareto_front(2, subset="random"))
print("Loss-covering subset:", optimizer.pareto_front(2, subset="loss-covering"))
print("Domain-covering subset:", optimizer.pareto_front(2, subset="domain-covering"))
print("EPS subset:", optimizer.pareto_front(2, subset="EPS"))
```

Currently most optimizers only derive a volume float loss from the multiobjective loss and minimize it. `DE` and its variants have however been updated to make use of the full multi-objective losses [#789](https://github.com/facebookresearch/nevergrad/pull/789), which makes them good candidates for multi-objective minimization (`NGOpt` will delegate to DE in the case of multi-objective functions).
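For readers unfamiliar with the notion of a Pareto front, the sketch below computes one by brute force in plain Python for the same two-objective function. It only illustrates the dominance rule and is not how the library maintains its front; `dominates` and `pareto_front` are hypothetical helpers and the candidate points are arbitrary.

```python
def multiobjective(x):
    # same two losses as above, written without numpy
    return [sum(xi ** 2 for xi in x), sum((xi - 1) ** 2 for xi in x)]


def dominates(a, b):
    """True if losses `a` are no worse than `b` everywhere and better somewhere."""
    no_worse = all(ai <= bi for ai, bi in zip(a, b))
    better = any(ai < bi for ai, bi in zip(a, b))
    return no_worse and better


def pareto_front(points):
    """Keep the points whose losses are not dominated by any other point."""
    evaluated = [(p, multiobjective(p)) for p in points]
    return [
        (p, losses)
        for p, losses in evaluated
        if not any(dominates(other, losses) for _, other in evaluated)
    ]


candidates = [
    [0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0],
    [0.5, 0.5, 0.5],
    [2.0, 2.0, 2.0],
    [1.0, 2.0, 0.0],
]
front = pareto_front(candidates)
for point, losses in sorted(front, key=lambda item: item[1]):
    print(point, "with losses", losses)
```

Here `[2.0, 2.0, 2.0]` and `[1.0, 2.0, 0.0]` are discarded because another candidate is at least as good on both losses and strictly better on one of them.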

## Reproducibility

Each parametrization has its own `random_state` for generating random numbers. All optimizers pull from it when they require stochastic behaviors. For reproducibility, this random state can be seeded in two ways:

• by setting `numpy`’s global random state seed (`np.random.seed(32)`) before the parametrization’s first use. Indeed, when first used, the parametrization’s random state is seeded with a seed drawn from the global random state.

• by manually seeding the parametrization random state (e.g. `parametrization.random_state.seed(12)` or `optimizer.parametrization.random_state = np.random.RandomState(12)`).
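The lazy seeding described in the first option can be illustrated with a toy analog. The sketch below uses Python's standard `random` module instead of `numpy`, and `ToyParametrization` is a hypothetical stand-in, not the library's class; it only shows why seeding the global state before first use, or seeding the local state directly, both give reproducible draws.

```python
import random


class ToyParametrization:
    """Toy stand-in: owns a random state that is created lazily on first use."""

    def __init__(self):
        self._random_state = None

    @property
    def random_state(self):
        if self._random_state is None:
            # on first use, the seed is drawn from the *global* generator
            self._random_state = random.Random(random.getrandbits(32))
        return self._random_state

    @random_state.setter
    def random_state(self, state):
        self._random_state = state


def draw(param):
    return [param.random_state.random() for _ in range(3)]


# Option 1: seed the global state before the parametrization's first use
random.seed(32)
first = draw(ToyParametrization())
random.seed(32)
second = draw(ToyParametrization())
print(first == second)

# Option 2: seed (or replace) the parametrization's own random state
manual_a, manual_b = ToyParametrization(), ToyParametrization()
manual_a.random_state = random.Random(12)
manual_b.random_state.seed(12)
draws_a, draws_b = draw(manual_a), draw(manual_b)
print(draws_a == draws_b)
```

Note that with the first option the global seed must be set before the random state is first accessed; seeding the global generator afterwards has no effect on a state that already exists.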