Optimizers API Reference¶
Optimizer API¶
All the optimizers share the following common API:

class
nevergrad.optimizers.base.
Optimizer
(parametrization: Union[int, nevergrad.parametrization.core.Parameter], budget: Optional[int] = None, num_workers: int = 1)¶ Algorithm framework with 3 main functions:
ask()
which provides a candidate on which to evaluate the function to optimize.
tell(candidate, loss)
which lets you provide the loss associated with evaluated points.
provide_recommendation()
which provides the best final candidate.
Typically, one would call
ask()
num_workers times, evaluate the function on these num_workers points in parallel, update with the loss values when the evaluations are finished, and iterate until the budget is spent. At the very end, one would call provide_recommendation for the estimated optimum. This class is abstract; it provides internal equivalents for the 3 main functions, among which at least
_internal_ask_candidate
has to be overridden. Each optimizer instance should be used only once, with the initially provided budget.
 Parameters
parametrization (int or Parameter) – either the dimension of the optimization space, or its parametrization
budget (int/None) – number of allowed evaluations
num_workers (int) – number of evaluations which will be run in parallel at once

ask
() → nevergrad.parametrization.core.Parameter¶ Provides a point to explore. This function can be called multiple times to explore several points in parallel.
 Returns
The candidate to try on the objective function. p.Parameter instances have fields args and kwargs which can be directly used on the function (objective_function(*candidate.args, **candidate.kwargs)).
 Return type
p.Parameter

property
dimension
¶ Dimension of the optimization space.
 Type
int

dump
(filepath: Union[str, pathlib.Path]) → None¶ Pickles the optimizer into a file.

classmethod
load
(filepath: Union[str, pathlib.Path]) → X¶ Loads a pickle and checks that the class is correct.

minimize
(objective_function: Callable[[…], Union[float, Tuple[float, …], List[float], numpy.ndarray]], executor: Optional[nevergrad.common.typing.ExecutorLike] = None, batch_mode: bool = False, verbosity: int = 0) → nevergrad.parametrization.core.Parameter¶ Optimization (minimization) procedure
 Parameters
objective_function (callable) – A callable to optimize (minimize)
executor (Executor) – An executor object with a method
submit(callable, *args, **kwargs)
returning a Future-like object with methods done() -> bool and result() -> float. The executor's role is to dispatch the execution of the jobs locally/on a cluster/with multithreading, depending on the implementation. E.g.: concurrent.futures.ThreadPoolExecutor
batch_mode (bool) – when num_workers = n > 1, whether jobs are executed by batch (n function evaluations are launched, we wait for all results and relaunch n evals) or not (whenever an evaluation is finished, we launch another one)
verbosity (int) – print information about the optimization (0: none, 1: fitness values, 2: fitness values and recommendation)
 Returns
The candidate with minimal value. p.Parameter instances have fields args and kwargs which can be directly used on the function (objective_function(*candidate.args, **candidate.kwargs)).
 Return type
p.Parameter
Note
For evaluation purposes and with the current implementation, it is better to use batch_mode=True

property
num_ask
¶ Number of times the ask method was called.
 Type
int

property
num_tell
¶ Number of times the tell method was called.
 Type
int

property
num_tell_not_asked
¶ Number of times the
tell
method was called on candidates that were not asked for by the optimizer (or that were suggested).
 Type
int

pareto_front
(size: Optional[int] = None, subset: str = 'random', subset_tentatives: int = 12) → List[nevergrad.parametrization.core.Parameter]¶ Pareto front, as a list of Parameter. The losses can be accessed through parameter.losses
 Parameters
size (int (optional)) – if provided, selects a subset of the full pareto front with the given maximum size
subset (str) – method for selecting the subset (“random”, “loss-covering”, “domain-covering”, “hypervolume”)
subset_tentatives (int) – number of random attempts for finding a better subset
 Returns
the list of Parameter of the pareto front
 Return type
list
Note
During non-multiobjective optimization, this returns the current pessimistic best

provide_recommendation
() → nevergrad.parametrization.core.Parameter¶ Provides the best point to use as a minimum, given the budget that was used
 Returns
The candidate with minimal value. p.Parameter instances have fields args and kwargs which can be directly used on the function (objective_function(*candidate.args, **candidate.kwargs)).
 Return type
p.Parameter

recommend
() → nevergrad.parametrization.core.Parameter¶ Provides the best candidate to use as a minimum, given the budget that was used.
 Returns
The candidate with minimal loss. p.Parameter instances have fields args and kwargs which can be directly used on the function (objective_function(*candidate.args, **candidate.kwargs)).
 Return type
p.Parameter

register_callback
(name: str, callback: Union[Callable[[Optimizer, p.Parameter, float], None], Callable[[Optimizer], None]]) → None¶ Add a callback method called either when tell or ask are called, with the same arguments (including the optimizer / self). This can be useful for custom logging.
 Parameters
name (str) – name of the method to register the callback for (either
ask
or
tell
)
callback (callable) – a callable taking the same parameters as the method it is registered upon (including self)

remove_all_callbacks
() → None¶ Removes all registered callables

suggest
(*args: Any, **kwargs: Any) → None¶ Suggests a new point to ask. It will be asked at the next call (last in first out).
 Parameters
*args (Any) – positional arguments matching the parametrization pattern.
**kwargs (Any) – keyword arguments matching the parametrization pattern.
Note
This relies on optimizers implementing a way to deal with unasked candidates. Some optimizers may not support it and will raise a
TellNotAskedNotSupportedError
at
tell
time. LIFO is used so as to be able to suggest and ask straight away, as an alternative to creating a new candidate with
optimizer.parametrization.spawn_child(new_value)

tell
(candidate: nevergrad.parametrization.core.Parameter, loss: Union[float, Tuple[float, …], List[float], numpy.ndarray]) → None¶ Provides the optimizer with the evaluation of a fitness value for a candidate.
 Parameters
candidate (p.Parameter) – the candidate (or point) at which the function was evaluated
loss (float/list/np.ndarray) – loss of the function (or multiobjective function)
Note
The candidate should generally be one provided by
ask()
, but can also be a non-asked candidate. To create a p.Parameter instance from args and kwargs, you can use
candidate = optimizer.parametrization.spawn_child(new_value=your_value)
:
for an
Array(shape=(2,))
: optimizer.parametrization.spawn_child(new_value=[12, 12])
for an
Instrumentation
: optimizer.parametrization.spawn_child(new_value=(args, kwargs))
Alternatively, you can provide a suggestion with
optimizer.suggest(*args, **kwargs)
; the next
ask()
will use this suggestion.
Callbacks¶
Callbacks can be registered through optimizer.register_callback
to be called on either the ask
or tell
method. Two of them are available through the
ng.callbacks namespace.

class
nevergrad.callbacks.
OptimizerDump
(filepath: Union[str, pathlib.Path])¶ Dumps the optimizer to a pickle file at every call.
 Parameters
filepath (str or Path) – path to the pickle file

class
nevergrad.callbacks.
ParametersLogger
(filepath: Union[str, pathlib.Path], append: bool = True, order: int = 1)¶ Logs parameter and run information into a file throughout the optimization.
 Parameters
filepath (str or pathlib.Path) – the path to dump data to
append (bool) – whether to append the file (otherwise it replaces it)
order (int) – order of the internal/model parameters to extract
Example
logger = ParametersLogger(filepath)
optimizer.register_callback("tell", logger)
optimizer.minimize()
list_of_dict_of_data = logger.load()
Note
Arrays are converted to lists

load
() → List[Dict[str, Any]]¶ Loads data from the log file

load_flattened
(max_list_elements: int = 24) → List[Dict[str, Any]]¶ Loads data from the log file, and splits lists (arrays) into multiple arguments
 Parameters
max_list_elements (int) – Maximum number of elements displayed from the array, each element is given a unique id of type list_name#i0_i1_…

to_hiplot_experiment
(max_list_elements: int = 24) → Any¶ Converts the logs into a hiplot experiment for display.
 Parameters
max_list_elements (int) – maximum number of elements of list/arrays to export (only the first elements are extracted)
Example
exp = logs.to_hiplot_experiment()
exp.display(force_full_width=True)
Note
You can easily change the axes of the XY plot:
exp.display_data(hip.Displays.XY).update({'axis_x': '0#0', 'axis_y': '0#1'})
For more context about hiplot, check:

class
nevergrad.callbacks.
ProgressBar
¶ Progress bar to register as callback in an optimizer
Configurable optimizers¶
Configurable optimizers share the following API to create optimizer instances:

class
nevergrad.optimizers.base.
ConfiguredOptimizer
(OptimizerClass: Type[nevergrad.optimization.base.Optimizer], config: Dict[str, Any], as_config: bool = False)¶ Creates optimizer-like instances with configuration.
 Parameters
OptimizerClass (type) – class of the optimizer to configure
config (dict) – dictionary of all the configurations
as_config (bool) – whether to provide all config as kwargs to the optimizer instantiation (default, see ConfiguredCMA for an example), or through a config kwarg referencing self (if True, see EvolutionStrategy for an example)
Note
This provides a default repr which can be bypassed through set_name

__call__
(parametrization: Union[int, nevergrad.parametrization.core.Parameter], budget: Optional[int] = None, num_workers: int = 1) → nevergrad.optimization.base.Optimizer¶ Creates an optimizer from the parametrization
 Parameters
parametrization (int or Parameter) – either the dimension of the optimization space, or its parametrization
budget (int/None) – number of allowed evaluations
num_workers (int) – number of evaluations which will be run in parallel at once

load
(filepath: Union[str, pathlib.Path]) → nevergrad.optimization.base.Optimizer¶ Loads a pickle and checks that it is an Optimizer.

set_name
(name: str, register: bool = False) → nevergrad.optimization.base.ConfiguredOptimizer¶ Set a new representation for the instance
Here is a list of the available configurable optimizers:
Parametrizable families of optimizers.
Caution
This module and its available classes are experimental and may change quickly in the near future.

class
nevergrad.families.
Chaining
(optimizers: Sequence[Union[nevergrad.optimization.base.ConfiguredOptimizer, Type[nevergrad.optimization.base.Optimizer]]], budgets: Sequence[Union[str, int]])¶ Chaining consists of running algorithm 1 during T1, then algorithm 2 during T2, then algorithm 3 during T3, etc. Each algorithm is fed with what happened before it.
 Parameters
optimizers (list of Optimizer classes) – the sequence of optimizers to use
budgets (list of int) – the corresponding budgets for each optimizer but the last one

class
nevergrad.families.
DifferentialEvolution
(*, initialization: str = 'gaussian', scale: Union[str, float] = 1.0, recommendation: str = 'optimistic', crossover: Union[str, float] = 0.5, F1: float = 0.8, F2: float = 0.8, popsize: Union[str, int] = 'standard', propagate_heritage: bool = False, multiobjective_adaptation: bool = True)¶ Differential evolution is typically used for continuous optimization. It uses differences between points in the population to perform mutations in fruitful directions; it is therefore a kind of covariance adaptation without any explicit covariance, making it very fast in high dimension. This class implements several variants of differential evolution, some of them adapted to genetic mutations as in Holland's work (this combination is termed
TwoPointsDE
in Nevergrad, corresponding to crossover="twopoints"
), or to the noisy setting (coined NoisyDE
, corresponding to recommendation="noisy"
). In that last case, the optimizer returns the mean of the individuals with fitness better than the median, which may not always be a good choice. Default settings are CR = .5, F1 = .8, F2 = .8, current-to-best, and a population size of 30. The initial population is purely random.
 Parameters
initialization ("LHS", "QR" or "gaussian") – algorithm/distribution used for the initialization phase
scale (float or str) – scale of random component of the updates
recommendation ("pessimistic", "optimistic", "mean" or "noisy") – choice of the criterion for the best point to recommend
crossover (float or str) – crossover rate value, or strategy among:
“dimension”: crossover rate of 1 / dimension
“random”: different random (uniform) crossover rate at each iteration
“onepoint”: one point crossover
“twopoints”: two points crossover
“parametrization”: use the parametrization recombine method
F1 (float) – differential weight #1
F2 (float) – differential weight #2
popsize (int, "standard", "dimension", "large") – size of the population to use. “standard” is max(num_workers, 30), “dimension” max(num_workers, 30, dimension +1) and “large” max(num_workers, 30, 7 * dimension).
multiobjective_adaptation (bool) – Automatically adapts to handle the multiobjective case. This is a very basic experimental version, activated by default because the non-multiobjective implementation is performing very badly.

class
nevergrad.families.
EMNA
(*, isotropic: bool = True, naive: bool = True, population_size_adaptation: bool = False, initial_popsize: Optional[int] = None)¶ Estimation of Multivariate Normal Algorithm. This algorithm is quite efficient in a parallel context, i.e. when the population size is large.
 Parameters
isotropic (bool) – isotropic version of EMNA if True, i.e. we have an identity matrix for the Gaussian; otherwise we consider the separable version, meaning we have a diagonal matrix for the Gaussian (anisotropic)
naive (bool) – set to False for noisy problem, so that the best points will be an average of the final population.
population_size_adaptation (bool) – population size automatically adapts to the landscape
initial_popsize (Optional[int]) – initial (and minimal) population size (default: 4 x dimension)

class
nevergrad.families.
EvolutionStrategy
(*, recombination_ratio: float = 0, popsize: int = 40, offsprings: Optional[int] = None, only_offsprings: bool = False, ranker: str = 'simple')¶ Experimental evolution-strategy-like algorithm. The API is going to evolve.

class
nevergrad.families.
ParametrizedBO
(*, initialization: Optional[str] = None, init_budget: Optional[int] = None, middle_point: bool = False, utility_kind: str = 'ucb', utility_kappa: float = 2.576, utility_xi: float = 0.0, gp_parameters: Optional[Dict[str, Any]] = None)¶ Bayesian optimization. Hyperparameter tuning method, based on statistical modeling of the objective function. This class is a wrapper over the bayes_opt package.
 Parameters
initialization (str) – Initialization algorithms (None, “Hammersley”, “random” or “LHS”)
init_budget (int or None) – Number of initialization algorithm steps
middle_point (bool) – whether to sample the 0 point first
utility_kind (str) – Type of utility function to use among “ucb”, “ei” and “poi”
utility_kappa (float) – Kappa parameter for the utility function
utility_xi (float) – Xi parameter for the utility function
gp_parameters (dict) – dictionary of parameters for the gaussian process

class
nevergrad.families.
ParametrizedCMA
(*, scale: float = 1.0, popsize: Optional[int] = None, diagonal: bool = False, fcmaes: bool = False, random_init: bool = False)¶ CMA-ES optimizer. This evolution strategy uses Gaussian sampling, iteratively modified for searching in the best directions. This optimizer wraps an external implementation: https://github.com/CMA-ES/pycma
 Parameters
scale (float) – scale of the search
popsize (Optional[int] = None) – population size, should be n * self.num_workers for int n >= 1. default is max(self.num_workers, 4 + int(3 * np.log(self.dimension)))
diagonal (bool) – use the diagonal version of CMA (advised in big dimension)
fcmaes (bool = False) – use the fast implementation; it does not support diagonal=True. It produces equivalent results and is preferable for high dimensions or when the objective function evaluation is fast.

class
nevergrad.families.
ParametrizedOnePlusOne
(*, noise_handling: Optional[Union[str, Tuple[str, float]]] = None, mutation: str = 'gaussian', crossover: bool = False)¶ Simple but sometimes powerful class of optimization algorithms. It uses asynchronous updates, so that (1+1) can actually be run in parallel and even performs quite well in such a context; this is naturally close to (1+lambda).
 Parameters
noise_handling (str or Tuple[str, float]) –
Method for handling the noise. The name can be:
”random”: a random point is reevaluated regularly; this uses the one-fifth adaptation rule, going back to Schumer and Steiglitz (1968). It was independently rediscovered by Devroye (1972) and Rechenberg (1973).
”optimistic”: the best optimistic point is reevaluated regularly, optimism in front of uncertainty
a coefficient can be provided to tune the regularity of these reevaluations (default .05)
mutation (str) –
One of the available mutations from:
”gaussian”: standard mutation by adding a Gaussian random variable (with progressive widening) to the best pessimistic point
”cauchy”: same as Gaussian but with a Cauchy distribution.
”discrete”: when a variable is mutated (which happens with probability 1/d in dimension d), it is just randomly drawn. This means that, on average, only one variable is mutated.
”discreteBSO”: as in brainstorm optimization, we slowly decrease the mutation rate from 1 to 1/d.
”fastga”: FastGA mutations from the current best
”doublefastga”: double-FastGA mutations from the current best (Doerr et al, Fast Genetic Algorithms, 2017)
”portfolio”: random number of mutated bits (called uniform mixing in Dang & Lehre, “Self-adaptation of Mutation Rates in Non-elitist Populations”, 2016)
”lengler”: specific mutation rate chosen as a function of the dimension and iteration index.
crossover (bool) – whether to add a genetic crossover step every other iteration.
Notes
After many papers advocated the mutation rate 1/d in the discrete (1+1) setting, it was proposed to use a randomly drawn mutation rate instead. Fast genetic algorithms are based on a similar idea. These two simple methods perform quite well on a wide range of problems.

class
nevergrad.families.
ParametrizedTBPSA
(*, naive: bool = True, initial_popsize: Optional[int] = None)¶ Test-based population-size adaptation. This method, based on adapting the population size, performs best in many noisy optimization problems, even in large dimension.
 Parameters
naive (bool) – set to False for noisy problem, so that the best points will be an average of the final population.
initial_popsize (Optional[int]) – initial (and minimal) population size (default: 4 x dimension)
Note
Derived from: Hellwig, Michael & Beyer, Hans-Georg (2016). Evolution under Strong Noise: A Self-Adaptive Evolution Strategy Reaches the Lower Performance Bound – the pcCMSA-ES. https://homepages.fhv.at/hgb/NewPapers/PPSN16_HB16.pdf

class
nevergrad.families.
RandomSearchMaker
(*, middle_point: bool = False, stupid: bool = False, opposition_mode: Optional[str] = None, cauchy: bool = False, scale: Union[float, str] = 1.0, recommendation_rule: str = 'pessimistic')¶ Provides random suggestions.
 Parameters
stupid (bool) – Provides a random recommendation instead of the best point so far (for baseline)
middle_point (bool) – enforces that the first suggested point (ask) is zero.
opposition_mode (str or None) –
symmetrizes exploration with respect to the center (e.g. https://ieeexplore.ieee.org/document/4424748):
full symmetry if “opposite”
random * symmetric if “quasi”
cauchy (bool) – use a Cauchy distribution instead of Gaussian distribution
scale (float or "random") –
scalar for multiplying the suggested point values, or string:
”random”: uses a randomized pattern for the scale.
”auto”: scales as a function of dimension and budget (version 1: sigma = (1 + log(budget)) / (4 * log(dimension)))
”autotune”: scales as a function of dimension and budget (version 2: sigma = sqrt(log(budget) / dimension))
recommendation_rule (str) – “average_of_best” or “pessimistic” or “average_of_exp_best”; “pessimistic” is the default and implies selecting the pessimistic best.

class
nevergrad.families.
SamplingSearch
(*, sampler: str = 'Halton', scrambled: bool = False, middle_point: bool = False, opposition_mode: Optional[str] = None, cauchy: bool = False, autorescale: Union[bool, str] = False, scale: float = 1.0, rescaled: bool = False, recommendation_rule: str = 'pessimistic')¶ This is a one-shot optimization method, hopefully better than random search by ensuring more uniformity.
 Parameters
sampler (str) – Choice of the sampler among “Halton”, “Hammersley” and “LHS”.
scrambled (bool) – Adds scrambling to the search; much better in high dimension and rarely worse than the original search.
middle_point (bool) – enforces that the first suggested point (ask) is zero.
cauchy (bool) – use Cauchy inverse distribution instead of Gaussian when fitting points to real space (instead of box).
scale (float or "random") – scalar for multiplying the suggested point values.
rescaled (bool or str) – rescales the sampling pattern to reach the boundaries and/or applies automatic rescaling.
recommendation_rule (str) – “average_of_best” or “pessimistic”; “pessimistic” is the default and implies selecting the pessimistic best.
Notes
Halton is a low-quality sampling method when the dimension is high; it is usually better to use Halton with scrambling.
When the budget is known in advance, it is also better to replace Halton by Hammersley. Basically, the key difference with Halton is adding one evenly spaced coordinate (the discrepancy is better). When the budget is known, low-discrepancy sequences (e.g. scrambled Hammersley) have a better discrepancy.
Reference: Halton 1964: Algorithm 247: Radical-inverse quasi-random point sequence, ACM, p. 701. Scrambling adds randomization to the Halton search; it is much better in high dimension and rarely worse than the original Halton search.
About Latin Hypercube Sampling (LHS): though partially incremental versions exist, this implementation needs the budget in advance. This can be great in terms of discrepancy when the budget is not very high.

class
nevergrad.families.
ScipyOptimizer
(*, method: str = 'NelderMead', random_restart: bool = False)¶ Wrapper over scipy.optimize implementations, in standard ask-and-tell format. This is actually an import from scipy.optimize, including Sequential Quadratic Programming.
 Parameters
method (str) –
Name of the method to use among:
NelderMead
COBYLA
SQP (or SLSQP): very powerful e.g. in continuous noisy optimization. It is based on approximating the objective function by quadratic models.
Powell
random_restart (bool) – whether to restart at a random point if the optimizer converged but the budget is not entirely spent yet (otherwise, restarts from best point)
Note
These optimizers do not support asking several candidates in a row
Optimizers¶
Here are all the other optimizers available in nevergrad
:
Caution
Only non-family-based optimizers are listed in the documentation;
you can get a full list of available optimizers with sorted(nevergrad.optimizers.registry.keys())

class
nevergrad.optimization.optimizerlib.
ASCMA2PDEthird
(parametrization: Union[int, nevergrad.parametrization.core.Parameter], budget: Optional[int] = None, num_workers: int = 1)¶ Algorithm selection, with CMA and 2ptDE. Active selection at 1/3.

class
nevergrad.optimization.optimizerlib.
ASCMADEQRthird
(parametrization: Union[int, nevergrad.parametrization.core.Parameter], budget: Optional[int] = None, num_workers: int = 1)¶ Algorithm selection, with CMA, ScrHalton and LhsDE. Active selection at 1/3.

class
nevergrad.optimization.optimizerlib.
ASCMADEthird
(parametrization: Union[int, nevergrad.parametrization.core.Parameter], budget: Optional[int] = None, num_workers: int = 1)¶ Algorithm selection, with CMA and LhsDE. Active selection at 1/3.

class
nevergrad.optimization.optimizerlib.
CM
(parametrization: Union[int, nevergrad.parametrization.core.Parameter], budget: Optional[int] = None, num_workers: int = 1)¶ Competence map, simplest.

class
nevergrad.optimization.optimizerlib.
CMandAS
(parametrization: Union[int, nevergrad.parametrization.core.Parameter], budget: Optional[int] = None, num_workers: int = 1)¶ Competence map, with algorithm selection in one of the cases (2 CMAs).

class
nevergrad.optimization.optimizerlib.
CMandAS2
(parametrization: Union[int, nevergrad.parametrization.core.Parameter], budget: Optional[int] = None, num_workers: int = 1)¶ Competence map, with algorithm selection in one of the cases (3 CMAs).

class
nevergrad.optimization.optimizerlib.
CMandAS3
(parametrization: Union[int, nevergrad.parametrization.core.Parameter], budget: Optional[int] = None, num_workers: int = 1)¶ Competence map, with algorithm selection in one of the cases (3 CMAs).

class
nevergrad.optimization.optimizerlib.
Chaining
(optimizers: Sequence[Union[nevergrad.optimization.base.ConfiguredOptimizer, Type[nevergrad.optimization.base.Optimizer]]], budgets: Sequence[Union[str, int]])¶ Chaining consists of running algorithm 1 during T1, then algorithm 2 during T2, then algorithm 3 during T3, etc. Each algorithm is fed with what happened before it.
 Parameters
optimizers (list of Optimizer classes) – the sequence of optimizers to use
budgets (list of int) – the corresponding budgets for each optimizer but the last one

class
nevergrad.optimization.optimizerlib.
ConfSplitOptimizer
(*, num_optims: Optional[int] = None, num_vars: Optional[List[int]] = None, multivariate_optimizer: Union[ConfiguredOptimizer, Type[Optimizer]] = CMA, monovariate_optimizer: Union[ConfiguredOptimizer, Type[Optimizer]] = RandomSearch, progressive: bool = False, non_deterministic_descriptor: bool = True)¶ Combines optimizers, each of them working on their own variables.
 Parameters
num_optims (int) – number of optimizers
num_vars (optional list of int) – number of variable per optimizer.
progressive (optional bool) – whether we progressively add optimizers.
non_deterministic_descriptor (bool) – whether the subparts' parametrization descriptor is set to a noisy function. This can have an impact on optimizer selection for NGOpt optimizers.

class
nevergrad.optimization.optimizerlib.
ConfiguredPSO
(transform: str = 'identity', wide: bool = False, popsize: Optional[int] = None)¶ Particle Swarm Optimization is based on a set of particles with their inertia. Wikipedia provides a beautiful illustration ;) (see link)
 Parameters
transform (str) – name of the transform to use to map from PSO optimization space to R-space.
wide (bool) – if True: legacy initialization in the [-1,1] box mapped to R
popsize (int) – population size of the particle swarm. Defaults to max(40, num_workers)
Note
Using non-default “transform” and “wide” parameters can lead to extreme values.
Implementation partially follows SPSO2011, however without randomization of the population order.
Reference: M. Zambrano-Bigiarini, M. Clerc and R. Rojas, Standard Particle Swarm Optimisation 2011 at CEC-2013: A baseline for future PSO improvements, 2013 IEEE Congress on Evolutionary Computation, Cancun, 2013, pp. 2337-2344. https://ieeexplore.ieee.org/document/6557848

class
nevergrad.optimization.optimizerlib.
EDA
(parametrization: Union[int, nevergrad.parametrization.core.Parameter], budget: Optional[int] = None, num_workers: int = 1)¶ Test-based population-size adaptation.
Population size equal to lambda = 4 x dimension. Tests by comparing the first fifth and the last fifth of the 5 x lambda evaluations.
Caution
This optimizer is probably wrong.

class
nevergrad.optimization.optimizerlib.
EMNA
(*, isotropic: bool = True, naive: bool = True, population_size_adaptation: bool = False, initial_popsize: Optional[int] = None)¶ Estimation of Multivariate Normal Algorithm. This algorithm is quite efficient in a parallel context, i.e. when the population size is large.
 Parameters
isotropic (bool) – isotropic version of EMNA if True, i.e. we have an identity matrix for the Gaussian; otherwise we consider the separable version, meaning we have a diagonal matrix for the Gaussian (anisotropic)
naive (bool) – set to False for noisy problem, so that the best points will be an average of the final population.
population_size_adaptation (bool) – population size automatically adapts to the landscape
initial_popsize (Optional[int]) – initial (and minimal) population size (default: 4 x dimension)

exception
nevergrad.optimization.optimizerlib.
InfiniteMetaModelOptimum
¶ Sometimes the optimum of the metamodel is at infinity.

class
nevergrad.optimization.optimizerlib.
MEDA
(parametrization: Union[int, nevergrad.parametrization.core.Parameter], budget: Optional[int] = None, num_workers: int = 1)¶

class
nevergrad.optimization.optimizerlib.
MPCEDA
(parametrization: Union[int, nevergrad.parametrization.core.Parameter], budget: Optional[int] = None, num_workers: int = 1)¶

class
nevergrad.optimization.optimizerlib.
ManyCMA
(parametrization: Union[int, nevergrad.parametrization.core.Parameter], budget: Optional[int] = None, num_workers: int = 1)¶ Combining 3 CMAs. Exactly identical. Active selection at 1/3 of the budget.

class
nevergrad.optimization.optimizerlib.
ManySmallCMA
(parametrization: Union[int, nevergrad.parametrization.core.Parameter], budget: Optional[int] = None, num_workers: int = 1)¶ Combining 3 CMAs. Exactly identical. Active selection at 1/3 of the budget.

class
nevergrad.optimization.optimizerlib.
MetaModel
(parametrization: Union[int, nevergrad.parametrization.core.Parameter], budget: Optional[int] = None, num_workers: int = 1, multivariate_optimizer: nevergrad.optimization.base.ConfiguredOptimizer = CMA)¶ Adding a metamodel into CMA.

class
nevergrad.optimization.optimizerlib.
MultiCMA
(parametrization: Union[int, nevergrad.parametrization.core.Parameter], budget: Optional[int] = None, num_workers: int = 1)¶ Combining 3 CMAs. Exactly identical. Active selection at 1/10 of the budget.

class
nevergrad.optimization.optimizerlib.
MultiDiscrete
(parametrization: Union[int, nevergrad.parametrization.core.Parameter], budget: Optional[int] = None, num_workers: int = 1)¶ Combining 3 Discrete(1+1). Exactly identical. Active selection at 1/10 of the budget.

class
nevergrad.optimization.optimizerlib.
MultiScaleCMA
(parametrization: Union[int, nevergrad.parametrization.core.Parameter], budget: Optional[int] = None, num_workers: int = 1)¶ Combining 3 CMAs with different init scale. Active selection at 1/3 of the budget.

class
nevergrad.optimization.optimizerlib.
NGO
(parametrization: Union[int, nevergrad.parametrization.core.Parameter], budget: Optional[int] = None, num_workers: int = 1)¶

class
nevergrad.optimization.optimizerlib.
NGOpt
(parametrization: Union[int, nevergrad.parametrization.core.Parameter], budget: Optional[int] = None, num_workers: int = 1)¶

class
nevergrad.optimization.optimizerlib.
NGOpt2
(parametrization: Union[int, nevergrad.parametrization.core.Parameter], budget: Optional[int] = None, num_workers: int = 1)¶ Nevergrad optimizer by competence map. You might modify this one for designing your own competence map.

class
nevergrad.optimization.optimizerlib.
NGOpt4
(parametrization: Union[int, nevergrad.parametrization.core.Parameter], budget: Optional[int] = None, num_workers: int = 1)¶ Nevergrad optimizer by competence map. You might modify this one for designing your own competence map.

class
nevergrad.optimization.optimizerlib.
NGOpt8
(parametrization: Union[int, nevergrad.parametrization.core.Parameter], budget: Optional[int] = None, num_workers: int = 1)¶ Nevergrad optimizer by competence map. You might modify this one for designing your own competence map.

class
nevergrad.optimization.optimizerlib.
NGOptBase
(parametrization: Union[int, nevergrad.parametrization.core.Parameter], budget: Optional[int] = None, num_workers: int = 1)¶ Nevergrad optimizer by competence map.

property
optim
¶

recommend
() → nevergrad.parametrization.core.Parameter¶ Provides the best candidate to use as a minimum, given the budget that was used.
 Returns
The candidate with minimal loss. p.Parameter instances have fields args and kwargs which can be used directly on the objective function (objective_function(*candidate.args, **candidate.kwargs)).
 Return type
p.Parameter


class
nevergrad.optimization.optimizerlib.
NoisyBandit
(parametrization: Union[int, nevergrad.parametrization.core.Parameter], budget: Optional[int] = None, num_workers: int = 1)¶ UCB. This is upper confidence bound (adapted to minimization), with very poor parametrization; in particular, the logarithmic term is set to zero. Infinite arms: we add one arm when 20 * #ask >= #arms ** 3.
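The infinite-arms rule stated above can be sketched in a few lines (a pure-Python illustration; should_add_arm is a hypothetical name, not part of the nevergrad API):

```python
def should_add_arm(num_ask: int, num_arms: int) -> bool:
    """Infinite-arms rule stated above: add an arm once 20 * #ask >= #arms ** 3."""
    return 20 * num_ask >= num_arms ** 3

# Simulate how the number of arms grows with the number of ask() calls:
# the cubic threshold makes new arms increasingly rare.
arms = 1
for num_ask in range(1, 101):
    if should_add_arm(num_ask, arms):
        arms += 1
print(arms)  # 13 arms after 100 asks
```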

class
nevergrad.optimization.optimizerlib.
PCEDA
(parametrization: Union[int, nevergrad.parametrization.core.Parameter], budget: Optional[int] = None, num_workers: int = 1)¶

class
nevergrad.optimization.optimizerlib.
PSO
(parametrization: Union[int, nevergrad.parametrization.core.Parameter], budget: Optional[int] = None, num_workers: int = 1, transform: str = 'arctan', wide: bool = False, popsize: Optional[int] = None)¶

class
nevergrad.optimization.optimizerlib.
ParaPortfolio
(parametrization: Union[int, nevergrad.parametrization.core.Parameter], budget: Optional[int] = None, num_workers: int = 1)¶ Passive portfolio of CMA, 2pt DE, PSO, SQP and ScrHammersley.

class
nevergrad.optimization.optimizerlib.
ParametrizedBO
(*, initialization: Optional[str] = None, init_budget: Optional[int] = None, middle_point: bool = False, utility_kind: str = 'ucb', utility_kappa: float = 2.576, utility_xi: float = 0.0, gp_parameters: Optional[Dict[str, Any]] = None)¶ Bayesian optimization. Hyperparameter tuning method, based on statistical modeling of the objective function. This class is a wrapper over the bayes_opt package.
 Parameters
initialization (str) – Initialization algorithms (None, “Hammersley”, “random” or “LHS”)
init_budget (int or None) – Number of initialization algorithm steps
middle_point (bool) – whether to sample the 0 point first
utility_kind (str) – Type of utility function to use among “ucb”, “ei” and “poi”
utility_kappa (float) – Kappa parameter for the utility function
utility_xi (float) – Xi parameter for the utility function
gp_parameters (dict) – dictionary of parameters for the Gaussian process

no_parallelization
= True¶
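The role of utility_kappa in a UCB-style acquisition, adapted to minimization as a lower confidence bound, can be sketched as follows (a stand-alone illustration, not the bayes_opt internals; mu and sigma stand for assumed posterior means and standard deviations over candidate points):

```python
import numpy as np

def ucb_minimize(mu: np.ndarray, sigma: np.ndarray, kappa: float = 2.576) -> int:
    """Pick the candidate with the smallest lower confidence bound
    mu - kappa * sigma: a larger kappa means more exploration."""
    return int(np.argmin(mu - kappa * sigma))

# Candidate 1 has a worse mean but much higher uncertainty,
# so the default kappa favors exploring it; kappa=0 is pure exploitation.
mu = np.array([0.5, 0.8, 1.2])
sigma = np.array([0.1, 0.5, 0.05])
print(ucb_minimize(mu, sigma))             # index 1 (0.8 - 2.576 * 0.5 is smallest)
print(ucb_minimize(mu, sigma, kappa=0.0))  # index 0 (smallest mean)
```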

class
nevergrad.optimization.optimizerlib.
ParametrizedCMA
(*, scale: float = 1.0, popsize: Optional[int] = None, diagonal: bool = False, fcmaes: bool = False, random_init: bool = False)¶ CMA-ES optimizer. This evolution strategy uses Gaussian sampling, iteratively modified to search in the best directions. This optimizer wraps an external implementation: https://github.com/CMA-ES/pycma
 Parameters
scale (float) – scale of the search
popsize (Optional[int] = None) – population size, should be n * self.num_workers for int n >= 1. default is max(self.num_workers, 4 + int(3 * np.log(self.dimension)))
diagonal (bool) – use the diagonal version of CMA (advised in high dimension)
fcmaes (bool) – use the fast implementation; it doesn’t support diagonal=True but produces equivalent results, and is preferable for high dimensions or when objective function evaluation is fast.
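The default population size stated above can be computed directly (a small helper written for illustration only; default_popsize is not a library function):

```python
import math

def default_popsize(num_workers: int, dimension: int) -> int:
    """Default CMA population size as described above:
    max(num_workers, 4 + int(3 * log(dimension)))."""
    return max(num_workers, 4 + int(3 * math.log(dimension)))

print(default_popsize(num_workers=1, dimension=10))   # 4 + int(3 * ln(10)) = 10
print(default_popsize(num_workers=20, dimension=10))  # num_workers dominates: 20
```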

class
nevergrad.optimization.optimizerlib.
ParametrizedOnePlusOne
(*, noise_handling: Optional[Union[str, Tuple[str, float]]] = None, mutation: str = 'gaussian', crossover: bool = False)¶ Simple but sometimes powerful class of optimization algorithms. It uses asynchronous updates, so that the (1+1) strategy can actually run in parallel and even performs quite well in such a context; this is naturally close to (1+lambda).
 Parameters
noise_handling (str or Tuple[str, float]) –
Method for handling the noise. The name can be:
”random”: a random point is reevaluated regularly; this uses the one-fifth adaptation rule, going back to Schumer and Steiglitz (1968). It was independently rediscovered by Devroye (1972) and Rechenberg (1973).
”optimistic”: the best optimistic point is reevaluated regularly, out of optimism in the face of uncertainty
a coefficient can be provided to tune the frequency of these reevaluations (default .05)
mutation (str) –
One of the available mutations from:
”gaussian”: standard mutation by adding a Gaussian random variable (with progressive widening) to the best pessimistic point
”cauchy”: same as Gaussian but with a Cauchy distribution.
”discrete”: when a variable is mutated (which happens with probability 1/d in dimension d), it is simply randomly redrawn. This means that on average, only one variable is mutated.
”discreteBSO”: as in brainstorm optimization, we slowly decrease the mutation rate from 1 to 1/d.
”fastga”: FastGA mutations from the current best
”doublefastga”: doubleFastGA mutations from the current best (Doerr et al, Fast Genetic Algorithms, 2017)
”portfolio”: random number of mutated bits (called uniform mixing in Dang & Lehre, “Self-adaptation of Mutation Rates in Non-elitist Populations”, 2016)
”lengler”: specific mutation rate chosen as a function of the dimension and iteration index.
crossover (bool) – whether to add a genetic crossover step every other iteration.
Notes
After many papers advocated the mutation rate 1/d for the discrete (1+1), it was proposed to use a randomly drawn mutation rate instead. Fast genetic algorithms are based on a similar idea. These two simple methods perform quite well on a wide range of problems.
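The discrete mutation described above (each variable flipped with probability 1/d) is the core of the discrete (1+1) strategy. A minimal stand-alone sketch on a bitstring objective, assuming a simple accept-if-not-worse rule rather than nevergrad's actual implementation:

```python
import random

def one_plus_one_discrete(loss, x, budget, rng):
    """Minimal discrete (1+1): flip each bit with probability 1/d,
    keep the child when it is at least as good as the parent."""
    d = len(x)
    best, best_loss = list(x), loss(x)
    for _ in range(budget):
        child = [1 - b if rng.random() < 1.0 / d else b for b in best]
        child_loss = loss(child)
        if child_loss <= best_loss:
            best, best_loss = child, child_loss
    return best, best_loss

# Minimize the number of ones in a 20-bit string, starting from all ones.
rng = random.Random(0)
best, best_loss = one_plus_one_discrete(sum, [1] * 20, budget=500, rng=rng)
print(best_loss)
```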

class
nevergrad.optimization.optimizerlib.
ParametrizedTBPSA
(*, naive: bool = True, initial_popsize: Optional[int] = None)¶ Test-based population-size adaptation. This method, based on adapting the population size, performs best on many noisy optimization problems, even in large dimension.
 Parameters
naive (bool) – set to False for noisy problems, so that the recommendation will be an average of the final population rather than the best point.
initial_popsize (Optional[int]) – initial (and minimal) population size (default: 4 x dimension)
Note
Derived from: Hellwig, Michael & Beyer, Hans-Georg (2016). Evolution under Strong Noise: A Self-Adaptive Evolution Strategy Reaches the Lower Performance Bound – the pcCMSA-ES. https://homepages.fhv.at/hgb/NewPapers/PPSN16_HB16.pdf

class
nevergrad.optimization.optimizerlib.
PolyCMA
(parametrization: Union[int, nevergrad.parametrization.core.Parameter], budget: Optional[int] = None, num_workers: int = 1)¶ Combines 20 identical CMA instances, with active selection at 1/3 of the budget.

class
nevergrad.optimization.optimizerlib.
Portfolio
(parametrization: Union[int, nevergrad.parametrization.core.Parameter], budget: Optional[int] = None, num_workers: int = 1)¶ Passive portfolio of CMA, 2pt DE and ScrHammersley.

class
nevergrad.optimization.optimizerlib.
SPSA
(parametrization: Union[int, nevergrad.parametrization.core.Parameter], budget: Optional[int] = None, num_workers: int = 1)¶ The first-order SPSA algorithm as shown in [1,2,3], with implementation details from [4,5].
https://en.wikipedia.org/wiki/Simultaneous_perturbation_stochastic_approximation
Spall, James C. “Multivariate stochastic approximation using a simultaneous perturbation gradient approximation.” IEEE Transactions on Automatic Control 37.3 (1992): 332-341.
Section 7.5.2 in “Introduction to Stochastic Search and Optimization: Estimation, Simulation and Control” by James C. Spall.
Pushpendre Rastogi, Jingyi Zhu, James C. Spall CISS (2016). Efficient implementation of Enhanced Adaptive Simultaneous Perturbation Algorithms.

no_parallelization
= True¶
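The two-evaluation gradient estimate at the heart of first-order SPSA can be sketched as follows (a stand-alone illustration of the estimator from Spall (1992), not this class's internals):

```python
import numpy as np

def spsa_gradient(f, x, c=1e-2, rng=None):
    """Two-evaluation SPSA gradient estimate: perturb all coordinates at
    once with a random +/-1 (Rademacher) vector delta, then compute
    g = (f(x + c*delta) - f(x - c*delta)) / (2*c*delta), elementwise."""
    if rng is None:
        rng = np.random.default_rng(0)
    delta = rng.choice([-1.0, 1.0], size=x.shape)
    return (f(x + c * delta) - f(x - c * delta)) / (2 * c) / delta

# On f(x) = sum(x**2) the true gradient is 2*x; a single estimate is noisy
# but unbiased, so averaging many estimates recovers it.
x = np.array([1.0, -2.0, 0.5])
g = spsa_gradient(lambda v: float(np.sum(v ** 2)), x)
print(g.shape)  # (3,)
```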

class
nevergrad.optimization.optimizerlib.
SQPCMA
(parametrization: Union[int, nevergrad.parametrization.core.Parameter], budget: Optional[int] = None, num_workers: int = 1)¶ Passive portfolio of CMA and many SQP.

class
nevergrad.optimization.optimizerlib.
Shiwa
(parametrization: Union[int, nevergrad.parametrization.core.Parameter], budget: Optional[int] = None, num_workers: int = 1)¶ Nevergrad optimizer by competence map. You might modify this one to design your own competence map.

class
nevergrad.optimization.optimizerlib.
SplitOptimizer
(parametrization: Union[int, nevergrad.parametrization.core.Parameter], budget: Optional[int] = None, num_workers: int = 1, num_optims: Optional[int] = None, num_vars: Optional[List[int]] = None, multivariate_optimizer: Union[ConfiguredOptimizer, Type[Optimizer]] = CMA, monovariate_optimizer: Union[ConfiguredOptimizer, Type[Optimizer]] = RandomSearch, progressive: bool = False, non_deterministic_descriptor: bool = True)¶ Combines optimizers, each of them working on their own variables.
 Parameters
num_optims (int or None) – number of optimizers
num_vars (list of int or None) – number of variables per optimizer.
progressive (bool) – True if we want to progressively add optimizers during the optimization run. If progressive = True, the optimizer is forced to OptimisticNoisyOnePlusOne.
Example
for 5 optimizers, each of them working on 2 variables, one can use:
opt = SplitOptimizer(parametrization=10, num_workers=3, num_optims=5, num_vars=[2, 2, 2, 2, 2])
or equivalently:
opt = SplitOptimizer(parametrization=10, num_workers=3, num_vars=[2, 2, 2, 2, 2])
Given that all optimizers have the same number of variables, one can also run:
opt = SplitOptimizer(parametrization=10, num_workers=3, num_optims=5)
Note
By default, it uses CMA for multivariate groups and RandomSearch for monovariate groups.
Caution
The variables refer to the deep representation used by optimizers. For example, a categorical variable with 5 possible values becomes 5 continuous variables.

class
nevergrad.optimization.optimizerlib.
TripleCMA
(parametrization: Union[int, nevergrad.parametrization.core.Parameter], budget: Optional[int] = None, num_workers: int = 1)¶ Combines 3 identical CMA instances, with active selection at 1/3 of the budget.

class
nevergrad.optimization.optimizerlib.
cGA
(parametrization: Union[int, nevergrad.parametrization.core.Parameter], budget: Optional[int] = None, num_workers: int = 1, arity: Optional[int] = None)¶ Compact Genetic Algorithm. A discrete optimization algorithm, often used as a first baseline.
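The compact GA keeps a probability vector over bit values instead of a population; each step samples two individuals from it and shifts the vector toward the winner. A minimal sketch on a bitstring objective (illustrative only, not nevergrad's implementation; all names are hypothetical):

```python
import random

def cga_onemax(d=10, popsize=20, budget=2000, seed=0):
    """Compact GA sketch: sample two bitstrings from a probability vector p,
    compare them, and shift p by 1/popsize toward the winner wherever
    the two strings disagree."""
    rng = random.Random(seed)
    p = [0.5] * d

    def loss(bits):  # minimize the number of zeros (i.e. OneMax)
        return d - sum(bits)

    for _ in range(budget):
        a = [int(rng.random() < pi) for pi in p]
        b = [int(rng.random() < pi) for pi in p]
        winner, loser = (a, b) if loss(a) <= loss(b) else (b, a)
        for i in range(d):
            if winner[i] != loser[i]:
                step = 1.0 / popsize
                p[i] += step if winner[i] == 1 else -step
                p[i] = min(1.0, max(0.0, p[i]))
    return p

p = cga_onemax()
print([round(pi, 2) for pi in p])  # most entries should be near 1.0
```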

nevergrad.optimization.optimizerlib.
learn_on_k_best
(archive: nevergrad.optimization.utils.Archive[nevergrad.optimization.utils.MultiValue], k: int) → Union[Tuple[float, …], List[float], numpy.ndarray]¶ Approximate optimum learnt from the k best.
 Parameters
archive (utils.Archive[utils.Value]) –