Alan J. Lockett's Research Pages

# Meta-Optimization and Optimal Optimizers

Once we have a formal definition of optimizer performance and a prior distribution over objective functions (a function prior $\mathbb{P}_F$), we can ask what the average performance of an optimizer is against a particular function prior. Mathematically, for a performance criterion $\phi$, this is $$\left<\mathcal{G},\mathbb{P}_F\right>_\phi = \mathbb{E}_{\mathbb{P}_F}\left[\phi\left(\mathcal{G},F\right)\right].$$ This quantity may itself be optimized over some set of optimizers $\mathfrak{G}$ to find the optimizer $\mathcal{G}$ with the best average performance on $\mathbb{P}_F$, i.e. $$\mathcal{G}_{\mathrm{opt}} = \mathrm{argmax}_{\mathcal{G}\in\mathfrak{G}}\,\,\left<\mathcal{G},\mathbb{P}_F\right>_\phi.$$ The problem of finding the best optimizer under a given set of assumptions about the uncertainty of the objective function is termed meta-optimization.
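The expectation $\left<\mathcal{G},\mathbb{P}_F\right>_\phi$ can be estimated by Monte Carlo sampling: draw objective functions from the prior and average the performance criterion over the draws. The sketch below illustrates this for a toy setup; the quadratic prior, the random-search optimizer, and the criterion $\phi$ (negated best objective value found) are all illustrative assumptions, not part of the formal framework above.

```python
import random

def estimate_performance(optimizer, sample_function, phi, n_samples=200, seed=0):
    """Monte Carlo estimate of <G, P_F>_phi = E_{P_F}[phi(G, F)]."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        f = sample_function(rng)   # draw F ~ P_F
        total += phi(optimizer, f)
    return total / n_samples

# Illustrative function prior: 1-D quadratics f(x) = (x - c)^2 with c ~ U(-1, 1).
def sample_quadratic(rng):
    c = rng.uniform(-1.0, 1.0)
    return lambda x: (x - c) ** 2

# Illustrative optimizer: random search over [-2, 2], returning its best point.
def random_search(f, budget=50, rng=None):
    rng = rng or random.Random(0)
    return min((rng.uniform(-2.0, 2.0) for _ in range(budget)), key=f)

# Illustrative criterion phi: negated best value found (higher is better).
def phi(optimizer, f):
    return -f(optimizer(f))

score = estimate_performance(random_search, sample_quadratic, phi)
```

With these choices the score is near zero from below, since random search comes close to the minimum of each sampled quadratic within the budget.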

## Parameter Meta-Optimization

The optimizer set $\mathfrak{G}$ might be a space of parameterized versions of an optimization method, in which case meta-optimization seeks to discover the best parameters for that method. For example, Differential Evolution (DE) has three parameters: the population size $K$, the crossover rate $CR$, and the scaling factor $F$. Meta-optimization over DE seeks to assign optimal values to these three parameters.
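Concretely, parameter meta-optimization can be done by evaluating each parameter setting on functions drawn from the prior and taking the argmax. The sketch below grid-searches a minimal DE/rand/1/bin over $(K, CR, F)$ against a toy prior of shifted sphere functions; the DE implementation, the prior, the grid, and the evaluation budget are all illustrative assumptions (e.g., for brevity the base indices in mutation are not forced to differ from the target index).

```python
import numpy as np

def differential_evolution(f, dim, K, CR, F, budget=500, seed=0):
    """Minimal DE/rand/1/bin sketch; returns the best objective value found."""
    rng = np.random.default_rng(seed)
    pop = rng.uniform(-5.0, 5.0, size=(K, dim))
    vals = np.array([f(x) for x in pop])
    evals = K
    while evals < budget:
        for i in range(K):
            a, b, c = pop[rng.choice(K, 3, replace=False)]
            mutant = a + F * (b - c)              # differential mutation
            cross = rng.random(dim) < CR          # binomial crossover mask
            cross[rng.integers(dim)] = True       # keep at least one mutant gene
            trial = np.where(cross, mutant, pop[i])
            tv = f(trial)
            evals += 1
            if tv < vals[i]:                      # greedy selection
                pop[i], vals[i] = trial, tv
            if evals >= budget:
                break
    return vals.min()

# Illustrative prior: 2-D sphere functions with random centers.
def sample_sphere(rng):
    c = rng.uniform(-3.0, 3.0, size=2)
    return lambda x: float(np.sum((x - c) ** 2))

def meta_score(K, CR, F, n_funcs=20):
    """Estimated <G_(K,CR,F), P_F>_phi with phi = negated best value found."""
    rng = np.random.default_rng(1)
    return -np.mean([differential_evolution(sample_sphere(rng), 2, K, CR, F)
                     for _ in range(n_funcs)])

grid = [(K, CR, F) for K in (10, 20) for CR in (0.5, 0.9) for F in (0.5, 0.8)]
best = max(grid, key=lambda p: meta_score(*p))
```

In practice one would replace the grid search with an outer optimizer and use a function prior representative of the target problem class.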

## Optimal Optimizers

Meta-optimization may also be considered in a more general setting. One might ask for the best possible trajectory-restricted optimizer, in which case $\mathfrak{G} = \mathcal{O}_{\mathrm{tr}}$. The result of meta-optimization is then an optimal optimizer in a general sense, i.e. an optimizer that performs at least as well as DE, CMA-ES, Nelder–Mead, or any other trajectory-restricted method when tested on the function prior $\mathbb{P}_F$.
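Although the full space $\mathcal{O}_{\mathrm{tr}}$ cannot be searched directly, the argmax over a finite candidate set of optimizers can be computed empirically. The sketch below compares two simple trajectory-restricted methods on a toy quadratic prior and selects the one with the better estimated average performance; both optimizers, the prior, and the budgets are illustrative assumptions.

```python
import random

# Candidate optimizer 1: pure random search; returns best value found.
def random_search(f, budget=100, rng=None):
    rng = rng or random.Random(0)
    return min(f(rng.uniform(-2.0, 2.0)) for _ in range(budget))

# Candidate optimizer 2: a simple (1+1)-style hill climber with Gaussian steps.
def hill_climb(f, budget=100, rng=None):
    rng = rng or random.Random(0)
    x = rng.uniform(-2.0, 2.0)
    best = f(x)
    for _ in range(budget - 1):
        y = x + rng.gauss(0.0, 0.3)
        fy = f(y)
        if fy < best:
            x, best = y, fy
    return best

# Illustrative prior: 1-D quadratics f(x) = (x - c)^2 with c ~ U(-1, 1).
def sample_prior(rng):
    c = rng.uniform(-1.0, 1.0)
    return lambda x: (x - c) ** 2

def mean_perf(opt, n=100):
    """Estimated <G, P_F>_phi with phi = negated best value (higher is better)."""
    rng = random.Random(42)
    return -sum(opt(sample_prior(rng)) for _ in range(n)) / n

candidates = {"random_search": random_search, "hill_climb": hill_climb}
best_name = max(candidates, key=lambda k: mean_perf(candidates[k]))
```

The empirical winner depends on the prior: a hill climber exploits smooth unimodal draws, while random search is insensitive to the function's structure.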