My research focuses on efficient optimization in complex spaces. Primarily, I am interested in Cognitive Robotics, or more simply, robot brains. I am working on general-purpose robotic controllers that will enable the creation of flexible robotic systems that are capable of responding dynamically to complex enviroments. In the future, such systems will guide high-precision robotic surgeons, route flying cars safely through urban areas, and manage life-support systems in space. Or perhaps they will just brew the perfect cup of espresso. In any case, to build more capable robots, better techniques for discovering dynamic high-level controllers are needed. And finding such controllers requires one to search and organize infinite-dimensional spaces using more than just the local information provided by gradients. Success in this search requires optimization beyond Newtonian principles.
The mathmatical space consisting of all optimization methods is closed, convex subset of a vector space. This surprising conclusion means that between any two optimization methods, there is an entire spectrum of optimizers that blend them, and it is highly possible, even likely, that the best performing optimizers lie somewhere within this spectrum rather than at the endpoints. Using the tools of modern functional analysis, it is possible to study these and other topics formally, leading to insights about the nature of optimization methods and their performance and even making it possible to state clear limits on the applicability of the No Free Lunch theorems. This line of research provides a formal framework for assessing the quality and success cases for non-gradient methods.
Given a particular random source of optimization problems, what is the most efficient approach to discovering the optimum? For example, how might one build a robot that efficiently learns to perform new manufacturing tasks? If the uncertainty surrounding a problem class is well understood, then it is possible to pose these questions formally for a given metric of performance. In fact, the problem classes can be very general without falling afoul of the No Free Lunch theorems, much more general than has previously been supposed. I conjecture that the optimal optimizer should optimize the conditional expectation of the performance metric at each step. This process may be thought of as Information Maximization, related to Shannon's Information Theory and to martingales in the theory of Stochastic Processes. In this research, I am looking for a proof of this conjecture or something similar to it.
Even if it is not possible to solve for the optimal optimizer, it may still possible to use the functional structure of the problem in order to optimize the configuration parameters of specific global optimization methods. This topic, called meta-optimization, tries to discover the best way to use existing optimization methods.
In line with my ideas on information maximization, I am developing optimization methods that explicitly seek to gather and retain as much information as possible about the problem being optimized in order to explore the most promising regions of the search domain. Such optimizers rely on martingales in order to search complex domains efficiently. Evolutionary annealing uses a mixture sieve model to sample the annealing distributions typically associated with simulated annealing. Its structure mirrors that of an evolutionary algorithm, and formally it is in fact and evolutionary algorithm, but with a much more mathematical flavor. Neuroannealing applies evolutionary annealing to search for neural network controllers. Experimentally, it can solve more complex tasks than some other popular methods for training neural network controllers, but at the cost of some extra overhead for more thorough searching. Additionally, I am working on optimizing over Brownian trajectories using a martingale method bsased directly on my Information Maximization conjecture. The main idea for the future of this research is that one should express the principles of uncertainty governing a class of problems one desires to solve, and then encode these principles inside of an algorithm that uses its observations to build increasingly accurate models of the task until a suitable solution is achieved.