10:00 - 11:00
Keynote: The Power of Tabulation Hashing
Mikkel Thorup (AT&T Labs-Research, USA)
Abstract
Randomized algorithms are often enjoyed for their simplicity,
but the hash functions used to yield the desired theoretical
guarantees are often neither simple nor practical. Here we show
that the simplest possible tabulation hashing provides
unexpectedly strong guarantees.
The scheme itself dates back to Carter and Wegman (STOC'77).
Keys are viewed as consisting of \(c\) characters. We initialize
\(c\) tables \(T_1, \dots, T_c\) mapping characters to random
hash codes. A key \(x=(x_1, \dots, x_c)\) is hashed to
\(T_1[x_1] \oplus \cdots \oplus T_c[x_c]\), where \(\oplus\)
denotes xor.
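For concreteness, the scheme can be sketched in a few lines of Python (the character width, hash width, and fixed seed below are illustrative choices, not part of the original formulation):

```python
import random

def make_simple_tabulation_hash(c=4, char_bits=8, hash_bits=32, seed=42):
    """Simple tabulation: one table of random hash codes per character
    position; a key hashes to the XOR of its table entries.
    (Table sizes and the fixed seed are illustrative choices.)"""
    rng = random.Random(seed)
    tables = [[rng.getrandbits(hash_bits) for _ in range(1 << char_bits)]
              for _ in range(c)]

    def h(key):
        value = 0
        for i in range(c):
            # Extract the i-th character (char_bits bits) of the key.
            char = (key >> (i * char_bits)) & ((1 << char_bits) - 1)
            value ^= tables[i][char]  # T_1[x_1] xor ... xor T_c[x_c]
        return value

    return h
```

Filling the tables is the only random step; evaluating the hash is c table lookups and c xors.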
While this scheme is not even 4-independent, we show that it
provides many of the guarantees that are normally obtained via
higher independence, e.g., Chernoff-type concentration, min-wise
hashing for estimating set intersection, and cuckoo
hashing.
We shall also discuss a twist to simple tabulation that leads to
extremely robust performance for linear probing with small buffers.

11:00 - 11:15
Coffee break



Session: Continuous Optimization and Information Geometry 
11:15 - 12:15
Tutorial: Basic Introduction to the Theoretical Analysis of Real-Valued Evolution Strategies
Jens Jägersküpper (German Aerospace Center, Germany)
Abstract
The basic search operator in classical real-valued evolutionary
algorithms, mainly evolution strategies (ES), is the so-called
Gaussian mutation. This tutorial focuses on a rigorous
understanding of this basic search operator. With this, lower
bounds on the (expected) number of steps (or mutations)
necessary to obtain a given reduction of the approximation error
(in the search space) can be proved. In addition to these lower
bounds, the tutorial covers some simple scenarios for which
upper bounds can be proved when a simple deterministic mechanism
for mutation adaptation is used, namely Rechenberg's 1/5-success
rule. Also, a system-analytical approach to the analysis of ES
that has been put forward by, e.g., Hans-Georg Beyer is
introduced. Finally, the differing aspects of these two
approaches are discussed.
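For readers new to the topic, a minimal sketch of the algorithm under analysis, a (1+1)-ES with the 1/5-success rule, might look as follows (the success window, adaptation factor, and sphere test function are illustrative assumptions):

```python
import random

def one_plus_one_es(f, x0, sigma=1.0, iters=500, window=20, seed=0):
    """Minimal (1+1)-ES with Rechenberg's 1/5-success rule: Gaussian
    mutation, elitist selection, and step-size adaptation based on the
    empirical success rate. (Window and factors are illustrative.)"""
    rng = random.Random(seed)
    x, fx = list(x0), f(x0)
    successes = 0
    for t in range(1, iters + 1):
        y = [xi + sigma * rng.gauss(0.0, 1.0) for xi in x]  # Gaussian mutation
        fy = f(y)
        if fy <= fx:                      # keep the better of parent/offspring
            x, fx = y, fy
            successes += 1
        if t % window == 0:               # 1/5-success rule: grow sigma if
            sigma *= 1.5 if successes / window > 0.2 else 1.0 / 1.5
            successes = 0                 # more than 1/5 succeeded, else shrink
    return x, fx
```

Running it on the sphere function, e.g. `one_plus_one_es(lambda v: sum(t*t for t in v), [2.0]*5)`, illustrates the kind of stochastic process the tutorial's bounds address.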

12:15 - 14:00
Lunch break (free sandwiches).

14:00 - 14:30
Information-Geometric Optimization: A New Viewpoint on Randomized Search Heuristics
Nikolaus Hansen (INRIA research centre Saclay, Île-de-France, France)
Joint work with
Yann Ollivier, Anne Auger and Ludovic Arnold
Abstract
Estimation of distribution algorithms (EDAs), also known as
probabilistic model-building algorithms, have given an
alternative viewpoint on randomized search heuristics: as the
evolution of a probabilistic model (a probability distribution)
over the iteration sequence. A governing idea of EDAs is to
provide an explicit probabilistic model of promising candidate
solutions. The model parameters are usually chosen to achieve
maximum likelihood, and new solutions are sampled from the
model.
More recently, a natural gradient ascent has been proposed for
updating the parameters of the multivariate normal distribution
in evolution strategies (natural evolution
strategies). Subsequently, it has been shown that CMA-ES also
conducts a natural gradient ascent on the manifold of
multivariate normal distributions. However, as for EDAs, the
idea of taking a natural gradient ascent on a probabilistic
model is independent of the underlying search domain. Only a
parametrized manifold of distributions is needed, and we denote
this kind of domain-independent randomized search algorithm as
information-geometric optimization (IGO).
In contrast to EDAs, which focus on the question of how to build a
proper probabilistic model, the focus in IGO turns to the question of
how to change the model in an appropriate way. If the model-update
step size goes to zero, information geometry gives a conclusive answer
to this question: a natural gradient ascent must be
taken.
Surprisingly, we find that besides CMA-ES also PBIL, a classical EDA,
is an instantiation of IGO. This suggests that there is a formulation
of randomized search where the natural gradient ascent and the
maximum likelihood notion actually coincide.
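To make the PBIL connection concrete, here is a minimal PBIL sketch on the OneMax function (population size, selection size, and learning rate are illustrative assumptions):

```python
import random

def pbil_onemax(n=20, pop=50, mu=10, lr=0.1, iters=200, seed=1):
    """Minimal PBIL: the probabilistic model is a vector of independent
    Bernoulli parameters, one per bit, moved toward the empirical bit
    frequencies of the best samples. (All constants are illustrative.)"""
    rng = random.Random(seed)
    p = [0.5] * n  # model parameters: P(bit i = 1)
    for _ in range(iters):
        # Sample pop candidate solutions from the current model.
        samples = [[1 if rng.random() < p[i] else 0 for i in range(n)]
                   for _ in range(pop)]
        samples.sort(key=sum, reverse=True)  # OneMax fitness = number of ones
        for i in range(n):
            # Maximum-likelihood estimate from the mu best samples ...
            freq = sum(s[i] for s in samples[:mu]) / mu
            # ... blended into the model with learning rate lr.
            p[i] = (1 - lr) * p[i] + lr * freq
    return p
```

The blending step is exactly the kind of model update that the IGO framework reinterprets as a natural gradient ascent.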

14:30 - 15:00
Rank-\(\mu\) Update CMA-ES from the Viewpoint of Information Geometry
Youhei Akimoto (TAO Team, INRIA Saclay Île-de-France, France)
Abstract
The covariance matrix adaptation evolution strategy (CMA-ES) is
a stochastic, derivative-free search algorithm for solving
continuous optimization problems. One of the simple variants of
CMA-ES is the so-called rank-\(\mu\) update CMA-ES [3], which
repeats the following steps: sample \(\lambda\) independent
points from the multivariate normal distribution, then update the
mean vector and the covariance matrix according to the fitness
values (objective function values) of the sampled points.
Recent studies [1, 2] show that the rank-\(\mu\) update CMA-ES ascends the
expectation of the fitness function over the sampling distribution along the
natural gradient estimated from the sampled points. The natural gradient
view of CMA-ES suggests a way of theoretically investigating CMA-ES.
From this viewpoint, one interesting theoretical question about
CMA-ES concerns the learning rates for the updates of the mean vector
and covariance matrix of the sampling distribution. When the
natural gradient can be estimated sufficiently well, an
infinitesimal step along the natural gradient must lead to an
improvement in the expected fitness. An interesting question then
arises as to how large a step can still improve the
expected fitness.
In this work we analyze CMA-ES as a natural gradient method
on the expected fitness function. We prove monotone increase of
the expected fitness within a certain range of learning rates
when the exact natural gradient is given. Then, we show a
difference between CMA-ES and EMNA, a variant of
estimation of distribution algorithms that is often compared with
CMA-ES because of the algorithmic similarity, from the point
of view of information geometry.
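The sampling and rank-\(\mu\) update steps described above can be sketched as follows (step size held fixed, equal recombination weights, sphere objective; all constants are illustrative simplifications of the full CMA-ES):

```python
import numpy as np

def rank_mu_cmaes_sphere(n=5, lam=20, mu=10, c_mu=0.3, sigma=0.3,
                         iters=100, seed=0):
    """Bare-bones rank-mu update loop on the sphere function.
    The step size sigma is held fixed and the weights are equal;
    both are simplifications, and all constants are illustrative."""
    rng = np.random.default_rng(seed)
    m = np.ones(n)               # mean vector of the sampling distribution
    C = np.eye(n)                # covariance matrix
    w = np.ones(mu) / mu         # equal recombination weights
    for _ in range(iters):
        A = np.linalg.cholesky(C)
        y = rng.standard_normal((lam, n)) @ A.T      # lam steps ~ N(0, C)
        x = m + sigma * y                            # candidate points
        order = np.argsort((x ** 2).sum(axis=1))     # rank by sphere fitness
        yb = y[order[:mu]]                           # the mu best steps
        m = m + sigma * (w @ yb)                     # mean update
        C = (1 - c_mu) * C + c_mu * (yb.T * w) @ yb  # rank-mu covariance update
    return m
```

The last line is the rank-\(\mu\) update itself: a convex combination of the old covariance matrix and the weighted outer products of the selected steps, so \(c_\mu\) plays the role of the covariance learning rate discussed above.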

[1] Y. Akimoto, Y. Nagata, I. Ono, and S. Kobayashi. Bidirectional
relation between CMA evolution strategies and natural
evolution strategies. In Parallel Problem Solving from Nature
- PPSN XI, pages 154-163 (2010)

[2] T. Glasmachers, T. Schaul, S. Yi, D. Wierstra, and
J. Schmidhuber. Exponential natural evolution strategies. In
Proceedings of the Genetic and Evolutionary Computation
Conference, GECCO'10, pages 393-400 (2010)

[3] N. Hansen, S. D. Müller, and P. Koumoutsakos. Reducing
the time complexity of the derandomized evolution strategy
with covariance matrix adaptation (CMA-ES). Evolutionary
Computation, 11(1):1-18 (2003)

15:00 - 15:30
Coffee break.



Session: Continuous Optimization and Open Problems 
15:30 - 16:00
Evolution Strategies in Linearly Constrained Domains Using Efficient Gibbs Sampling
Christian L. Müller (Department of Computer Science, ETH Zurich, Switzerland)
Abstract
We propose to combine state-of-the-art continuous black-box optimization
heuristics that use the multivariate normal distribution as a
search operator with an efficient Gibbs sampler [1] for the
truncated normal distribution when the search domain is subject
to linear inequality constraints [2]. This synthesis provides a
generic way to do constrained continuous black-box optimization
because the optimizer is guaranteed to operate only on the
feasible domain. No problem- or domain-specific
constraint-handling techniques thus need to be developed for the
given optimization task. The proposed sampler works for normal
distributions with arbitrary mean and covariance structure and for
any number of linear constraints that form a nonempty
domain. Using the Gibbs sampling methodology, the computational
complexity of generating constrained samples is poly(\(n\)). As a
proof of concept we couple the sampler with two state-of-the-art
search heuristics: (i) the Evolution Strategy with Covariance
Matrix Adaptation [3] and (ii) Gaussian Adaptation [4, 5]. We
present numerical examples that show the efficacy and efficiency of
our approach on the tangent problem from the field of
evolutionary computation [6] and on linear programs over certain
types of Klee-Minty cubes [7].
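As a toy illustration of the underlying idea (not the paper's sampler, which handles arbitrary mean, covariance, and linear constraints), a Gibbs sampler for a standard bivariate normal truncated to the nonnegative orthant can be written as:

```python
import random
from statistics import NormalDist

def gibbs_truncated_bivariate(rho=0.5, lo=0.0, iters=5000, seed=3):
    """Gibbs sampler for a standard bivariate normal with correlation
    rho, truncated to the linear constraints x >= lo and y >= lo.
    Each full conditional is a 1-D truncated normal, drawn exactly by
    inverting its CDF. (A 2-D toy; all constants are illustrative.)"""
    rng = random.Random(seed)
    sd = (1.0 - rho * rho) ** 0.5
    x = y = lo + 1.0                       # any feasible starting point

    def draw(mean):
        # Inverse-CDF sample from N(mean, sd^2) truncated to [lo, inf).
        d = NormalDist(mean, sd)
        u = d.cdf(lo) + rng.random() * (1.0 - d.cdf(lo))
        return d.inv_cdf(u)

    samples = []
    for _ in range(iters):
        x = draw(rho * y)                  # x | y ~ N(rho*y, 1 - rho^2)
        y = draw(rho * x)                  # y | x ~ N(rho*x, 1 - rho^2)
        samples.append((x, y))
    return samples
```

Every iterate is feasible by construction, which is exactly the property that lets the coupled optimizer ignore constraint handling.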
[1] Stuart Geman and Donald Geman. Stochastic Relaxation, Gibbs Distributions, and the Bayesian Restoration
of Images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 6(6):721-741, 1984.
[2] Gabriel Rodriguez-Yam, Richard A. Davis, and Louis L. Scharf. Efficient Gibbs Sampling of Truncated
Multivariate Normal with Application to Constrained Linear Regression. Submitted, 2004.
[3] Nikolaus Hansen. The CMA Evolution Strategy: A Tutorial, Nov 2007.
[4] Gregor Kjellström and Lars Taxén. Stochastic Optimization in System Design. IEEE Trans. Circ. and Syst.,
28(7), July 1981.
[5] Christian L. Müller and Ivo F. Sbalzarini. Gaussian adaptation revisited: an entropic view on covariance matrix
adaptation. In C. Di Chio et al., editors, EvoApplications, volume I of Lecture Notes in Computer Science, pages
432-441. Springer, 2010.
[6] Oliver Kramer. A Review of Constraint-Handling Techniques for Evolution Strategies. Applied Computational
Intelligence and Soft Computing, Vol. 2010, 2010.
[7] Tomonari Kitahara and Shinji Mizuno. Klee-Minty's LP and upper bounds for Dantzig's simplex method.
Operations Research Letters, 39(2):88-91, 2011.

16:00 - 16:30
Random derivative-free optimization of convex functions using a line search oracle
Sebastian U. Stich (Department of Computer Science, ETH Zurich, Switzerland)
Christian L. Müller (Department of Computer Science, ETH Zurich, Switzerland)
Bernd Gärtner (Department of Computer Science, ETH Zurich, Switzerland)
Abstract
We consider the general problem of finding the
minimum of a convex function \(f:\mathbb{R}^n\rightarrow
\mathbb{R}\). For first-order methods, which evaluate the gradients
or subgradients of the function \(f\), lower complexity bounds
and optimal methods are well known and studied; see Nesterov
[1]. In the field of derivative-free methods, only a few
rigorous performance bounds are known or
proven. In 2011, Nesterov [2] presented several randomized
zeroth-order methods for this optimization problem in the
cases where \(f\) is nonsmooth, smooth, or strongly
convex. Nesterov showed complexity bounds for these methods that
differ only by a factor of \(n\) iterations from the optimal
gradient methods. The search directions of these schemes are
normally distributed Gaussian random vectors. The key technique
in Nesterov's methods is to estimate directional derivatives of
the function by finite differences. Classical evolution
strategies, such as the \((1+1)\)-evolution strategy (ES), adapt
a global step size in order to maintain a fixed acceptance rate.
Jägersküpper [3] showed that the \((1+1)\)-ES with the
1/5-rule achieves a complexity bound that differs by a factor of
\(n\) from the standard gradient methods on strongly convex
quadratic functions.
We present a new algorithm, Random
Pursuit, that works by iteratively computing approximate
solutions to our convex optimization problem. In each iteration,
a new search direction \(u\) is selected uniformly at random
from all unit-length vectors. Then the next iterate is
calculated as \(x_\text{new} = x_\text{old} + \arg\min_{h\in
\mathbb{R}} f(x_\text{old} + hu) \cdot u\). We prove complexity
bounds for this method, assuming the line search can be solved
efficiently. If the objective function has a Lipschitz-continuous
gradient, the expected rate of convergence of the algorithm is
of the order \(O(n/k)\), where \(k\) is the number of iterations
and \(n\) the dimension. If in addition the function is strongly
convex, we have a global linear rate of convergence, i.e., the
expected error in the function value decreases by a factor of
\((1 - (\kappa n)^{-1})\) in each iteration, where \(\kappa\)
is the condition number of the function. These rates are
achieved by the algorithm even if the Lipschitz and strong
convexity parameters of the function are not known to the
algorithm. The algorithm and also the proof idea were inspired
by recent work of Kleiner et al. [4], who presented an
iterative optimization method for conic optimization problems.
Their method generates a new iterate by optimizing over random
two-dimensional subcones.
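The iteration just described can be sketched directly (the golden-section routine below stands in for the line search oracle, and its bracketing interval is an illustrative assumption):

```python
import math
import random

def random_pursuit(f, x0, iters=200, line_search=None, seed=0):
    """Sketch of Random Pursuit: pick a uniformly random unit direction,
    run a line search along it, and step to the one-dimensional minimizer."""
    rng = random.Random(seed)
    n = len(x0)
    x = list(x0)

    def random_unit_direction():
        # A standard normal vector, normalized, is uniform on the sphere.
        u = [rng.gauss(0.0, 1.0) for _ in range(n)]
        norm = math.sqrt(sum(ui * ui for ui in u))
        return [ui / norm for ui in u]

    def golden_section(phi, a=-10.0, b=10.0, tol=1e-8):
        # Simple stand-in line-search oracle for unimodal (convex) phi;
        # the bracket [a, b] is an illustrative assumption.
        g = (math.sqrt(5) - 1) / 2
        c, d = b - g * (b - a), a + g * (b - a)
        while b - a > tol:
            if phi(c) < phi(d):
                b, d = d, c
                c = b - g * (b - a)
            else:
                a, c = c, d
                d = a + g * (b - a)
        return (a + b) / 2

    search = line_search or golden_section
    for _ in range(iters):
        u = random_unit_direction()
        h = search(lambda t: f([xi + t * ui for xi, ui in zip(x, u)]))
        x = [xi + h * ui for xi, ui in zip(x, u)]  # x_new = x_old + h*u
    return x
```

On a strongly convex function such as the sphere, each exact line search removes the component of the error along \(u\), which is where the expected per-step contraction comes from.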
Our bounds show that our zeroth-order
algorithm needs \(n\) times more iterations than the standard
gradient method. It remains an open question whether an
accelerated version of this algorithm, achieving the complexity
bounds of the fast gradient methods, also exists.
[1] Y. Nesterov. Introductory Lectures on Convex Optimization. Kluwer, Boston, 2004.
[2] Y. Nesterov. Random Gradient-Free Minimization of Convex Functions. Technical report, ECORE, 2011.
[3] J. Jägersküpper. Rigorous runtime analysis of the (1+1) ES: \(1/5\)-rule and ellipsoidal fitness landscapes.
In A. Wright, M. Vose, K. De Jong, and L. Schmitt, editors, Foundations of Genetic Algorithms,
volume 3469 of Lecture Notes in Computer Science, pages 356-361. Springer Berlin / Heidelberg, 2005.
10.1007/11513575_14.
[4] A. Kleiner, A. Rahimi, and M. I. Jordan. Random conic pursuit for semidefinite programming. In Neural
Information Processing Systems, 2010.

16:30 - 17:00
An Open Question about Random Walks Pertaining to MCMC Sampling and the Coupling Method.
Boris Mitavskiy (Department of Computer Science, Aberystwyth University, UK)
Jonathan Rowe (School of Computer Science, University of Birmingham, UK)
Chris Cannings (School of Mathematics and Statistics, University of Sheffield, UK)
Abstract
Markov chains having symmetric transition matrices have
frequently been used to sample various combinatorial objects close to
uniformly at random. The notion of "closeness to uniform" is
defined in terms of the total variation distance from the
stationary (here uniform) distribution and the mixing
time. The Markov chain coupling method is frequently used to analyze
the mixing time. In this talk, I will introduce a very simple
irreducible Markov chain with a symmetric transition matrix on
labeled connected graphs over a specified set of nodes having a
specified number of edges. I will also introduce a simple and
natural coupling construction and demonstrate the difficulty of
using it in the classical manner. I will then present a rather
general theorem showing that it is sufficient to couple up to an
equivalence relation (i.e., when the two copies of the chain
produce a pair of equivalent but not necessarily equal states,
then the pair induced by the joint process in the next time step
stays equivalent, but not necessarily equal, with
probability one), in this case up to graph isomorphism. Once
the expected time for the joint process to couple up to
equivalence is known, one can simply sample uniformly at random
from the equivalence classes to obtain the desired outcome. It
is very easy to sample graphs uniformly from a specified
isomorphism class (just randomly relabel the nodes). This new
method allows one to bypass the technical difficulty arising when
using the classical coupling. Nonetheless, there is
unfortunately yet another trap, namely a rather complicated
random walk on a graph with changing transition probabilities. I
will introduce this problem in the hope that someone in the
audience may shed some light on how to estimate the desired
expected meeting time.
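As a hypothetical illustration only (the talk's exact chain may differ), one natural symmetric chain on connected labeled graphs with a fixed number of edges proposes swapping a uniformly random present edge for a uniformly random absent one and rejects moves that disconnect the graph; since the proposal is symmetric and rejection preserves symmetry, the transition matrix is symmetric and the uniform distribution is stationary:

```python
import random

def connected(nodes, edges):
    """Depth-first check that the edge set connects all nodes."""
    adj = {u: set() for u in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    seen, stack = {nodes[0]}, [nodes[0]]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(nodes)

def edge_swap_step(nodes, edges, rng):
    """One step of a (hypothetical) symmetric chain on connected labeled
    graphs with a fixed edge count: propose replacing one present edge
    by one absent edge, rejecting proposals that disconnect the graph.
    Pairs are ordered (u, v) with u < v in node order."""
    pairs = [(u, v) for i, u in enumerate(nodes) for v in nodes[i + 1:]]
    absent = [p for p in pairs if p not in edges]
    out = rng.choice(sorted(edges))     # uniformly random present edge
    new = rng.choice(absent)            # uniformly random absent edge
    candidate = (edges - {out}) | {new}
    return candidate if connected(nodes, candidate) else edges
```

Running two copies of such a chain and trying to make them meet (here, only up to isomorphism) is exactly the setting in which the coupling difficulty described above arises.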

17:00 - 17:15
Closing
Christian Igel
Per Kristian Lehre
Carsten Witt
