Abstract: We consider the classic Multi-Armed Bandit setting to understand the exploration/exploitation tradeoffs made by different search heuristics. Since many search heuristics work by comparing different options (in evolutionary algorithms called "individuals"; in the Bandit literature called "arms"), we work with the "Dueling Bandits" setting. In each iteration, a comparison between different arms can be made; in the binary stochastic setting, each arm has a fixed winning probability against any other arm. A Condorcet winner is any arm that beats every other arm with a probability strictly higher than $1/2$. We show that evolutionary algorithms are rather bad at identifying the Condorcet winner: Even if the Condorcet winner beats every other arm with a probability $1-p$, the (1+1) EA, in its stationary distribution, chooses the Condorcet winner only with constant probability if $p=\Omega(1/n)$. By contrast, we show that a simple EDA (based on the Max-Min Ant System with iteration-best update) will choose the Condorcet winner in its maintained distribution with probability $1-\Theta(p)$. As a remedy for the (1+1) EA, we show how repeated duels can significantly boost the probability of the Condorcet winner in the stationary distribution.
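The (1+1) EA in this dueling setting can be sketched as follows: keep a current arm, propose another arm, and keep whichever wins a single noisy duel. This is a minimal illustrative sketch, not the paper's exact algorithm; it assumes arm 0 is the Condorcet winner (winning each duel with probability $1-p$) and, hypothetically, that duels between two non-winner arms are decided by a fair coin.

```python
import random

def duel(i, j, p, winner=0):
    """Binary stochastic duel. Assumption for this sketch: arm `winner`
    (arm 0) is the Condorcet winner and beats any other arm with
    probability 1 - p; duels between two non-winner arms are a fair coin."""
    if i == winner:
        return i if random.random() < 1 - p else j
    if j == winner:
        return j if random.random() < 1 - p else i
    return i if random.random() < 0.5 else j

def one_plus_one_ea(n, p, iterations, rng=random):
    """(1+1) EA sketch on n arms: hold a current arm, propose a uniformly
    random arm as offspring, and keep the winner of one duel."""
    current = rng.randrange(n)
    for _ in range(iterations):
        offspring = rng.randrange(n)
        current = duel(current, offspring, p)
    return current
```

With $p=0$ the Condorcet winner, once reached, is never replaced; the abstract's point is that for $p=\Omega(1/n)$ the single noisy duel per iteration lets the winner be displaced often enough that its stationary probability stays bounded away from 1.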
Abstract: While most theoretical run time analyses of discrete randomized search heuristics provide bounds on the expected number of evaluations to find the global optimum, we consider the anytime performance of evolutionary and estimation-of-distribution algorithms. For this purpose, we analyze the fixed-target run time of various algorithms using BinVal as fitness function and bound the run time to optimize the most significant $k \in o(n)$ bits of a bit string of length $n$. We derive run time bounds that hold not only for a fixed $k$ but simultaneously for all $k \in o(n)$. For the standard (1+1) EA with fixed mutation rate $1/n$, we show that the fixed-target run time for all $k \in o(n)$ is in $\Theta(n \log k)$. Using an EDA instead, we get an expected number of evaluations of $\Theta(k \log n)$ for the sig-cGA. Replacing the fixed mutation rate of the standard (1+1) EA with a self-adjusting rate, we show that the fixed-target run time for $k \in o(n)$ and a constant $\varepsilon > 0$ arbitrarily close to zero is in $\mathcal{O}\left(k^{1+\varepsilon}\right)$ for this algorithm. In particular, this run time is independent of $n$, holds simultaneously for all $k \in o(n)$, and is close to the run time of $\Theta(k \log k)$ for the (1+1) EA with the best fixed mutation rate if $k$ is known.
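The fixed-target measurement described above can be sketched as follows: run the standard (1+1) EA with mutation rate $1/n$ on BinVal and count evaluations until the $k$ most significant bits are all set. This is a hedged sketch of the standard setup, not the paper's analysis; the function names are ours.

```python
import random

def binval(x):
    """BinVal: the bit string interpreted as a binary number, with x[0]
    the most significant bit, so improving BinVal fixes bits from the left."""
    value = 0
    for bit in x:
        value = 2 * value + bit
    return value

def one_plus_one_ea_fixed_target(n, k, rng=random):
    """Standard (1+1) EA with fixed mutation rate 1/n on BinVal.
    Returns the number of evaluations until the fixed target is hit:
    the k most significant bits are all 1."""
    x = [rng.randrange(2) for _ in range(n)]
    evaluations = 1
    while not all(x[:k]):
        # Flip each bit independently with probability 1/n (standard bit mutation).
        y = [b ^ (rng.random() < 1.0 / n) for b in x]
        evaluations += 1
        if binval(y) >= binval(x):  # elitist acceptance
            x = y
    return evaluations
```

Averaging this evaluation count over many runs (for various $k \in o(n)$) is the empirical counterpart of the $\Theta(n \log k)$ bound stated in the abstract.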