Evolving Evolvability in a Simple GA Setup

Andrés Moreira
Center for Mathematical Modeling and Departamento de Ingeniería Matemática,
FCFM, Universidad de Chile, Casilla 170/3-Correo 3, Santiago, Chile
anmoreir@dim.uchile.cl

Abstract
I consider the evolution of the evolvability in a changing environment, through a simple GA setup in which populations of individuals are selected and reproduced according to their performance on a variable fitness function. The relation between speed of change of that fitness function and the mutation rate chosen by the winning populations is explored.

The Evolution of Evolvability

Species (biological, digital or whatever) evolve, and the main responsible is (natural) selection. Features that vary between different individuals give them different degrees of fitness, and these in turn cause a bias in the generation of the population's offspring, which eventually leads to the spread of certain traits and a change in the species. A very important property of species, then, is their variability: on one hand, robustness is needed (to be viable, offspring must resemble the parents), and on the other, flexibility is needed if adaptation is wanted. The variability of a species will be a tradeoff between these two aspects; in the language of genetic algorithms, there is a tradeoff between the ability to exploit known candidate solutions, and the need to explore unknown regions of a fitness landscape.

How is this degree of variability selected? Variability is, clearly, a feature that is also subject to evolution. But there are two important differences between the evolution of evolvability and the evolution of other species' traits. First, evolvability is not so much a feature of the individual organisms, as it is a feature of the population: the degree of recombination, for instance, depends on the interactions between organisms of the species. After all, it is the "gene pool" of the species that will be rigid or flexible; the genome of individual organisms does not change. Second: the evolution of "usual" traits considers the current advantages of different versions of the organism, shifting the genome pool along the optimizing gradient in a fitness landscape (depending only on the "local" aspect of this landscape). Unlike these traits, evolvability must be tuned to global features, like the topology (ruggedness, for instance) of the landscape and its evolution will happen in a larger time-scale. Besides the topology, another important feature of the fitness landscape is its own variability: species do not, in general, adapt to a constant world, but to a changing one, and the variability of the species will be necessarily linked with the variability of the environment in which the species evolve.

Here we use simple genetic algorithms to explore the relation between the speed of change and the ruggedness of the environment and the evolved evolvability of a population. The "environment" will be a real function (which varies over time), and the populations will be usual GA ones.

Figure 1 The Drifting Fitness Function: To define the fitness function, we will use R points, regularly placed in [0,1] (thus, Xi=i/(R-1), for i=0, ... ,R-1). To each of them, we will assign a random value Yi in [0,1]. The fitness function, f:[0,1]® R, will be defined as the natural spline interpolation of these points (i.e., a piece-wise polynomial function which verifies f(Xi)=Yi for all i). As time goes by, all Yi will be allowed to drift at random, and f will change accordingly, as Figure 1 explains. The result of this construction is a "dancing polynomial", on which the "individuals" will have to climb. The ruggedness of this landscape will be determined by the number of points, R. The speed of change of the environment will be the speed of the random drift of the Yi, denoted by S. More precisely, S will be the number of "drifts" of each point per time step; it may be any positive number.

The first attempt: The first setup I tried failed; I describe it anyway, since it illustrates the point stated above about evolvability of a feature of populations and not of individuals. I considered a population of individuals, each of them consisting in three binary chromosomes, encoding real numbers xi, mi, ci. The first was the "candidate solution", the second was its mutation rate, and the third was a probability of mating. In each round, the fitness function changed (according to the speed S), the individuals mutated and mated their xi-chromosomes (according to their mi and ci), the fitness of all individuals was calculated, and then a new generation was Figure 2 produced, this time mutating and crossing the chromosomes of mi and ci. I wanted mi and ci to converge towards some value, but they did not. The explanation is simple: first, mating is a feature of populations (and selection here was acting on individuals) and second, higher mutation was better for individuals places near a maximum, and lower mutation was better for the rest. Since maxima were moving, there was nothing like a convergence of the mutation rate. "Good" solutions turned bad when the function changed, and then their mutation rates were already low, and "bad" individuals, with higher mutation rates, reached the new maxima faster. I still believe that there are interesting things to observe in this dynamics, but this will be an open problem for a while.

The final setup: There were two ways to go, after trying the first scheme. One was to drop the GA setup, and take some other kind of climbing (an approach like the so-called "simulated annealing", for instance) whose flexibility could be tuned to the change of the environment (in the same way described above). This is a path I still have to try.

The second direction, which I followed, was to divide the population into reproductively closed subpopulations, each of them with a mutation rate and a crossover rate. Selection was divided in two levels: first, there is a lower one, were populations evolve their members to maximize the changing fitness function (using their mutation and crossover rates; this is the SMALL GA). Then there is the higher level, operating on a larger time-scale, were the populations are judged for the average fitness they get, and a new "population of populations" is generated, with mutation and crossover rates generated from the previous ones through, again, mutation and recombination (the BIG GA). A third genetic operator was added (for a while), the "reverse" operator, which reverses the complete binary string . The new scheme is shown in figure 2. Figure 3

First exploration: Figure 3 shows the resulting values, over time, of the crossover, mutation and reversing rates; 3b is a close-up of the lower part of 3a. Two things can be noticed: first, that there is no definite convergence to a specific point of the 3-dimensional parameter space. Second, that there is a convergence towards some kind of surface in that space: in the small graph, we notice that mutation and reversion rates tend to compensate each other. In order to explore at least something in this space, two of the parameters Figure 4 were fixed (crossover and reversion rates), to see what was happening with the third (mutation rate). Figure 4 displays the mutation rate along a certain run. In this run, the speed of change in the environment (S) was increased at iteration 70; it is easy to see that the mutation rate did adjust after the change in S, and hence, there was some hope in studying this relation.

Figure 5 Results: Having fixed crossover rate, and eliminated reversion, the only quantity to be optimized by the GA was the mutation rate. In addition, I fixed the number of points in 7 (so, ruggedness was not an issue in the results presented here). For a given mutation rate, and a given speed of change of the "environment", there is some average value of the fitness. In other words, there is a relation of the form
< F > = g ( S , mut )
where mut is the mutation rate, S is the speed of the drift, and <F> is the average value of the fitness function; when we iterate the GA, we expect it to find mut(S), so as to maximize <F>. For a better understanding, I first looked at the form of the function "g" above, by fixing S or fixing mut, and seeing the relation between the remaining variable and <F>. Figure 5, for instance, shows the average fitness as a function of S, for four different values of the mutation probability. Notice the logarithmic scale for S, in the x-axis of this figure and the next.

Figure 6 We can see that, as expected, fitness decreases as S increases. In Figure 6 the curve for mut=0.1 is compared with the curves for other numbers of points (different ruggedness), and we can see the similarity between all of them (which was a reason for not worrying about R).

Figure 7 On the other hand, Figure 7 presents the average fitness as a function of the mutation probability mut, for a number of different values of the speed of change. In other words, this is the function that the big GA has to climb. As we see, fitness decreases with too low or too high mutation rates; the "aurea mediocritas" is located at a different point, depending on S. The increase of fitness for very high values of mut is related to some details of the implementation; mutation is done by flipping bits, rather than tossing coins, and this can be exploited at high rates. (That peak disappeared when the form of the mutation way was changed, but the changed had other inconvenient consequences and was discarded). The scale is logarithmic again in the x axis.

Finally, the GA was run for several values of S, and the resulting -evolved- mutation rates (and the associated fitnesses) are shown in Figure 8.

Discussion: The setup proved appropriate to observe the adaptation of a population's evolvability to the speed of change of its "environment". However, a number of issues of interest could not be addressed in the current form of the system. It would be interesting, for instance, to draw curves similar to the ones presented here, but for a fixed mutation probability and a varying -and evolving- crossover rate; this could not be done, since the effect of changes in the crossover rate turned out to be very small. Another interesting question was whether the system evolves towards a mutation rate that sets it "on the edge of chaos"; this was not the case, since there was no such edge: a forced increase of the mutation rate over the (evolved) optimal value had smooth consequences in the fitness of the population. Furthermore, the setup turned out to be very robust, and the qualitative results were the same when the fitness function was changed to be piecewise linear, when the jumps of the drift were made more violent, etc.

Conclusion: Not all the questions that could be stated could be answered, but the simplicity of the setup and the fact that the most elementary test did work, suggest that it may be useful to explore some other issues. Possible directions to go are changes in the function (Make it more abrupt? Try trigonometric functions?), changes in the reproduction of the populations (Sex? How good is it, depending on S?), etc. Demonstrative applets and any further work can be found at: http://www.dim.uchile.cl/~anmoreir/csss