Changes introduced in Camellia

Jany Belluz

This page lists all the improvements brought by the latest version of µGP³, Camellia. You can benefit from these new features by updating your project configuration files and modifying specific settings to activate the new algorithms, as described on this page.

TL;DR: How to use an existing project with µGP³ Camellia

  1. Download the source distribution of µGP³ Camellia
  2. Make sure you have the xsltproc command on your system
  3. Update your population configuration:

    /path/to/Camellia/Contrib/XMLUpdater/update_Palmtree_to_Camellia.sh your.population.xml
    
  4. Run µGP³ as usual, from the base directory of your project

  5. (optional) Modify your configuration files to activate new algorithms.

Codebase improvements

Standardization of concepts

Thanks to a reworked class hierarchy, all the advanced concepts present in µGP³, such as clone detection and scaling, aging, elitism, zombies, fitness sharing, delta entropy, and selection schemes, can now be applied to all kinds of populations in a uniform manner.

Better hash computation

µGP³ uses a hash function for two purposes: the hash values of single nodes are used to compute the entropy of individuals, and the hash values of whole individuals are used to speed up clone detection. The problem is that the requirements for these two applications are different. Entropic hashes rely on real value quantization to extract meaningful information from floating point parameters, but this quantization step creates a lot of collisions in the hash values, which make them useless for clone detection. In the new version, we refactored the hashing code in order to compute two different hash values for individuals and nodes, depending on the intended use of the results.
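
The distinction can be illustrated with a small sketch (illustrative Python, not the actual µGP³ code; all names are assumptions): quantization makes near-equal real parameters map to the same entropic symbol, while clone-detection hashes keep the exact values apart.

```python
# Sketch of the dual-hash idea: entropic node hashes quantize real
# parameters, clone-detection hashes keep exact values.

def entropic_node_hash(name, params, step=0.01):
    """Hash for entropy: floats are quantized so near-equal values collide."""
    quantized = tuple(round(p / step) if isinstance(p, float) else p
                      for p in params)
    return hash((name, quantized))

def clone_hash(nodes):
    """Hash for clone detection: exact parameter values, no quantization."""
    return hash(tuple((name, tuple(params)) for name, params in nodes))

ind_a = [("add", [1.0001]), ("mul", [2.0])]
ind_b = [("add", [1.0002]), ("mul", [2.0])]

# Near-identical floats map to the same entropic symbol...
assert entropic_node_hash(*ind_a[0]) == entropic_node_hash(*ind_b[0])
# ...but the clone-detection hashes of the whole individuals still differ.
assert clone_hash(ind_a) != clone_hash(ind_b)
```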

New evaluator with caching and Lua integration

See [Evaluator] and [Lua Evaluator] for details.

Benchmarking tool

See [Benchmarking] for more information.

Group Evolution

Zombies

The best group is always kept in the population and will be output by the algorithm, even if it died during the evolution, thanks to timely zombification.

Population initialization

The population setup can be controlled using the nu and groupNu parameters. µGP will first create nu random individuals, then build groupNu random groups using these individuals. The previous parameters minGroups and maxGroups are deprecated and must be replaced by groupNu and groupMu.
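
Assuming these parameters are written with the same value-attribute style as the other population settings shown on this page (the exact element syntax is an assumption; check your updated configuration file), the setup could look like:

```xml
<!-- illustrative: create 30 random individuals, then 10 random groups -->
<nu value="30" />
<groupNu value="10" />
<groupMu value="10" /> <!-- replaces the deprecated maxGroups -->
```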

Individuals without group

The value of mu has been given a meaning in group evolution: the group population always keeps all individuals that belong to at least one group, and now also keeps mu individuals without group. You can deactivate this feature and keep only individuals that have at least one group by setting mu to zero.

generateSonsGroups

This function places individuals generated by genetic operators into new groups. When it has to select which groups to insert the individuals into, it now uses the selection function defined in the population configuration file (instead of a hard-coded tournament selection). It also handles orphans by inserting them into existing groups or by creating a group especially for them.

And now, even if generateSonsGroups does not manage to create a good group for one of the individuals, this orphan might be kept in the population anyway as part of the mu individuals without group.

Group entropy

We added a definition of entropy and delta entropy for groups, based on the aggregation of the entropic symbols of their individuals. Thus a group that contains an individual with unique traits will have a higher delta entropy.
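
To make the idea concrete, here is a hedged sketch (illustrative Python; the function names and the exact aggregation used by µGP³ are assumptions): a group carrying a rare symbol increases the population entropy more than a group made of frequent symbols.

```python
import math
from collections import Counter

def entropy(symbol_counts):
    """Shannon entropy of a multiset of entropic symbols."""
    total = sum(symbol_counts.values())
    return -sum((n / total) * math.log2(n / total)
                for n in symbol_counts.values())

def group_symbols(group):
    """group: list of individuals, each a list of entropic node symbols."""
    return Counter(sym for individual in group for sym in individual)

rest = Counter({"add": 5, "mul": 5})    # symbols of the rest of the population
common = [["add"], ["mul"]]             # group made of frequent symbols
rare = [["add"], ["neg"]]               # group holding a unique trait

delta_common = entropy(rest + group_symbols(common)) - entropy(rest)
delta_rare = entropy(rest + group_symbols(rare)) - entropy(rest)
assert delta_rare > delta_common        # the unique trait raises delta entropy
```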

Fitness sharing

Fitness sharing is now available during group evolution, for individuals using

  <fitnessSharing distance="hamming" radius="32" />

and for groups using

  <groupFitnessSharing radius="50" />

Fitness sharing for groups can only be performed using the entropic distance.
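
As an illustration of the general mechanism (a minimal Python sketch, not the µGP³ implementation; the names and the exact sharing formula are assumptions), fitness sharing divides each candidate's fitness by the number of neighbors within the given radius:

```python
def hamming(a, b):
    """Number of positions at which two genomes differ."""
    return sum(x != y for x, y in zip(a, b))

def shared_fitness(population, radius, distance=hamming):
    """population: list of (genome, fitness); returns scaled fitness values."""
    shared = []
    for g, f in population:
        # Niche count: candidates closer than `radius` (including itself).
        niche = sum(1 for g2, _ in population if distance(g, g2) < radius)
        shared.append(f / niche)
    return shared

pop = [("AAAA", 10.0), ("AAAB", 10.0), ("ZZZZ", 10.0)]
print(shared_fitness(pop, radius=2))  # [5.0, 5.0, 10.0]: near-clones share
```

The unique candidate keeps its full fitness, while the two near-clones split theirs, which preserves diversity in the population.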

Individual scaling depending on group contributions

A new scaling mechanism has been introduced in group evolution: contribution-based scaling. The goal is to augment the fitness of individuals that participate in good groups. In order to do that, we compute the following scaling factor for each fitness coefficient:

  • Let lower and upper be the lower and upper bounds of the given fitness coefficient in the whole group population
  • If lower == upper we don't perform any scaling for this coefficient
  • If lower < upper: for each individual i
    • Find the best group rawBest to which i belongs, by comparing the raw fitness
    • Let max be the value of the current fitness coefficient in the raw fitness of rawBest
    • The scaling factor is then given by the formula:
      exp(kappa * (max - lower) / (upper - lower))

The resulting scaling factor is in the range [1, exp(kappa)], where kappa can be configured using the new individualContributionScalingFactor population parameter. In order to disable the scaling altogether, set kappa to zero (the default value). Higher values lead to a more pronounced scaling.

  <individualContributionScalingFactor value="1" />

For individuals that belong to no group, the scaling stays at 1. Individuals that belong only to weak groups get a scaling near 1, and the individuals of the best group get the highest possible scaling exp(kappa) on the coefficient that made it the best.
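
The computation of the scaling factor can be sketched as follows (illustrative Python; the function names are assumptions, the formula is the one given above):

```python
import math

def contribution_scaling(best_group_value, lower, upper, kappa):
    """Scaling factor for one individual on one fitness coefficient.

    best_group_value: this coefficient in the raw fitness of the best group
    the individual belongs to (use `lower` for individuals with no group).
    lower, upper: bounds of this coefficient over the whole group population.
    """
    if lower == upper or kappa == 0:
        return 1.0  # no scaling possible, or scaling disabled
    return math.exp(kappa * (best_group_value - lower) / (upper - lower))

# An individual in the best group gets exp(kappa), one in the worst gets 1.
print(contribution_scaling(8.0, lower=2.0, upper=8.0, kappa=1.0))  # e ≈ 2.718
print(contribution_scaling(2.0, lower=2.0, upper=8.0, kappa=1.0))  # 1.0
```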

Two-step tournament selection

We introduced a new selection mechanism designed to preserve diversity when selecting groups of candidates. This two-step tournament works like a normal tournament with fitness hole when selecting only one individual, but changes its behavior when selecting a group. In that case, it follows this algorithm:

  • Select one candidate c using fitness-only tournament selection, let selected = {c}
  • For each subsequent candidate to select:
    • Select tau candidates cand = {c_1, ..., c_tau} using fitness-only tournaments
    • Select the candidate that is furthest away from all already selected candidates, i.e.

      argmax_{c \in cand} (sum_{s \in selected} distance(c, s) / card(selected))

      and put it in selected.

Every new candidate is supposed to be good (because of the fitness-based tournament selection) and the group as a whole should have a high genetic diversity, since the distance function that we maximize is the entropic distance.
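
The algorithm above can be sketched as follows (illustrative Python; `tournament` stands in for the fitness-only tournament with fitness hole, and all names are assumptions):

```python
import random

def tournament(population, fitness, size=2):
    """Stand-in for a fitness-only tournament selection."""
    return max(random.sample(population, size), key=fitness)

def two_step_selection(population, n, tau, fitness, distance):
    """Select n candidates that are both fit and mutually distant."""
    selected = [tournament(population, fitness)]
    while len(selected) < n:
        # Step 1: pick tau good candidates with fitness-only tournaments.
        cand = [tournament(population, fitness) for _ in range(tau)]
        # Step 2: keep the one with the highest mean distance to `selected`.
        best = max(cand, key=lambda c: sum(distance(c, s) for s in selected)
                                       / len(selected))
        selected.append(best)
    return selected

group = two_step_selection(list(range(10)), n=3, tau=2,
                           fitness=lambda x: x,
                           distance=lambda a, b: abs(a - b))
```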

During later stages of evolution, when individuals are concentrated around a few local optima, this operator will tend to team up one individual from each peak. While this is usually a bad way of selecting parents for reproduction[1], it looks like a great method for building groups. That's why this new selection technique is used internally by the groupDreamTeam operator, regardless of the selection mechanism chosen in other contexts.

Genetic and group operators

Operator selection with DMAB and Russian roulette

See the main article about DMAB in µGP³ for more details. The DMAB was already present in previous versions of µGP³, albeit not giving satisfactory results. Its implementation has been modified in several ways:

  • Genetic operators and group operators are treated equally, which means that the DMAB algorithm should be able to focus on group or individual evolution automatically depending on what leads to the highest fitness improvement.
  • Operator selection during the generation is now performed using Russian roulette, in order to get a fairer selection when we apply several operators in a row without updating the rewards.
  • A Page-Hinkley reset is now properly applied to all operators, not only the one that triggered it.
  • Operators that fail repeatedly because they are not adapted to the current problem will be pseudo-deactivated until they prove themselves useful by producing at least one valid child. While pseudo-deactivated, an operator will be called at most once every ten generations. This feature most notably enables the following configuration simplification: operator blacklisting.

Operator failure

As we just said, the new operator selection scheme is built around the idea that it's okay for an operator to fail, and that failing early is even better than producing uninteresting offspring. That's why operator failures no longer count toward the lambda operator applications: the algorithm now performs lambda successful operator calls each generation.

We also modified some of the built-in operators in µGP so that they fail early instead of duplicating functionality from other operators, thus making it easier for the DMAB to distinguish and choose wisely between them:

  • onePointImpreciseCrossover and twoPointImpreciseCrossover: when the selected individuals have no outer labels, these two operators are equivalent to their "precise" counterparts. They now detect this case and fail. You should prefer onePointCrossover and twoPointCrossover in your configuration file when you don't use outer labels.
  • subGraphReplacementMutation: when the parent individual only has one subgraph, this operator completely randomizes the genome (by replacing this only subgraph). Instead of behaving just like randomizer, this operator now fails.

Default operators and blacklisting

Configuration file

Previously, µGP³ users had to define a list of operators and specify their weights before starting the tool. Since µGP³ now uses the DMAB operator selection strategy, you don't have to specify operator weights anymore (and if you do, they will be silently ignored). To underline this change, the <operatorsStatistics> element was renamed <operators>.

From now on, µGP³ comes with a list of enabled-by-default operators. If the user does not specify anything in the population configuration file, µGP³ will use all available stable operators and automatically focus on those that are applicable to the given problem. To use this default configuration, just specify in your population configuration file:

  <operators default="smart">
  </operators>

In order to disable one of the default operators, you can blacklist it by specifying enabled="0" in your population configuration file. Conversely, you can enable scan or experimental operators with enabled="1".

  <operators> <!-- equivalent to default="smart" -->
    <operator ref="onePointImpreciseCrossover" enabled="0" />
    <operator ref="scanMutationINTEGER" enabled="1" />
    <operator ref="groupKFoldScanRemoval" /> <!-- equivalent to enabled="1" -->
  </operators>

Three kinds of operators are disabled by default: randomizer, scan and experimental.

  • randomizer is only used as a basis for benchmarks. It brings no useful evolutionary contribution on normal problems.
  • Scan operators tend to produce many children, so they require numerous evaluations. If the evaluator is computationally intensive, they can significantly slow down the evolution. On the other hand, if your evaluator is fast, they could prove useful. In that case, you must enable them explicitly.
  • Experimental operators are disabled because they behave poorly, are not fully implemented or might delete your holiday pictures.

In order to enable all those operators by default, just specify:

  <operators default="all">
  </operators>

You can still blacklist some of them using enabled="0".

  <operators default="all">
    <operator ref="randomizer" enabled="0" />
  </operators>

Finally, if you want to get back the old behavior (i.e. µGP uses only the listed operators), use default="none". For example:

  <operators default="none">
    <!-- allow only mutation and crossover -->
    <operator ref="singleParameterAlterationMutation" />
    <operator ref="twoPointCrossover" />
  </operators>

Last-minute tweaking with the command line

You can change from the file-defined default enabling mode to "all" or "none" using the command-line switches --allOperatorsEnabled and --allOperatorsDisabled. If you restart a previous evolution using --recoveryInput status.xml, these command-line switches will respectively enable all operators that are not explicitly blacklisted and disable all operators that were not explicitly enabled.

New group operators

Several group operators were added or stabilized. All of those are only enabled during group evolution.

  • groupRandomInsertionMutation takes a group, and creates a child group by adding a random individual. Then, a random value in [0,1] is generated. If the value is lower than sigma, the process is repeated.

  • groupRandomRemovalMutation takes a group, and creates a child group by removing a random individual. Then, a random value in [0,1] is generated. If the value is lower than sigma, the process is repeated.

  • groupBalancedCrossover selects two groups and performs a random number of individual exchanges between them, without changing the size of the resulting groups.

  • groupUnbalancedCrossover selects two groups and randomly moves some individuals between them, changing their compositions and sizes.

  • groupKFoldScanCrossover selects two groups, slices the larger one into K parts, and moves each part to a new clone of the other group.

  • groupKFoldScanRemoval selects one group, slices it in K parts, and creates K subgroups each missing one of the parts. The goal is to eliminate useless individuals from the given group.

  • groupUnionIntersection selects two groups and tries to build their intersection and their union.

  • groupDreamTeam builds new groups by selecting the best and most diverse individuals.
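
As an example of how such an operator slices a group, here is a hedged sketch of the groupKFoldScanRemoval idea (illustrative Python; the way µGP³ actually builds the folds may differ):

```python
def k_fold_scan_removal(group, k):
    """group: list of individuals; returns up to k reduced child groups,
    each missing one of the k slices of the parent group."""
    k = min(k, len(group))
    folds = [group[i::k] for i in range(k)]  # k roughly equal slices
    return [[ind for j, fold in enumerate(folds) if j != i for ind in fold]
            for i in range(k)]

children = k_fold_scan_removal(["a", "b", "c", "d"], k=2)
print(children)  # [['b', 'd'], ['a', 'c']]
```

Evaluating the K reduced children reveals which slice the group can afford to lose, i.e. which individuals contribute nothing to its fitness.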

References

[1] K. Deb, Genetic algorithms in multimodal function optimization, Ph.D. thesis, Clearinghouse for Genetic Algorithms, Department of Engineering Mechanics, University of Alabama (1989).


Related

Wiki: Benchmarking
Wiki: DMAB
Wiki: Evaluator
Wiki: Lua Evaluator
Wiki: Population Settings
Wiki: Versions