Empirical Investigation

of the Benefits of Partial Lamarckianism

Christopher R. Houck, Jeffery A. Joins,
Michael G. Kay, James R. Wilson
 

    Genetic algorithms are very efficient at exploring the entire search space, but are relatively poor at finding the precise local optimal solution in the region in which the algorithm converges. Hybrid GAs use local improvement procedures as a part of the evaluation of the individuals of the population. These procedures complement the global search strategy of the GA. This article presents an empirical analysis of the use of GAs utilizing local improvement procedures to solve a variety of test problems.

    There are two basic models of evolution that can be used to incorporate learning into a GA: the Baldwin effect and Lamarckian evolution. The Baldwin effect allows an individual's fitness (phenotype) to be determined based on learning (i.e., the application of local improvement). Like natural evolution, the result of the improvement does not change the genetic structure (genotype) of the individual. Lamarckian evolution, on the other hand, does change the genetic structure of the individual to reflect the result of the learning.

    Both models are not optimal for all problems for several reasons. One of the most known problems that were found in tests of Lamarckian evolution is the problem of premature convergence of the best individuals of the population to a local optimum. Baldwinian learning, on the other hand, certainly aggravates the problem of multiple genotype to phenotype mapping. In a direct mapping, there is no distinction between genotypes and phenotypes, but such mapping is not always possible neither desired.

    The most common example of this case is the TSP problem with ordinal representation. The genotype is represented by an ordered list of n cities to visit, for example (c1, c2,...., cn). However, the phenotype of the TSP is a tour, and any rotation of the chromosomes yields the same tour. That is, any rotation of the genotype results in the same phenotype. For example, the two tours (1,2,3,4) and (3,4,1,2) have different genotypes because their gene strings are different, but both strings represent the same tour and thus have the same phenotype.

    Hybrid GAs should not be restricted to operating in either a pure Baldwinian or pure Lamarckian manner. Instead, a mix of both strategies, mostly named as Partial Lamarckianism, could be employed together. The "partial" term here refers to the percentage of the population of which the genotype will be updated. Thus, 50% partial Lamarckianim means updating the genotype of half of the population in order to reflect the resulting phenotype. This work examines the use of partial Lamarckianism with regard to the use of local search on a set of bounded nonlinear and combinatorial optimization test problems.

    7 different tests were applied in this research. For each test, the following search strategies were used:
a) Partial Lamarckianism - 5, 10, 20, 40, 50, 60, 80, 90, 95 percent of the population
    updated each time (phenotype -> genotype).
b) Pure genetic approach - Using only evolution to influence the population (no learning).
    This model was denoted in the tests as "-1".
c) No Lamarckianism - Pure Baldwinian search strategy (evolution & learning). This
    model was denoted as "0" (0 percent Lamarckianism).
d) Pure Lamarckianism - All the population's genotype was reflected by its phenotype.
    This model was denoted as "100".

    The GA was replicated 30 times with random initializations for each strategy. The termination criteria of the GA for each of the 7 test cases was either performing over 1 million function evaluations (including the learning function evaluation) or finding the optimal solution for that test. For each test case, 2 results were checked:
a) Rank of solutions - each solution for each of the models described above was
    ranked according to the actual optimal solution of the test case ("1" being the highest
    rank that was given to a GA which converged to the actual optimal solution).
b) Rank of speed convergence - each solution was ranked according to the no. of
    function evaluations it took for the GA to converge to that solution (without taking into
   account whether it was a "bad" solution or a "good" one according to result (a)).

    The GA that was used in all test cases is composed of the following general steps:
1. Supply a population P0 of N individuals and respective function values.
2. i = 1.
3. P'i = selection_function(Pi-1) - select the individuals for the next generation.
4. Pi = reproduction_function(P'i) - reproduce a new generation out of the selected
    individuals.
5. Evaluate(Pi) - Rank the solution of the generated population.
6. i = i+1.
7. Repeat steps 3-6 until termination.
8. Print out best solution found.

    The GA was supplied with the following parameter values:
 
Parameter
Parameter Value
Uniform mutation
4
Multiuniform mutation
4
Nonuniform mutation
4
Multi-nonuniform mutation
6
Boundary mutation
4
Simple crossover
2
Arithmetic crossover
2
Heuristic crossover
2
Probability of selecting the best, q
0.08
Population size, N
80
Maximum number of function evaluations
1,000,000
Maximum number of iterations for the local improvement procedure
25
When the numbers for mutations refer to the no. of single individuals of the population that were selected for mutations each generation, and the numbers for crossovers refer to the no. of pairs in the population that  were selected for crossover each generation.
 
    Tests 1-5 are nonlinear optimization tests. For each test, a different nonlinear function was chosen, and the goal of the GA was to find the local optima in a bounded range. Each function was parameterized with the parameter n, and 3 instances were created by assigning n the values 2, 10 and 20. The functions that were chosen for the tests are:
1) Brown's Almost Linear Function.
2) Corana Function.
3) Griewank Function.
4) Rastrigin Function.
5) Schwefel Function.

    Test 6 is the Location-Allocation Problem. This is a multifacility location problem in which both the location of n new facilities (NFs) and the allocation of the flow requirements of m existing facilities (EFs) to the new facilities are determined so that the total transportation costs are minimal. The GA computes the location of each NF and the satisfied flow from each NF to its EF. Each individual in the GA is a vector of real values representing the starting locations for the NFs. The cost of each solution is the weighted sum of the distances from each EF to its allocated NF. 2 instances of the problem were investigated:
1) 200-EF by a 20-NF instance (LA-200).
2) 250-EF by a 25-NF instance (LA-250).

    Test 7 is the Cell Formation Problem. The test's purpose is to assign a group of machines to a set of cells and assigning a series of parts that should be processed by these machines into families. The GA should assign each machine and each part to one, and only one, cell and family, respectively. That is, no part will be processed by a machine outside its assigned cell. Each individual in the GA is represented as a vector of integers representing the machine and parts assignment (x1, x2, xm,....,y1, y2,....,yn),
where:
 xi = l if machine i is assigned to cell l
 yj = l if part j is assigned to family l
2 instances of the problem were investigated:
1) 22 machine by a 148-part instance (Cell-1)
2) 40 machine by a 100-part instance (Cell-2)

The following tables present the results of the tests:

Rank of solutions
 
-1
0
5
10
20
40
50
60
80
90
95
100
Brown 2
1
1 1 1 1 1 1 1 1 1 1 1
Brown 10
3 2 1 1 1 1 1 1 1 1 1 1
Brown 20
3 2 1 1 1 1 1 1 1 1 1 1
Corana 2
1 1 1 1 1 1 1 1 1 1 1 1
Corana 10
2 4 3 2 1 1 1 1 1 1 1 1
Corana 20
4 2 8 7 6 5 4 3 2 1 1 1
Griewank 2
2 1 1 1 1 1 1 1 1 1 1 1
Griewank 10
2 1 1 1 1 1 1 1 1 1 1 1
Griewank 20
2 1 1 1 1 1 1 1 1 1 1 1
Rastrigin 2
1 1 1 1 1 1 1 1 1 1 1 1
Rastrigin 10
2 1 1 1 1 1 1 1 1 1 1 1
Rastrigin 20
2 1 1 1 1 1 1 1 1 1 1 1
Schwefel 2
1 1 1 1 1 1 1 1 1 1 1 1
Schwefel 10
1 2 1 1 1 1 1 1 1 1 1 1
Schwefel 20
1 2 1 1 1 1 1 1 1 1 1 1
LA 200
3 2 1 1 1 1 1 1 1 1 1 1
LA 250
3 2 1 1 1 1 1 1 1 1 1 1
Cell 1
3 2 1 1 1 1 1 1 1 1 1 1
Cell 2
2 2 1 1 1 1 1 1 1 1 1 1

Rank of convergence speed
 
-1
0
5
10
20
40
50
60
80
90
95
100
Brown 2
2
1 1 1 1 1 1 1 1 1 1 1
Brown 10
6 5 4 3 2 1 1 1 1 1 1 1
Brown 20
4 4 3 2 1 1 1 1 1 1 1 1
Corana 2
2 1 1 1 1 1 1 1 1 1 1 1
Corana 10
4 4 4 4 3 2 2 2 2 1 1 1
Corana 20
3 3 3 3 3 3 3 2 2 1 1 2
Griewank 2
1 1 1 1 1 1 1 1 1 1 1 1
Griewank 10
7 1 1 1 2 3 4 4 5 5 5 6
Griewank 20
5 1 1 1 1 2 3 3 3 4 4 4
Rastrigin 2
2 1 1 1 1 1 1 1 1 1 1 1
Rastrigin 10
3 5 4 4 3 2 1 1 1 1 1 1
Rastrigin 20
2 6 5 4 3 1 1 1 1 1 1 1
Schwefel 2
1 1 1 1 1 1 1 1 1 1 1 1
Schwefel 10
1 3 2 2 1 1 1 1 1 1 1 1
Schwefel 20
3 4 2 2 1 1 1 1 1 1 1 1
LA 200
3 2 1 1 1 1 1 1 1 1 1 1
LA 250
3 2 1 1 1 1 1 1 1 1 1 1
Cell 1
2 2 1 1 1 1 1 1 1 1 1 1
Cell 2
2 2 1 1 1 1 1 1 1 1 1 1
 

    From the results above it can be concluded that all the search strategies which employed at least 20% Lamarckianism consistently found the optimal solution. When taking a minimax approach (choosing the minimal of the worst solutions), the best search strategies employed 20% and 40% Lamarckianism. It can be also noticed that neither pure Lamarckianism (100%) nor pure Baldwinian (0%) strategies performed well regarding the convergence speed. The causes of the results denoting the best results to 20% and 40% Lamarckianism are not clear, and can be the subject for further investigation.