Artificial life Seminar 1998

Designing Neural Networks Using Genetic Algorithms with Graph Generation System

Hiroaki Kitano

Complex Systems, 4(4), 1990

Presented by: Yehuda (Udi) Shemer 013215348

Summary

The paper presents a novel method for designing ANN topology, using Genetic Algorithms. As ANN design is application dependent, this provides an automatic system, capable of producing optimized network topology, a task previously carried out manually by the researchers.

The scheme encodes into the chromosome, the ANN Connectivity Matrix using a graph generating grammar. The encoding method is designed to reduce the scalability problem of previous encoding methods.

The Connectivity Matrix represents the connections between the neurons in the ANN.

Given an N neuron ANN, with neurons numbered from the input neurons to the outputs, the corresponding matrix A, is a NxN binary matrix, in which a “1” in Aij correlates to a connection between neuron i and neuron j.

In previous methods the GA chromosome was a string of bits, holding the rows of the matrix concatenated to each other (“Direct” encoding). A variant for this method is the “Weak” encoding scheme, in which the ANN is divided into a few directly encoded clusters, with the connections between the clusters. Both these methods suffered from scalability problems: convergence performance greatly degrades as the size of the network grows.

Another issue is “Genetic Plausibility” – as biological human DNA is not large enough in order to directly encode, for example, the brain.

The proposed method uses a grammar in order to create rules for producing the connectivity matrix. The grammar used is an augmented version of Lindenmayer’s L-system. (For experimental reasons, a context free and deterministic grammar is used).

The symbols in the grammar are 2x2 matrices, and the terminals “0” and “1”. There are pre-assigned unique symbols for all 16 2x2 binary matrices. The rest of the symbols are 2x2 matrices of terminals and non-terminals. A rule is defined by a left-hand side symbol, and four symbols corresponding to the 2x2 matrix elements. In order to create a square matrix at the end of the process, terminals appearing in intermediate production cycles, are written as matrices of full connections (all “1”), or matrices of empty connections (all “0”). Also, at the final cycle all non-terminals are set to “0”.

The chromosome consists for a variable and a constant part. The constant part consists of the starting rule symbol, and the 16 binary matrices. The variable part, which undergoes the genetic operations, holds the rest of the grammatical rules.

The compactness of the coding, and its weak connection to network size, will result in faster convergence of the genetic algorithm as opposed to the direct encoding method.

The experiments tested GA convergence and ANN performance for direct encoding and the grammatical encoding. The task the ANN had to solve was the Encoder/Decoder problem, chosen because of it is easily scalable, and it is accessible to other researchers.

The ANN tested is a feed-Forward network. The GA fitness function is the inverse sum of squares error, which is calculated after 10 epochs of back-propagation. Tests where run on 4-X-4 and 8-X-8 encoder/decoder problems, with population sizes of 10-100.

Results clearly showed faster GA convergence, and better ANN performance. Also, as expected, these results were much less sensitive to network size.

Additional advantages:

The grammar is much more capable of copying meaningful sub-circuits. This allows whole reusable sub-circuits to appear in several areas of the network.

The encoding method generates much more regular patterns. This is important as in complex problems we need several modules with similar functionality but tuned differently.

Chromosome growth rate is almost independent of ANN size.

Discussion

No considerations were given as to why specifically this grammar was selected. The only reference was L-Systems’ “Genetic Plausibility” – It successfully described morphogenesis of several real life creatures, which makes this more biologically plausible than other methods. I strongly oppose such rationale, as, for one, we are later seeking optimality of networks, where to my knowledge, there is no proof of biology being optimal.

Only solution of the Encoder/Decoder problem was tested. This may be biased toward the types of ANN topologies this scheme produces.

There is no study of the effect of the GA parameters and operators, extreme conditions etc.

I oppose the claim that ANNs with more regular patterns are “better”. Most of these networks can be made “irregular” by changing the numbering of the neurons without changing functionality.

Methodology: Some of the tests were carried out on very small populations (10) and for a small number of generations (7-10).

Conclusion

Grammar encoding seems promising. Automatic ANN topology design may yield better ANN performance, although a grammar (and in particular a context free one), may not be the best choice for the task, but it looks like a good choice, which can easily be studied and tested.