Holland's Schema Theorem

From Wikipedia, the free encyclopedia

Holland's Schema Theorem is widely taken to be the foundation for explanations of the power of genetic algorithms.

A schema is a template that identifies a subset of strings with similarities at certain string positions.
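As a minimal sketch of this definition, a schema over binary strings can be written as a string over the characters 0, 1 and *, where * marks a position that is left unconstrained. The helper names `matches` and `members` below are illustrative assumptions, not part of any standard library.

```python
from itertools import product

WILDCARD = "*"  # the "don't care" symbol


def matches(schema: str, string: str) -> bool:
    """Return True if `string` agrees with `schema` at every fixed position."""
    if len(schema) != len(string):
        return False
    return all(s == WILDCARD or s == c for s, c in zip(schema, string))


def members(schema: str) -> list[str]:
    """Enumerate every binary string in the subset described by `schema`."""
    # A wildcard position may be 0 or 1; a fixed position keeps its own symbol.
    choices = ["01" if s == WILDCARD else s for s in schema]
    return ["".join(bits) for bits in product(*choices)]


if __name__ == "__main__":
    print(matches("1**0*1", "101001"))  # agrees at positions 1, 4 and 6
    print(members("1*0"))
```

Enumerating members is only practical for short strings, since a schema with k wildcard positions describes 2^k strings.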

Description

For example, consider binary strings of length 6. The schema 1**0*1 describes the set of all strings of length 6 with 1s at positions 1 and 6 and a 0 at position 4. The * is a "don't care" symbol: positions 2, 3 and 5 can each take the value 1 or 0, so the schema describes 2^3 = 8 strings.

The order of a schema is the number of fixed positions in the template, and its defining length is the distance between the first and last fixed positions. The order of 1**0*1 is 3 and its defining length is 6 - 1 = 5.

The fitness of a schema is the average fitness of all strings matching the schema. The fitness of a string is a measure of the value of the encoded problem solution, as computed by a problem-specific evaluation function.

For a genetic algorithm that uses fitness-proportionate selection, crossover and mutation, the schema theorem states that short, low-order schemata with above-average fitness increase exponentially in successive generations. Expressed as an equation:

$m(h,t+1) \geq {m(h,t) f(h) \over a_t}[1-p]$

Here m(h,t) is the number of strings belonging to schema h at generation t, f(h) is the observed fitness of schema h, and a_t is the observed average fitness of the population at generation t. The disruption probability p is the probability that crossover or mutation will destroy the schema h. Because selection, crossover and mutation are stochastic, the left-hand side is strictly the expected number of strings belonging to h at generation t+1.

An often misunderstood point is why the Schema Theorem is an inequality rather than an equality. The answer is simple: the theorem neglects the small but non-zero probability that a string belonging to the schema h will be created "from scratch" by mutation of a string that did not belong to h in the previous generation.
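The quantities in the inequality can be computed for a small population as a rough illustration. The sketch below is not a genetic algorithm; it only evaluates the order, the defining length, the observed values m(h,t), f(h) and a_t, and the right-hand side of the bound. The population, the fitness function (number of 1 bits) and the rates `p_c` and `p_m` are made-up assumptions. The function `disruption_estimate` uses the commonly cited estimate p = p_c * d / (l - 1) + o * p_m for one-point crossover and per-bit mutation, where d is the defining length, o the order and l the string length; that estimate is not stated in the text above.

```python
def matches(schema, string):
    """True if `string` agrees with `schema` at every fixed (non-*) position."""
    return len(schema) == len(string) and all(
        s in ("*", c) for s, c in zip(schema, string)
    )


def order(schema):
    """Number of fixed positions in the schema."""
    return sum(1 for s in schema if s != "*")


def defining_length(schema):
    """Distance between the first and last fixed positions."""
    fixed = [i for i, s in enumerate(schema) if s != "*"]
    return fixed[-1] - fixed[0] if fixed else 0


def schema_stats(schema, population, fitness):
    """Return (m, f_h, a_t): match count, schema fitness, population average."""
    scores = [fitness(x) for x in population]
    in_h = [sc for x, sc in zip(population, scores) if matches(schema, x)]
    a_t = sum(scores) / len(scores)
    f_h = sum(in_h) / len(in_h) if in_h else 0.0
    return len(in_h), f_h, a_t


def disruption_estimate(schema, p_c, p_m):
    """Assumed estimate of p for one-point crossover and per-bit mutation."""
    return p_c * defining_length(schema) / (len(schema) - 1) + order(schema) * p_m


def lower_bound(m, f_h, a_t, p):
    """Right-hand side of the schema theorem inequality."""
    return m * f_h / a_t * (1.0 - p)


# Hypothetical data: six strings of length 6, fitness = number of 1 bits.
population = ["100001", "111011", "000000", "101101", "010010", "110001"]
schema = "1**0*1"
m, f_h, a_t = schema_stats(schema, population, lambda x: x.count("1"))
p = disruption_estimate(schema, p_c=0.7, p_m=0.01)  # assumed operator rates
print(order(schema), defining_length(schema))
print(m, f_h, a_t, p, lower_bound(m, f_h, a_t, p))
```

With these assumed rates the defining length of 1**0*1 spans the whole string, so the disruption estimate is large and the bound is weak even though the schema has above-average fitness. This is the sense in which the theorem favours short, low-order schemata.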