8.3 Discussion

The Dyna-XCS system we implemented is still under experimentation. However, the initial results show that there is almost no difference in performance between XCST and Dyna-XCS in the three environments employed in this paper. We expected this result. As observed previously, these environments are quite small. Therefore, after the system has solved a few hundred problems, it has also tried almost every condition/action pair and has an almost complete description of the environment. Accordingly, exploration with the model becomes almost identical to exploration in the environment.

These initial results highlight the major problem with the implementation based on experience quadruples: the memory required to store the model as quadruples grows dramatically as exploration of the environment proceeds, because the model tends toward a complete description of the environment. Therefore, this solution is only feasible in small environments. More complex environments instead require algorithms that produce a compact representation of the model.
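
To make the memory issue concrete, the following Python sketch shows a tabular Dyna-style model stored as experience quadruples; the class and method names are illustrative assumptions, not part of the system described here. The table acquires one entry per distinct state/action pair the agent tries, which is exactly the growth that becomes impractical in larger environments.

```python
# Minimal sketch of a Dyna-style model stored as experience quadruples.
# Names (QuadrupleModel, update, sample) are illustrative assumptions.
import random


class QuadrupleModel:
    """Keeps one (next_state, reward) entry per visited (state, action) pair."""

    def __init__(self):
        self.transitions = {}  # (state, action) -> (next_state, reward)

    def update(self, state, action, next_state, reward):
        # Memory grows with every distinct state/action pair tried,
        # approaching a complete table of the environment.
        self.transitions[(state, action)] = (next_state, reward)

    def sample(self):
        # Simulated experience for planning steps with the learned model.
        (state, action), (next_state, reward) = random.choice(
            list(self.transitions.items()))
        return state, action, next_state, reward

    def size(self):
        return len(self.transitions)
```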

A possible solution may consist of introducing a hybrid architecture in which XCS is used for learning, while a different type of algorithm, for instance a neural network, is used for building the environment model. However, this type of solution would introduce elements which are not related to the philosophy underlying XCS. A more elegant solution can be suggested.

XCS is a learning algorithm which may be used for learning the environment model. We propose a solution in which the XCS system that has to learn to reach food in the environment is coupled with a second XCS system that is employed to learn the environment model. The second system should have classifiers whose: (1) conditions represent a state/action pair (s, a); (2) actions represent the prediction of the next sensory state s' and the immediate reward r XCS expects to gain when getting to s'. This version of XCS learns a predictive model of the environment, an extension already proposed by Wilson (1995) in the original XCS paper. Recently, Stolzmann (1997) introduced an Anticipatory Classifier System that is designed to learn an environmental model.
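
A minimal sketch of how such model-learning classifiers might be represented is given below, assuming a ternary string encoding for the sensory state as in the Woods environments. Splitting the condition side into a state condition and an action field, as well as all field names, are representational assumptions made here for clarity rather than the paper's notation.

```python
# Illustrative sketch of a classifier for a model-learning XCS.
# Field names and the split of the condition side are assumptions.
from dataclasses import dataclass


@dataclass
class PredictiveClassifier:
    condition: str           # ternary string over {0, 1, #} covering the state s
    action_bits: str         # encoding of the action a (condition side of (s, a))
    predicted_state: str     # predicted next sensory state s'
    predicted_reward: float  # immediate reward r expected on reaching s'
    fitness: float = 0.0     # accuracy-based fitness, as in standard XCS

    def matches(self, state: str, action: str) -> bool:
        """The classifier applies when its condition matches the (state, action) pair."""
        return self.action_bits == action and all(
            c == '#' or c == s for c, s in zip(self.condition, state))
```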

9 Evolving a Compact Representation

Previously, we discussed how the generalization mechanism of XCS and the structure of the environment influence system performance. Another important aspect of generalization in XCS concerns the capability of XCS to evolve a compact representation of the learned task.

9.1 Generalization and Task Representation in XCS

Results reported in the literature show that XCS can evolve near minimal populations of accurate and maximally general classifiers (Wilson, 1997a). Recently, Kovacs (1997) proposed an optimality hypothesis which states that XCS tends to evolve the minimal population with respect to the Boolean multiplexer function.

With respect to animat problems, we now discuss whether XCS develops a tendency to evolve near minimal populations. We show that, in certain environments, XCS may fail to evolve a compact representation and may produce redundant representations of certain tasks.

Consider again Woods14 (Figure 6). Every position in Woods14 is uniquely determined by the position of the two adjacent free cells. Therefore, in each classifier condition only two bits are sufficient to characterize a specific environmental niche. Since classifier conditions in Woods14 are 16 bits long, for each niche there are 2^14 possible classifiers belonging to that niche only. According to Wilson's hypothesis, general classifiers should reproduce more than specific ones since general classifiers appear in more match sets. Unfortunately, the last statement is not always true.

For example, consider the two conditions 1010001010001010 and ####0#####0#####. Although the second condition has many more don't care symbols, both conditions match only the third free position in Woods14. We can say that the latter condition is formally more general than the former because it has more # symbols. However, the latter condition is not concretely more general than the former because it matches the same number of niches. Note that in XCS the pressure toward more general classifiers is effective only if the generality of the classifiers is concretely exploited in the environment (i.e., general classifiers match more niches). Accordingly, in environments that offer few chances of building concrete generalizations, like Woods14, the pressure toward concretely more general classifiers is lost because Wilson's generalization hypothesis does not apply.(n5) In such situations XCS rapidly evolves a set of classifiers that exploit the maximum generalization offered by the environmental states. Then, by recombination and mutation of these classifiers, the system can start producing classifiers that are formally more general (contain more # symbols) but that, in practice, do not match more niches (they are not concretely more general). As a consequence, the representation of the task can become redundant.
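
The distinction between formal and concrete generality can be illustrated with the following short Python sketch, using the two example conditions above. The list of states is a hypothetical stand-in for the encoded sensory states of an environment; in Woods14 both conditions would match the same single niche.

```python
# Sketch contrasting formal and concrete generality of ternary conditions.
# The sample list of states is hypothetical, not taken from Woods14 itself.
def formal_generality(condition: str) -> int:
    """Number of don't-care symbols in the condition."""
    return condition.count('#')


def matches(condition: str, state: str) -> bool:
    """Standard ternary matching: every non-# position must equal the state bit."""
    return all(c == '#' or c == s for c, s in zip(condition, state))


def concrete_generality(condition: str, states) -> int:
    """Number of distinct environmental states (niches) the condition matches."""
    return sum(matches(condition, s) for s in states)


specific = "1010001010001010"
general = "####0#####0#####"

# Hypothetical sample of encoded states containing only the niche both match.
states = ["1010001010001010"]
print(formal_generality(specific), formal_generality(general))      # 0 vs 14
print(concrete_generality(specific, states),
      concrete_generality(general, states))                         # 1 vs 1
```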

As an example, we apply XCST to Maze6 with a population of 1600 classifiers. Figure 14 reports the number of macroclassifiers in the population; the curve is averaged over ten runs. Notice that the number of macroclassifiers grows immediately and then reaches an equilibrium value which depends on the genetic pressure. The analysis of the final populations shows that only a few of the macroclassifiers represent more than one microclassifier.
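
For readers unfamiliar with the macroclassifier bookkeeping used in this measurement, the sketch below follows the standard XCS convention in which microclassifiers with identical condition/action pairs are merged into a single macroclassifier carrying a numerosity count; the function and example values are illustrative assumptions.

```python
# Minimal sketch of macroclassifier bookkeeping via numerosity counts.
from collections import defaultdict


def to_macroclassifiers(microclassifiers):
    """Group identical (condition, action) microclassifiers and count numerosity."""
    numerosity = defaultdict(int)
    for condition, action in microclassifiers:
        numerosity[(condition, action)] += 1
    return numerosity


# Hypothetical toy population of microclassifiers.
population = [("##01", 0), ("##01", 0), ("1#01", 1)]
macros = to_macroclassifiers(population)
# Two macroclassifiers: ("##01", 0) with numerosity 2 and ("1#01", 1) with numerosity 1.
print(len(macros), dict(macros))
```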
