
1 Introduction

Autonomous agents are not, in general, able to deal with the complexity of real environments. The ability of an agent to generalize over the different situations it experiences is essential in order to learn tasks in real environments. In fact, an agent which generalizes properly is able to synthesize, in a compact way, the knowledge it acquires so as to manipulate the concepts it learns.

Generalization is a very important feature of XCS, the classifier system introduced by Wilson (1995). XCS has been shown to evolve near-minimal populations of classifiers that are accurate and maximally general (Kovacs, 1997; Wilson, 1997a). Recently, Kovacs (1997) proposed an optimality hypothesis for XCS and presented experimental evidence for his hypothesis with respect to the Boolean multiplexer, a known testbed for studying generalization in learning classifier systems (Wilson, 1987; Wilson, 1995).

In taking XCS beyond its very first environments and parameter settings, Lanzi (1997) reported experimental results for problems involving artificial animals, or animats (Wilson, 1987), showing that in difficult sequential problems XCS performance may fail dramatically. The author observed that in these kinds of tasks the generalization mechanism of XCS can be too slow to delete overly general classifiers before they proliferate in the population. To avoid this problem, Lanzi (1997) introduced a new operator, called specify, which helps XCS delete overly general classifiers by replacing them with more specific offspring. An alternative solution was suggested by Wilson, in which the random exploration strategy employed in his first experiments with XCS was replaced with biased exploration (Wilson, 1997b).

Until recently (Kovacs, 1996; Lanzi, 1997b), the analysis of the generalization capabilities of XCS has been presented without considering the relation between XCS's performance and the structure of the environment. As a result, it is not clear why one environment is easy to solve while a similar one can be much more difficult.

The aim of this paper is to suggest an answer to this question, enabling a better understanding of the generalization mechanism of XCS while giving a unified view of the observations in Lanzi (1997) and Wilson (1997). First, we extend the results presented by Lanzi by comparing the performance of XCS when it uses specify with its performance when it employs the biased exploration strategy. The comparison is done in two new environments, Maze5 and Maze6, and then in Woods14, the ribbon problem introduced by Cliff and Ross (1994). The results we present demonstrate that specify can adapt to all three test environments, while XCS with biased exploration may fail to converge to optimal solutions as the complexity of the environment increases. Although these results are interesting, they simply report experimental evidence and do not explain XCS's behavior, which is our major goal. To explain XCS's behavior, we analyze the assumptions that underlie generalization in XCS and Wilson's generalization hypothesis (Wilson, 1995). We study XCS's generalization mechanism in depth and formulate a specific hypothesis. We verify our hypothesis by introducing a meta-exploration strategy, teletransportation, which we use as a validation tool.

We end the paper by discussing another important aspect of generalization within XCS: its capability to evolve a maximally compact representation of the learned task. We show that, in particularly difficult environments, where few generalizations are admissible, XCS evolves generalizations right up to the limit of the instances actually offered by the environment.

The remainder of this paper is organized as follows: Section 2 gives a brief overview of the current version of XCS, and Section 3 presents the design of the experiments we employed in this paper. XCS with specify, referred to as XCSS, and XCS with biased exploration are compared in Section 4 using Maze5 and Maze6. In Section 5, the same comparison is done in the Woods14 environment. The results described in the previous sections are discussed in Section 6 where we formulate a hypothesis in order to explain why XCS may fail to converge to an optimal solution and discuss the implications introduced by our hypothesis. We verify our hypothesis in Section 7 by introducing teletransportation. We suggest how the ideas underlying teletransportation might be implemented in real-world applications in Section 8. Section 9 addresses the conditions under which XCS evolves a compact representation of a learned task, and Section 10 summarizes the results.
