
Appendix B: The illustrative scales of descriptors

This appendix contains a description of the Swiss project which developed the illustrative descriptors for the CEF. Categories scaled are also listed, with references to the pages where they can be found in the main document. The descriptors in this project were scaled and used to create the CEF levels with Method No 12c (Rasch modelling) outlined at the end of Appendix A.

The Swiss research project

Origin and Context

The scales of descriptors included in Chapters 3, 4 and 5 have been drawn up on the basis of the results of a Swiss National Science Research Council project which took place between 1993 and 1996. This project was undertaken as a follow-up to the 1991 Rüschlikon Symposium. The aim was to develop transparent statements of proficiency for different aspects of the CEF descriptive scheme, which might also contribute to the development of a European Language Portfolio.

A 1994 survey concentrated on Interaction and Production and was confined to English as a Foreign Language and to teacher assessment. A 1995 survey was a partial replication of the 1994 study, with the addition of Reception, but French and German proficiency were surveyed as well as English. Self-assessment and some examination information (Cambridge; Goethe; DELF/DALF) were also added to the teacher assessment.

Altogether almost 300 teachers and some 2,800 learners representing approximately 500 classes were involved in the two surveys. Learners from lower secondary, upper secondary, vocational and adult education were represented in the following proportions:

 

        Lower secondary    Upper secondary    Vocational    Adult
1994    35%                19%                15%           31%
1995    24%                31%                17%           28%

Teachers from the German-, French-, Italian- and Romansch-speaking language regions of Switzerland were involved, though the numbers involved from the Italian- and Romansch-speaking regions were very limited. In each year about a quarter of the teachers were teaching their mother tongue. Teachers completed questionnaires in the target language. Thus in 1994 the descriptors were used just in English, whilst in 1995 they were completed in English, French and German.

Methodology

Briefly, the methodology of the project was as follows:

Intuitive phase:

1. Detailed analysis of those scales of language proficiency in the public domain or obtainable through Council of Europe contacts in 1993; a list is given at the end of this summary.

2. Deconstruction of those scales into descriptive categories related to those outlined in Chapters 4 and 5 to create an initial pool of classified, edited descriptors.

Qualitative phase:

3. Category analysis of recordings of teachers discussing and comparing the language proficiency demonstrated in video performances, in order to check that the metalanguage used by practitioners was adequately represented.

4. 32 workshops with teachers: (a) sorting descriptors into the categories they purported to describe; (b) making qualitative judgements about clarity, accuracy and relevance of the description; (c) sorting descriptors into bands of proficiency.

Quantitative phase:

5. Teacher assessment of representative learners at the end of a school year, using an overlapping series of questionnaires made up of the descriptors found by teachers in the workshops to be the clearest, most focused and most relevant. In the first year a series of 7 questionnaires, each made up of 50 descriptors, was used to cover the range of proficiency from learners with 80 hours of English to advanced speakers.

6. In the second year a different series of five questionnaires was used. The two surveys were linked by the fact that descriptors for spoken interaction were reused in the second year. Learners were assessed on each descriptor using a 0–4 scale describing the performance conditions under which they could be expected to perform as described. The way teachers interpreted the descriptors was analysed using the Rasch rating scale model (a schematic sketch of this model follows step 7 below). This analysis had two aims:

(a) to mathematically scale a 'difficulty value' for each descriptor.

(b) to identify statistically significant variation in the interpretation of the descriptors in relation to different educational sectors, language regions and target languages, in order to identify descriptors with a very high stability of values across different contexts for use in constructing holistic scales summarising the Common Reference Levels.

7. Performance assessment by all participating teachers of videos of some of the learners in the survey. The aim of this assessment was to quantify differences in severity between participating teachers, in order to take such variation into account when identifying the range of achievement in the educational sectors in Switzerland.
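The report does not reproduce the model behind step 6, so the following is offered only as a point of reference: a minimal sketch of the Andrich rating scale model conventionally used for 0–4 judgements of this kind, in which each descriptor receives a single scaled difficulty value and the expected rating follows from the category probabilities. All function names and numerical values (abilities, difficulty, thresholds) below are invented for illustration.

```python
import numpy as np

def rating_scale_probs(theta, delta, taus):
    """Andrich rating scale model: probabilities of the rating categories
    0..K for a learner of ability `theta` judged against a descriptor of
    difficulty `delta`, with thresholds `taus` (tau_1 .. tau_K) shared by
    all descriptors on the questionnaire."""
    steps = theta - delta - np.asarray(taus, dtype=float)
    # Category k gets exp of the cumulative sum of the first k steps;
    # category 0 gets exp(0) = 1.
    numerators = np.exp(np.concatenate(([0.0], np.cumsum(steps))))
    return numerators / numerators.sum()

def expected_rating(theta, delta, taus):
    """Model-expected 0-4 rating: rises with ability, falls with difficulty."""
    probs = rating_scale_probs(theta, delta, taus)
    return float(np.dot(np.arange(len(probs)), probs))

if __name__ == "__main__":
    taus = [-1.5, -0.5, 0.5, 1.5]   # illustrative thresholds for a 0-4 scale
    for delta, label in [(-1.0, "easier descriptor"), (1.0, "harder descriptor")]:
        for theta in (-2.0, 0.0, 2.0):
            print(f"{label:18s} ability {theta:+.1f} -> "
                  f"expected rating {expected_rating(theta, delta, taus):.2f}")
```

Calibrating real questionnaire data means estimating the abilities, difficulties and thresholds jointly from the observed ratings; the resulting scaled difficulty values are what steps 6(a) and 8 operate on.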

Interpretation phase:

8. Identification of 'cut-points' on the scale of descriptors to produce the set of Common Reference Levels introduced in Chapter 3 (a schematic sketch follows this list). Summary of those levels in a holistic scale (Table 1), a self-assessment grid describing language activities (Table 2) and a performance assessment grid describing different aspects of communicative language competence (Table 3).

9. Presentation of illustrative scales in Chapters 4 and 5 for those categories that proved scaleable.

10. Adaptation of the descriptors to self-assessment format in order to produce a Swiss trial version of the European Language Portfolio. This includes: (a) a self-assessment grid for Listening, Reading, Spoken Interaction, Spoken Production and Writing (Table 2); (b) a self-assessment checklist for each of the Common Reference Levels.

11. A final conference at which research results were presented, experience with the Portfolio was discussed, and teachers were introduced to the Common Reference Levels.
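Step 8 can be pictured as slicing the calibrated difficulty continuum at chosen cut-points. The sketch below is purely illustrative; the cut-point values and descriptor names are invented rather than taken from the project.

```python
import numpy as np

# Hypothetical cut-points (in logits) dividing the calibrated difficulty
# scale into the six Common Reference Levels; the figures are invented.
CUTPOINTS = [-2.0, -1.0, 0.0, 1.0, 2.0]
LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]

def assign_level(difficulty_logit):
    """Map a descriptor's scaled difficulty value to a reference level."""
    return LEVELS[int(np.digitize(difficulty_logit, CUTPOINTS))]

# Invented descriptors with invented difficulty values, for illustration only.
for name, value in [("exchanges simple greetings", -2.4),
                    ("describes plans and arrangements", -0.3),
                    ("summarises a broadcast discussion", 1.4)]:
    print(f"{name:35s} {value:+.1f} logits -> {assign_level(value)}")
```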

Results

Scaling descriptors for different skills and for different kinds of competences (linguistic, pragmatic, sociocultural) is complicated by the question of whether or not assessments of these different features will combine in a single measurement dimension. This is not a problem caused by, or exclusively associated with, Rasch modelling; it applies to all statistical analysis. Rasch, however, is less forgiving if a problem emerges. Test data, teacher assessment data and self-assessment data may behave differently in this regard. With the teacher assessments in this project, certain categories were less successful and had to be removed from the analysis in order to safeguard the accuracy of the results. Categories lost from the original descriptor pool included the following (a schematic example of the kind of fit check that flags such misfit appears after the list):


a) Sociocultural competence

Those descriptors explicitly describing sociocultural and sociolinguistic competence. It is not clear how much this problem was caused by (a) this being a separate construct from language proficiency; (b) rather vague descriptors identified as problematic in the workshops; or (c) inconsistent responses by teachers lacking the necessary knowledge of their students. This problem extended to descriptors of the ability to read and appreciate fiction and literature.

b) Work-related

Those descriptors asking teachers to guess about activities (generally work-related) beyond those they could observe directly in class, for example telephoning; attending formal meetings; giving formal presentations; writing reports & essays; formal correspondence. This was despite the fact that the adult and vocational sectors were well represented.

c) Negative concept

Those descriptors relating to the need for simplification, or the need to ask for repetition or clarification, which are implicitly negative concepts. Such aspects worked better as provisos in positively worded statements, for example:

Can generally understand clear, standard speech on familiar matters directed at him/her, provided he/she can ask for repetition or reformulation from time to time.
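The text does not say which fit statistics were used to decide that these categories had to be removed. As an illustration only, misfit of this kind is conventionally screened with residual-based infit/outfit mean squares; the sketch below assumes ability, difficulty and threshold estimates are already available, and every name and array in it is hypothetical.

```python
import numpy as np

def category_probs(theta, delta, taus):
    """Rating scale model probabilities for rating categories 0..K."""
    steps = theta - delta - np.asarray(taus, dtype=float)
    num = np.exp(np.concatenate(([0.0], np.cumsum(steps))))
    return num / num.sum()

def fit_statistics(ratings, thetas, deltas, taus):
    """Infit/outfit mean squares per descriptor.

    ratings : (n_learners, n_descriptors) observed 0..K ratings
    thetas  : (n_learners,) learner ability estimates
    deltas  : (n_descriptors,) descriptor difficulty estimates
    Values well above 1.0 flag descriptors whose ratings do not follow the
    common dimension (e.g. a sociocultural descriptor rated erratically).
    """
    ratings = np.asarray(ratings, dtype=float)
    cats = np.arange(len(taus) + 1)
    n_learners, n_desc = ratings.shape
    infit = np.empty(n_desc)
    outfit = np.empty(n_desc)
    for i in range(n_desc):
        expected = np.empty(n_learners)
        variance = np.empty(n_learners)
        for n in range(n_learners):
            p = category_probs(thetas[n], deltas[i], taus)
            expected[n] = np.dot(cats, p)
            variance[n] = np.dot(cats ** 2, p) - expected[n] ** 2
        squared_resid = (ratings[:, i] - expected) ** 2
        outfit[i] = np.mean(squared_resid / variance)    # outlier-sensitive
        infit[i] = squared_resid.sum() / variance.sum()  # information-weighted
    return infit, outfit
```

Descriptors flagged in this way would then be inspected and, as in the project, dropped from the pool if they clearly measure something other than language proficiency.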

Reading proved to be on a separate measurement dimension from spoken interaction and production for these teachers. However, the data collection design made it possible to scale reading separately and then to equate the reading scale to the main scale after the event. Writing was not a major focus of the study, and the descriptors for written production included in Chapter 4 were mainly developed from those for spoken production. The relatively high stability of the scale values for descriptors for reading and writing taken from the CEF, reported by both DIALANG and ALTE (see Appendices C and D respectively), nevertheless suggests that the approaches taken to reading and to writing were reasonably effective.

The complications with the categories discussed above are all related to the scaling issue of uni- as opposed to multi-dimensionality. Multi-dimensionality shows itself in a second way in relation to the population of learners whose proficiency is being described. There were a number of cases in which the difficulty of a descriptor depended on the educational sector concerned. For example, adult beginners are considered by their teachers to find 'real life' tasks significantly easier than 14-year-olds do. This seems intuitively sensible. Such variation is known as Differential Item Functioning (DIF). In so far as this was feasible, descriptors showing DIF were avoided when constructing the summaries of the Common Reference Levels introduced in Tables 1 and 2 in Chapter 3 (a simple screening approach is sketched below). There were very few significant effects by target language, and none by mother tongue, other than a suggestion that native speaker teachers may
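The DIF analysis itself is not reproduced in the text. A crude way to screen for the sector effect described above, sketched here under invented names and data, is to match learners on their overall rating total and compare, within each stratum, how the focal sector rates a given descriptor relative to everyone else.

```python
import numpy as np

def dif_screen(ratings, sectors, descriptor, focal="adult", n_strata=5):
    """Crude DIF screen for one descriptor.

    Learners are stratified by total score (a rough proxy for ability);
    within each stratum the mean rating of the focal sector is compared
    with the mean rating of all other learners.  A pooled difference well
    away from zero suggests the descriptor is easier (positive) or harder
    (negative) for that sector than its overall difficulty value implies.
    """
    ratings = np.asarray(ratings, dtype=float)
    sectors = np.asarray(sectors)
    totals = ratings.sum(axis=1)
    # Equal-frequency ability strata based on total score.
    edges = np.quantile(totals, np.linspace(0.0, 1.0, n_strata + 1))
    strata = np.clip(np.searchsorted(edges, totals, side="right") - 1,
                     0, n_strata - 1)

    diffs, weights = [], []
    for s in range(n_strata):
        in_stratum = strata == s
        focal_mask = in_stratum & (sectors == focal)
        other_mask = in_stratum & (sectors != focal)
        if focal_mask.any() and other_mask.any():
            diffs.append(ratings[focal_mask, descriptor].mean()
                         - ratings[other_mask, descriptor].mean())
            weights.append(min(focal_mask.sum(), other_mask.sum()))
    return float(np.average(diffs, weights=weights))
```

A descriptor for an everyday 'real life' task that adult learners find disproportionately easy would return a clearly positive value here, which is the kind of pattern that led descriptors showing DIF to be avoided in the level summaries.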
