- •Testing as a means of assessment. Why, when, what
- •Tests used at secondary and tertiary levels
- •Approaches to testing and main kinds of testing
- •The main characteristics of tests
- •The best-known test-techniques and their analysis.
- •Testing use of English.
- •Testing reading.
- •Testing writing.
- •Testing speaking.
Testing speaking.
When analyzing different test tasks one should consider the following points:
number of test-takers involved in the speaking test;
what does it test: dialogue skills (interaction) or monologue speech;
role of a teacher ( as participant or as an assessor or combining both roles);
recorded or not recorded;
ease or difficulty of administration;
criteria of assessment.
Possible formats:
Picture description (interpretation): you may use photos or drawings. It gives tester time to listen and students something concrete to speak about. But it is an artificial task and there is no interaction.
Giving instructions/explanations, e.g. how to make a dish. Realistic, concrete, but no interaction.
Oral presentations: students prepare and give short talks. Realistic and gives the teacher time to assess performance. In case of a prepared monologue, it becomes retelling a written speech, trying to recall the text, not to speak.
Information transfer: information gap is created through notes or pictures (jig saw reading). Realistic – need for communication. Tests key interactive strategies (questioning, clarifying, eliciting information). Can be problems when one speaker is a lot weaker than the other.
Role play: students assume roles with or without cued information). It is excellent for testing interaction and commonly used task in most materials. But it can test the ability to act. Much depends on psychological characteristics of students, not on their language abilities.
Free interviews: chat to students in groups or as individuals. It is realistic and can reduce stress for students in case of the right organization (procedure, teacher’s behaviour etc). But it is very difficult to rate performance. Again personality factors may be at work (shy/outgoing, laconic/blabbering etc). It is difficult if a teacher has to maintain conversation at the same time as rating the students’ performance. In internationally recognized tests raters and interlocutors are usually two different people.
Types of spoken tests
The most commonly used spoken test types are these:
Interviews – these are relatively easy to set up, especially if there is a room apart from the classroom where learners can be interviewed. The class can be set some writing or reading task (or even the written component of the examination) while individuals are called out, one by one, for their interview. Such interviews are not without their problems though. The rather formal nature of interviews … means that the situation is hardly conducive to testing more informal, conversational speaking styles. Not surprisingly, students often underperform in interview-type conditions. It is also difficult to eliminate the effects of the interviewer – his or her questioning style, for example – on the interviewee’s performance. Finally, if the interviewer is also the assessor, it may be difficult to maintain the flow of the talk while at the same time making objective judgments about the interviewee’s speaking ability. Nevertheless, there are ways of circumventing some of these problems. A casual chat at the beginning can help put candidates at their ease. The use of pictures or a pre-selected topic as a focus for the interview can help, especially if candidates are given one or two minutes to prepare themselves in advance. If the questions are the same for each interview, the interviewer effect is at least the same for all candidates. And having a third party present to co-assess the candidate can help ensure a degree of objectivity.
Live monologues – the candidates prepare and present a short talk on a pre-selected topic. This eliminates the interviewer effect and provides evidence of the candidates’ ability to handle an extended turn, which is not always possible in interviews. If other students take the role of the audience, a question-and-answer stage can be included, which will provide some evidence of the speaker’s ability to speak interactively and spontaneously. Buy giving a talk or presentation is only really a valid test if these are skills that learners are likely to need, e.g. if their purpose for learning English is business, law, or education.
Recorded monologues – these are perhaps less stressful than a more public performance and, for informal testing, they are also more practicable in a way that live monologues are not. Learners can take turns to record themselves talking about a favourite sport or pastime, for example, in a room adjacent to the classroom, with minimal disruption to the lesson. The advantage of recorded tests is that the assessment can be done after the event, and results can be ‘triangulated’ – that is, other examiners can rate the recording and their ratings can be compared to ensure standardization.
Role-plays – most students will be used to doing at least simple role-plays in class, so the same format can be used for testing. The other ‘role’ can be played either by the tester or another student, but again, the influence of the interlocutor is hard to control. The role-play should not require sophisticated performance skills or a lot of imagination. Situations grounded in everyday reality are best. They might involve using data that has been provided in advance. For example, students could use the information in a travel brochure to make a booking at a travel agency. This kind of test is particularly valid if it closely matches the learners’ needs. One problem, though, with basing the test around written data is that it then becomes a partial test of reading skills as well.
Collaborative tasks and discussions – these are similar to role-plays except that the learners are not required to assume a role but simply to be themselves. For example, two candidates might be set the task of choosing between a selection of job applicants on the basis of their CVs. Or the learners simply respond with their own opinions to a set of statements relevant to a theme. Of course, as with role-plays, the performance of one candidate is likely to affect that of the others, but at least the learners’ interactive skills can be observed in circumstances that closely approximate real-life language use.
Speaking is judged according to the following criteria: problem-solving or relevance to the topic, fluency, interactive communication (how active the student is in reacting, keeping the conversation going, displaying the initiative etc), discourse management (making relevant and logical contributions to conversations, linking together own words and with what other candidates and examiner say), accuracy: pronunciation, grammar and vocabulary.
Assessment criteria
Having obtained a sample of the learner’s speaking ability, how does one go about assessing it? There are two main ways: either giving it a single score on the basis of an overall impression (called holistic scoring) or giving a separate score for different aspects of the task (analytic scoring). Holistic scoring (e.g. giving an overall mark out of, say, 20) has the advantage of being quicker, and is probably adequate for informal testing of progress. Ideally, though, more than one scorer should be enlisted, and any significant differences in scoring should be discussed and a joint score negotiated.
Analytic scoring takes longer, but compels testers to take a variety of factors into account and, if these factors are well chosen, is probably both fairer and more reliable. One disadvantage is that the scorer may be distracted by all the categories and lose sight of the overall picture – a woods-and-trees situation. Four or five categories seems to be the maximum that even trained scorers can handle at one time.
For the CELS Test of Speaking there are four categories: ‘Grammar and Vocabulary’, ‘Discourse Management’, ‘Pronunciation’, and ‘Interactive Communication’. They are described in the following terms:
Grammar and Vocabulary – on this scale, candidates are awarded marks for the accurate and appropriate use of syntactic forms and vocabulary in order to meet the task requirements at each level. The range and appropriate use of vocabulary are also assessed here.
Discourse Management – on this scale, examiners are looking for evidence of the candidate’s ability to express ideas and opinions in coherent, connected speech. The CELS tasks require candidates to construct sentences and produce utterances (extended as appropriate) in order to convey information and to express or justify opinions. The candidate’s ability to maintain a coherent flow of language with an appropriate range of linguistic resources over several utterances is assessed here.
Pronunciation – this refers to the candidate’s ability to produce comprehensible utterances to fulfil the task requirements, i.e. it refers to the production of individual sounds, the appropriate linking of words, and the use of stress and intonation to convey the intended meaning. L1 accents are acceptable provided communication is not impeded.
Interactive Communication – this refers to the candidate’s ability to interact with the interlocutor and the other candidate by initiating and responding appropriately and at the required speed and rhythm to fulfil the task requirements. It includes the ability to use functional language and strategies to maintain or repair interaction, e.g. in conversational turn-taking, and a willingness to develop the conversation and move the task towards a conclusion. Candidates should be able to maintain the coherence of the discussion and may, if necessary, ask the interlocutor or the other candidate for clarification.
It is worth emphasizing that grammatical accuracy is only one of several factors, and teachers need to remind themselves when assessing speaking that even native speakers produce non-grammatical forms in fast, unmonitored speech. It would be unfair, therefore, to expect a higher degree of precision in learners than native speakers are capable of.
