Tula State Pedagogical University named after L.N. Tolstoy (Тульский государственный педагогический университет им. Л.Н. Толстого)
Defining the content
You must have a clear idea of the content and the sub-skills that you want to test. For progress tests this should be the same as what you have taught, so the coursebook and your lesson plans will be your main guidelines in test construction. For achievement tests you must be guided by the syllabus requirements for a certain form/level. In diagnostic test design it is the syllabus that defines the test content.
It is at this stage that you decide what skills and language areas will be included in the test format.
Sampling
In a progress test it may be possible to test everything that the students have done since the last test, but this becomes more difficult for achievement tests. You should test a representative sample of the items taught. If the sample is too small, the test becomes invalid, and if the sample is too big, the test will take too long. But what is a good sample size? With language areas that are easily quantified, e.g. vocabulary, the correct sample size may be chosen statistically; sampling becomes less ‘scientific’ when we consider skills, and here we often have to rely on our intuition.
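For an easily quantified area such as vocabulary, the idea of a representative sample can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the unit names, the word lists, the helper name sample_items and the 50% sampling fraction are all invented for the example, and drawing the same fraction from every unit is just one simple way of keeping the sample representative.

```python
import random

def sample_items(taught_by_unit, fraction, seed=0):
    """Draw the same fraction of items from every unit (at least one per unit).

    Hypothetical helper: sampling each unit separately keeps every unit
    represented in the test in proportion to how much was taught in it.
    """
    rng = random.Random(seed)  # fixed seed so the draw can be repeated
    sample = {}
    for unit, items in taught_by_unit.items():
        k = max(1, round(len(items) * fraction))
        sample[unit] = rng.sample(items, k)  # k distinct items from this unit
    return sample

# Invented vocabulary lists, standing in for what was actually taught.
taught = {
    "Unit 1": ["airport", "luggage", "ticket", "delay", "gate", "customs"],
    "Unit 2": ["recipe", "boil", "fry", "slice"],
    "Unit 3": ["salary", "interview", "manager", "contract",
               "shift", "career", "apply", "hire"],
}

sample = sample_items(taught, fraction=0.5)
for unit, words in sample.items():
    print(unit, len(words), "of", len(taught[unit]), "items:", words)
```

The fraction is the figure the teacher has to settle on: a larger value covers more of what was taught but makes the test longer.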
So at this stage you should establish what has been taught, decide what is worth testing, and then take a representative sample. Your outline may look something like this:
Test item types selection
The next thing to do is to decide which test item types you will use. These should be very similar to the exercise types that the students have already done in class.
What we now need is a clarification of the term ‘item’ itself. ‘Item’ is roughly synonymous with ‘question’. Consider the following test instruction: ‘The test consists of 50 items. Each correct answer will give you 1 (one) point/credit. The maximum number of points/credits is therefore 50.’
There are different item types. The most common are multiple choice, matching and gap-filling, along with some others that you already use in teaching and that we will consider a little later in connection with testing.
Items simply put together do not make up a test. As you know from your experience as a student, tests usually consist of several parts, and each part comprises several tasks. This overall structure is what we call the test format. Here is an example of a test format description:
… The test consists of 3 parts, i.e. Part I Listening Comprehension, Part II English in Use and Part III Reading Comprehension. Part I tests listening for gist skills and is based on multiple choice. Part II checks grammar and vocabulary and comprises multiple choice, matching and error-correction items … etc.
In other words, format describes WHAT to test and by what MEANS.
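A format description of this kind can also be written down as a small data structure, which makes the "what" and the "by what means" explicit for each part. The sketch below is only one possible way of doing this: the class name Part, the function max_score and the number of items per part are assumptions made for illustration (chosen so that the total is 50 items at one point each, as in the instruction quoted earlier). The details of Part III are left empty because the description above does not give them.

```python
from dataclasses import dataclass, field

@dataclass
class Part:
    name: str                 # title of the test part
    skills: str               # WHAT the part tests
    item_types: list = field(default_factory=list)  # by what MEANS
    n_items: int = 0          # hypothetical number of items in the part

def max_score(parts, points_per_item=1):
    """Maximum number of points when every item carries the same credit."""
    return sum(part.n_items for part in parts) * points_per_item

# Item counts (15/20/15) are assumed, not taken from the description.
test_format = [
    Part("Part I Listening Comprehension", "listening for gist",
         ["multiple choice"], 15),
    Part("Part II English in Use", "grammar and vocabulary",
         ["multiple choice", "matching", "error correction"], 20),
    # Skills and item types for Part III are not specified in the description.
    Part("Part III Reading Comprehension", "", [], 15),
]

for part in test_format:
    print(part.name, "|", part.skills or "not specified", "|",
          ", ".join(part.item_types) or "not specified", "|", part.n_items)
print("Maximum score:", max_score(test_format))
```

Writing the format down in this way before constructing any tasks makes it easy to check that every part has a stated purpose and that the item counts add up to the intended total.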
Test format is a crucial issue, but by itself it does not test anything. Here we come to the stage of test construction proper: we must construct tasks, each consisting of several items. Here is a typical task:
