‘Your test isn’t accurate!’ — A test designer responds

by | 21 November 2018

The issue

Your student took a placement test. The result was B2. You believe he is B1. What went wrong?

Why might you think this?

First of all, there are many factors that influence how our skills develop and it’s not unusual for someone to have different levels of ability across the four skills. If you’re living in another country, communicating orally with people in the new language on a daily basis, your listening and speaking will improve. If you seldom write in the target language, you’ll find that this skill is weaker. If you enjoy reading comics, but seldom read news articles, then you may have difficulties with longer texts, following arguments and understanding nuances. These differences can also be cultural. For example, many East Asian students have listening and reading skills at a higher level than their speaking skills.

The problem is that when a student like this is graded C1, the teacher may think, “But hang on, her speaking is nowhere near C1.” It would, of course, be easier if our levels did correlate across skills, but unfortunately, language learning tends to be more complicated than that.

What’s problematic for a test developer is that a teacher’s reaction may be based on professional opinion, live observations and internal evaluation systems rather than statistical evidence. There is little we can do with this feedback, as we work from test data. The possibility also exists, of course, that when a teacher is talking about their student’s level, they are talking about different aspects or factors than we are.

Bringing order to this chaos

We’ve learned from the feedback we’ve had that some teachers see a CEFR level as a holistic evaluation of ability; they seem to feel that if a student is labelled B2, they should be B2 across the board. The Dynamic Placement Test (DPT) places test-takers in a CEFR level, A1-C2, and refines this with a Relative Numeric, a relative indicator within the level, to help differentiate large numbers of students in the same CEFR level. The test sets out to evaluate receptive skills within a 30-minute time frame.

If the DPT just used, as many placement tests do, its own scoring mechanism, in this case the Relative Numeric, people might not think about this notion of an overall score. It might be easier for them to keep the actual purpose of the test in mind — i.e. placing students into the right groups.

Instead, we have people speaking broadly about skills, saying “it says he’s a B2, but his speaking is not great so I think he’s a B1.” Of course, as the DPT doesn’t specifically test speaking, a grade for this productive skill cannot be included in the overall evaluation.

The purpose of the test

As my colleague Sean McDonald wrote last month, different tests can have different purposes. A 30-minute placement test focused on receptive skills is not supposed to fulfill the same function as a holistic all-skill evaluation carried out over a longer period of time. The purpose of a placement test is not to ask “What is this student’s overall level?” It is to ask a different question: “Which group should this student be placed in to ensure they get the teaching they need to progress?”

Now, when we are considering where a student should be placed, we start with the accepted truth that reception comes before production.

A teacher might say they have a student whose comprehension and grammar are B1 but whose speaking is B2. What factors is the teacher taking into account when they think about speaking? Many of us focus on fluency, but in the CEFR, the aspects of spoken language use are Range, Accuracy, Fluency, Interaction and Coherence. So, in addition to sounding quite fluent, a real B2 speaker would possess a wide range of vocabulary, high grammatical accuracy and the ability to understand and interact with someone speaking to them at a B2 level.

When we were designing the Dynamic Placement Test, we took the approach that this student with good speaking skills (maybe a confident and relatively fluent speaker who has a good range of idioms or slang), but lacking in grammar and comprehension ability, would need to be placed in a B1 level class to build on those foundations.

With the same logic, if a student is at a C1 level in reading and listening, grammar and vocabulary but B1 in speaking (perhaps due to their educational tradition), we would say this student has the ability and just needs practice, so this student would be put in a C1 class.

That is why the primary purpose of a placement test is to focus on input.
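For readers who like to see a rule written down, the two placement examples above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not the DPT’s actual scoring method: the function name, the skill labels and the choice to take the lowest input-side level are all assumptions made for the example.

```python
# Illustrative sketch only: NOT the DPT's real algorithm.
# All names and the "lowest input-side level" rule are assumptions.

CEFR_LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]

# Productive skills are left out of the placement decision.
PRODUCTIVE_SKILLS = {"speaking", "writing"}


def place_student(profile):
    """Return a class level based only on input-side skills.

    `profile` maps a skill name to a CEFR level, e.g. {"reading": "B1"}.
    """
    input_levels = [
        level for skill, level in profile.items()
        if skill not in PRODUCTIVE_SKILLS
    ]
    if not input_levels:
        raise ValueError("profile contains no input-side skills")
    # Assumption: place the student at the weakest input-side level.
    return min(input_levels, key=CEFR_LEVELS.index)


# The confident speaker whose comprehension and grammar are B1.
fluent_speaker = {"comprehension": "B1", "grammar": "B1", "speaking": "B2"}

# The strong reader and listener whose speaking lags behind.
quiet_reader = {
    "reading": "C1",
    "listening": "C1",
    "grammar": "C1",
    "vocabulary": "C1",
    "speaking": "B1",
}

print(place_student(fluent_speaker))  # B1
print(place_student(quiet_reader))    # C1
```

In both cases the speaking level has no effect on the result, which mirrors the point that placement here is driven by input rather than production.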

Laura Edwards, Test Expert and Materials Writer, telc Language Tests
