Dr Sean McDonald is a Project Manager with telc – language tests in Frankfurt. He creates, manages and delivers tests for language learners. In this post he explains why the CEFR is the golden thread running through testing, teaching and learning.
If the ultimate objective of language teaching is effective language learning, then our main concern must be the learning outcome (Stern, 1983).
As both a teacher and test developer I often have the feeling that the seemingly related fields of teaching and testing are in reality worlds apart. Being active in both fields, I try to mediate, to put it in CEFR terms. On the one hand, as a test developer I want to measure skills and collect empirical data on my subjects. On the other hand, I am a compassionate teacher (I think), and I really do want all my students to do well.
As a test developer I need a profound understanding of what I am testing and why, and I carry responsibility for the test scores I report. As a teacher I really want to get students excited about testing in the same way amateur or professional athletes look forward to the next competition. As a test developer I am not a policeman, and I should not be enforcing an examination from an ivory tower! As a teacher I need to evaluate my students and to be able to place them in an appropriate learning environment.
To bridge this gap, language examination boards must demonstrate that tests are fair, valid and reliable. Furthermore, we must have complete transparency in development, delivery and reporting. This is a tall order, but luckily we can rely on some excellent resources.
First, let’s look at reliability. Reliability in testing means consistency: a test with reliable scores produces the same or similar result on repeated use. To ensure reliability, we need a common scale to measure test taker ability, namely the Common European Framework of Reference for Languages: Learning, Teaching, Assessment (CEFR).
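Consistency on repeated use can be illustrated with a toy test–retest check. The sketch below is purely illustrative (the candidate scores are invented and this is not telc's actual procedure): it correlates two sittings of the same test, and a coefficient close to 1.0 suggests the scores are consistent, i.e. reliable.

```python
# Illustrative sketch: test-retest reliability as the correlation between
# two administrations of the same test. All scores are invented.

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two score lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x ** 0.5 * var_y ** 0.5)

# The same five candidates sit the same test twice.
first_sitting = [52, 61, 70, 45, 88]
second_sitting = [50, 64, 69, 47, 85]

r = pearson_r(first_sitting, second_sitting)
print(round(r, 3))  # a value near 1.0 indicates consistent scoring
```

In practice, examination boards use more sophisticated indices (and much larger samples), but the underlying idea is the same: repeated measurement of the same ability should give the same answer.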
But high reliability does not necessarily mean that a test is good or that interpretations of its results are valid. We also need to consider validity: does the test really measure what we intend it to measure? If the test is intended to measure language skills, and people score systematically higher or lower according to their language ability, then the test is valid. Validity relates to the degree to which evidence and theory support the interpretations of test scores entailed by the proposed uses of tests.
Language tests should support inference to some domain of target language use. In essence, we must first state what we expect a test taker to be able to do using language in the real world, and then decide whether the test provides good evidence of their ability to do so. The CEFR outlines a useful approach to define achievement in specific domains of use with the illustrative descriptors.
Together with the CEFR we have additional resources to help determine expectations and interpret results: English Profile, Relating Language Examinations to the Common European Framework of Reference for Languages: Learning, Teaching, Assessment (CEFR), the Companion Volume with New Descriptors and the Manual for Language Test Development and Examining.
So, using this toolkit, EFL/ESL professionals can develop a wide range of language exams that accurately and reliably measure a test taker’s ability. But just because we can assess language skills doesn’t mean we must.
This leads us to the fundamental consideration: why are we testing? And more importantly, how do we make the test palatable for all stakeholders? The development of a test is a complex and detailed process, which begins with the decision to provide a test. This decision is prompted either by a third-party sponsor asking for a specific test format or by our identifying an emerging need for evaluation.
We can place the Dynamic Placement Test (DPT) within this context. ClarityEnglish, together with telc – language tests, has identified the necessity and usefulness of placement testing. This is not only because more and more institutions implement a test for placement, but also because studies have shown that the use of a placement test contributes dramatically to a student’s success in language learning. Our goal? To develop a fair and standardised test that is useful for teachers, valid and reliable for test developers and, most importantly, transparent for test takers.
Placement tests are meant to determine a student’s language skill level so that with the test results in hand, an adviser and student can sit down and determine a course that would best suit the student. A class below the student’s ability would not benefit their education, and a class far above their ability could prove frustrating. The test scores help find a course that will challenge the student without seeming impossible to understand. At least that’s the idea.
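The placement step itself can be sketched as a simple mapping from a test score to a CEFR band and, from there, to a course. The cut scores below are invented for illustration; a real test such as the DPT would set them through standard-setting studies, not by fiat.

```python
# Hypothetical sketch of placement: map a raw score to a CEFR level.
# The cut scores are invented for illustration only.

CUT_SCORES = [
    (90, "C1"),
    (75, "B2"),
    (60, "B1"),
    (40, "A2"),
    (0, "A1"),
]

def place(score):
    """Return the CEFR level whose band the score falls into."""
    for minimum, level in CUT_SCORES:
        if score >= minimum:
            return level
    return "A1"

print(place(68))  # → B1
```

The adviser and student then discuss courses pitched at that level: high enough to challenge, not so high as to frustrate.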
But a problem arises. How can tests accurately predict a student’s language level and put that student into the appropriate learning environment? In other words, what is the connection between what was tested and what is being taught?
Here we come across an issue facing many institutions, one that is not always evident: does the language test accurately reflect the requirements of the institution? This goes back to the fundamental question posed above when developing a test.
Developing the DPT, we consulted a wide range of stakeholders, including (but not limited to) ministries and government bodies, publishers, language schools, parents, experts, employers, educational institutions and administrative centers. In this stage we were looking for answers to questions such as:
- What are the characteristics of the test takers to be tested? (Age, gender, mother tongue, etc.)
- What is the purpose of the test? (Immigration, university admission, professional requirements, etc.)
- How does the test relate to an educational context? (A curriculum, a methodological approach, learning objectives, etc.)
- How will the results be used?
The CEFR has become the standard guideline used to describe achievements of learners of foreign languages and was put together by the Council of Europe between 1989 and 1996. The CEFR has since become accepted across the world, sometimes with local variations, as the standard for grading an individual’s language proficiency. Its main aim is to provide a standard method of learning, teaching and assessing language skills.
Looking at the areas of learning, teaching and assessing we found that more and more institutions are implementing these scales. For example, the largest publishers of textbooks (Macmillan, Oxford University Press, Pearson) base their EFL/ESL material on the CEFR. Looking at assessment, the major test boards (IELTS, Pearson, Cambridge, telc) are also using the CEFR for test development. The CEFR is the standard of measurement in language learning and assessment.
Although most institutions, wherever they are in the world, already implement their own placement tests, these are generally neither standardised nor CEFR-based. This can be problematic, especially if students are assessed with CEFR scales once they get into the classroom. Why? Our experience has revealed that most “home-brewed” placement tests are not based on a standardised scale such as the CEFR and are highly subjective. In terms of fairness, and to ensure transparency, all test takers must be assessed with a uniform scale – the same scale they will face in the classroom or in standardised tests (such as IELTS, Cambridge or telc).
A placement test is an invaluable contribution to a student’s academic success. Firstly, it gives students a reliable, standardised measurement of their language skills. Secondly, because it is based on the CEFR, it is an accurate reflection of classroom expectations, so the student can effectively be placed in a productive learning environment. Thirdly, the placement test is in essence a truncated certificate or proficiency test, using the same measurement system and the same methodology.
A well-designed placement test, such as the DPT, gives the student the best opportunities for success – in the classroom, in a proficiency exam and later in practice.