Unit 10. Assessment in EFL

1. Introduction and objectives

As EFL professionals, many of us will have experienced EFL assessment as a student, as a teacher, or as both. We have passed through that process assuming, for the most part, that the exams and the resulting grades are fair, objective and reliable.

“Language testing, like all educational measurement, assumes that the conditions should be standardised and optimal” (Fulcher, 2010, p. 205).

For these conditions to be met, we generally need to resort to standardised or norm-referenced tests. It is also likely, especially in Europe, that those tests are linked in some way to the Common European Framework of Reference for Languages (CEFR).

In this unit, we will consider the CEFR as an assessment tool, the characteristics of standardised EFL testing, and the different standardised EFL tests available internationally.

At the end of this unit, students should be able to:
- Understand the characteristics of the Common European Framework of Reference for Languages (CEFR) as a tool in assessment.
- Recognise the opportunities offered by standardised EFL tests.
- Understand the limitations of standardised EFL testing.
- Choose the appropriate standardised EFL test depending on the learner's needs.

2. CEFR and assessment

The Common European Framework of Reference for Languages (CEFR) was first published in 2001 with the goal of providing “a common basis for the elaboration of language syllabuses, curriculum guidelines, examinations, textbooks, etc. across Europe” (Council of Europe, 2001). In Europe, it is very widely used as the main reference for language proficiency and assessment. Outside Europe, the majority of norm-referenced EFL tests provide at least a corresponding CEFR equivalent with which their grades can be compared.

The CEFR divides language proficiency into five skills (speaking is broken into two subskills):
- Listening.
- Reading.
- Spoken interaction.
- Spoken production.
- Writing.

Each of the five skills is rated according to six levels: A1-A2 (Basic User); B1-B2 (Independent User); C1-C2 (Proficient User). These can be seen below:

Figure 1. Breakdown of the six levels of language proficiency according to the CEFR.

The CEFR provides level descriptors that are designed to enable learners and assessors to identify the correct proficiency level for each skill. An example is the self-assessment grid in table 1, which is designed to enable language learners to self-assess their proficiency in each skill in relation to the CEFR.

Table 1. Self-assessment grid. Source: /

2.1. CEFR and EFL assessment

According to Cambridge University, one of the main producers of high-stakes EFL examinations developed in alignment with the CEFR, test providers should be able to answer the following “good practice” questions about test design (2011, p. 17):
- Does the test provider adequately explain how CEFR-related results may be used?
- Is there appropriate evidence to support these recommendations?
- Can the test provider show that they have built CEFR-related good practice into their routine?
- Can the test provider show that they maintain CEFR-related standards appropriately?

Let us now look at the characteristics of standardised tests.

3. Considering standardised testing

We briefly saw at the beginning of the subject that speaking assessment can be carried out both formally and informally. Standardised testing clearly falls under the umbrella of formal assessment.

Why standardised tests?
- They are general in nature because students vary greatly in the abilities measured.
- They are designed for the purpose of deciding who should be admitted into a program or placed in a group.
- They are usually made up of subtests.
- They have scores that are interpreted in terms of each student's relative position in the distribution of scores for all students in a population.
- They are designed to make teachers accountable (politics).

Examples of standardised tests:
- EBAU (college entrance exams).
- English level exams: TOEFL, TOEIC, Cambridge Exams (First, Advanced), Trinity College, etc.
- State exams (“oposiciones”).
- Other?

Are these standardised tests also high-stakes tests? Consider the following:
- Tester bias (the way we formulate the question).
- Distractors.
- Validity: are we testing what we intend to test (e.g. grammar or vocabulary)?
- Sudden death.

3.1. What is standardised testing?

A standardised test is one that has clearly defined and fixed:
- Test items.
- Test scoring.
- Conditions of administration.
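The norm-referenced idea mentioned in this section, where a score is interpreted as a student's relative position in the distribution of scores for a population, can be illustrated with a percentile rank. The Python sketch below is only an illustration: the function name, the tie-handling convention and the sample scores are assumptions made for the example and are not taken from any real test.

```python
def percentile_rank(score, all_scores):
    """Percentage of the reference group scoring below `score`.

    Ties are counted as half, which is one common convention among several.
    """
    if not all_scores:
        raise ValueError("the reference group must not be empty")
    below = sum(1 for s in all_scores if s < score)
    ties = sum(1 for s in all_scores if s == score)
    return 100.0 * (below + 0.5 * ties) / len(all_scores)


# Hypothetical reference group of ten test takers (invented data)
reference_scores = [42, 55, 55, 61, 68, 70, 74, 80, 85, 93]

print(percentile_rank(70, reference_scores))  # 55.0
```

The same raw score of 70 would receive a different percentile rank in a stronger or weaker reference group, which is why the representativeness of the trial sample matters so much in standardised testing.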

“A crucial element of robust assessment is standardisation. A standardised test is one that has been trialled with a nationally representative sample of pupils. The outcomes from standardised tests enable you to benchmark your pupils against the national average and to compare performance on different tests.” (NFER, n. d.)

To achieve standardisation, extraneous variables need to be controlled. Test development follows a series of rigorous, predetermined stages from inception to end. Most standardised tests include at least test specification, item writing, trialling, standardisation and setting (NFER, n. d.):

Figure 2. Understanding tests. Source: NFER (n. d.).

The test items, scoring and conditions of administration need to be fixed for all examiners and test administrators in order to minimise variability. It is precisely this minimised variability that gives rise to the term standardisation.

3.2. Scorability

“Scorability” refers to the ease and accuracy with which tests can be marked. Common ways to increase the scorability of tests are:
- Using separate marking sheets. This makes it easier for the marker to award the marks quickly and accurately. This is particularly true of standardised testing, but it is a technique that can also be used in classrooms.
- In standardised testing, the answer sheets (listening and reading) are marked by either:
  o Templates. These can also be used in class, though they may be time-consuming to prepare.
  o Automatic marking machines.

Many teachers now use digital apps to create tests that are scored automatically and avoid manual calculations. While this can take more initial preparation time, many teachers say it saves time in the long run.
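Marking with a template, whether by hand or by machine, amounts to comparing each response on the answer sheet with a fixed answer key. The following Python sketch shows the idea under simple assumptions: one mark per item, no negative marking, and an answer key and response sheet invented for the example.

```python
def score_answer_sheet(responses, answer_key):
    """Count the responses that match the key.

    Wrong or blank (None) answers score 0; each correct answer scores 1.
    """
    if len(responses) != len(answer_key):
        raise ValueError("answer sheet and key must have the same number of items")
    return sum(1 for given, correct in zip(responses, answer_key) if given == correct)


answer_key = ["B", "A", "D", "C", "A"]   # hypothetical key
responses = ["B", "C", "D", None, "A"]   # hypothetical sheet; None = left blank

print(score_answer_sheet(responses, answer_key))  # 3
```

Because the key is fixed, two markers (or two machines) will always award the same score for the same sheet, which is what makes this kind of item highly scorable.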

Automated scoring in EFL testing

Most well-known standardised language tests rely on automated scoring for at least some of their parts. Cambridge, for example, uses automated scoring for the reading and listening papers, with the test taker having to mark the box in pencil. Test takers are required to answer on an answer sheet which is separate from the question sheet. This may negatively affect some test takers, who may make mistakes when transferring information from one sheet to the other, either writing the wrong information in a box or running out of time (even though extra time is allotted for this).

Figure 3. An example of an automated scoring sheet. Source: https://www.topuniversities.com/

TOEFL uses automated scoring in conjunction with human scoring for its writing section, with the automated scores closely replicating those of human scorers (Fulcher, 2010, p. 216).

According to the OECD (2013, p. 174), these standardised assessment criteria and scoring rubrics “contribute to facilitate the assessment of student work against national curricula, standards or learning progressions”:
- Performance criteria refer to guidelines, rules or principles by which student performances or products are judged. They describe what to look for in students’ performances or products to judge quality.
- Rubrics refer to scoring tools containing performance criteria and a performance scale with all score points described and defined.
- Exemplars refer to examples of actual products or performances to illustrate the various score points on a scale.

3.3. Why use standardised testing?

Despite the obvious drawbacks of this type of high-stakes summative assessment, such as washback and sudden death (already seen in the subject), there is a need for scores and certificates obtained through standardised testing, as we will see in the following section.

Standardised testing normally takes place at set times throughout the year, most frequently towards the end of the school year, either because school districts and schools require a systematic approach to evaluation in order to track learning trends, or because parents or pupils want or need a language certificate.

The results can show:
- The student's growth over time.
- The student's performance as compared with his or her grade-level peers.

This information enables:
- Test takers to see their performance change from one test to the next.
- Teachers to see individual learning trends, which enables them to spot difficulties and modify instruction.
- Teachers to know the average performance for the class.
- Schools to see learning trends per class/teacher and spot strengths and potential weaknesses.
- School districts to compare performance between schools and with other school districts.

3.4. The need for standardised testing as proof of language level

In many countries, language learners need to certify their language level. For example, teachers in Spain who teach content in an additional language need to prove they have obtained a B2 or C1, depending on the region where they work. Here, certain organisations responsible for standardised testing have been authorised by the educational authorities to certify test takers' language levels as proof of the level required to teach in state and chartered schools.

For example, the Madrid Educational Authorities cite the following certificates of standardised testing accepted as proof of a certain language level (BOCM-20170109-4, 22nd December):

Qualifications that can be consulted electronically by the Educational Authority:
o Degree in English philology; Translation and Interpreting in English, English Studies.
o Certificate of C1 from an Official Language School (Escuelas Oficiales de Idiomas).
For qualifications that cannot be consulted electronically by the Educational Authority, the applicant should bring:
o University Degree or Master undertaken in the English language in an English-speaking country.
o Cambridge Certificate of Proficiency (CPE) or Advanced Certificate (CAE).
o Certificate ISE III or ISE IV of Trinity College London.
o GESE 8-12 of Trinity College London.
o TOEFL iBT, with a minimum score of 26 in listening, 28 in reading, 28 in speaking, 28 in writing and a total minimum score of 110.
o IELTS, with a total score of 7.
o TOEIC with a minimum score of 490 in listening, 455 in reading, 200 in speaking, 200 in writing and a minimum total score of 1345.
o APTIS for Teachers/APTIS General from British Council, certifying a C1.
o Business Language Testing Service (BULATS), with a total minimum score of 89.
o Certificates issued by institutions of recognised prestige which certify the candidate has reached a communicative proficiency of at least C1 (CEFR) in all four skills.
o University Master in Applied Linguistics; English Translation; TEFL; English Studies; PhD related to English Studies.
o Pearson Test of English General Level 4 (C1) or Level 5 (C2).

Standardised test scores can be very useful for organisations such as universities and educational authorities that need to set a “minimum requirement” for language proficiency. This seems highly sensible for professions where language is central to the ability to do the job satisfactorily. After all, how can you teach English effectively without having obtained at least a certain proficiency in the language?

Possible negative effects of using standardised testing as proof of proficiency

Nevertheless, language requirements can also be set somewhat arbitrarily. It is common in job advertisements to see a language requirement for a post where the worker would only very infrequently use the language. But as so many candidates apply, potential employers may be tempted to think “well, let us ask for a language certificate just in case”.

Fulcher (2010) tells of a case where language requirements were used as a criterion for immigration control in Australia. In 2009, the standard was raised from 4.5 to 5 on IELTS because the economic crisis had reduced the need for imported labour. Fulcher comments that moving the standard to 5.0 had the effect of reducing immigration, with some organisations lobbying for the standard to be raised even further to IELTS 6.0. While it would appear likely that there is a connection between language competence and the ability to work successfully as an engineer in Australia, “there is no evidence to suggest that a band 6 on IELTS is ‘the standard’ that is most appropriate” (2010, p. 227).
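Several of the minimum requirements listed earlier in this section combine a minimum score per skill with a minimum total score. As a rough illustration of how such a requirement works as a cut-off, the Python sketch below checks a candidate's scores against the TOEFL iBT figures quoted in that list. The function name and the two candidates are invented for the example; this is not an official tool of any examination board or educational authority.

```python
# Minimum TOEFL iBT scores as quoted in the list of accepted certificates
TOEFL_IBT_MINIMUMS = {"listening": 26, "reading": 28, "speaking": 28, "writing": 28}
TOEFL_IBT_MIN_TOTAL = 110


def meets_requirement(scores, minimums=TOEFL_IBT_MINIMUMS, min_total=TOEFL_IBT_MIN_TOTAL):
    """Return True only if every per-skill minimum and the total minimum are met."""
    missing = set(minimums) - set(scores)
    if missing:
        raise ValueError(f"missing section scores: {sorted(missing)}")
    sections_ok = all(scores[skill] >= minimum for skill, minimum in minimums.items())
    total_ok = sum(scores[skill] for skill in minimums) >= min_total
    return sections_ok and total_ok


# Hypothetical candidates (invented scores)
candidate_a = {"listening": 28, "reading": 29, "speaking": 28, "writing": 28}  # total 113
candidate_b = {"listening": 30, "reading": 30, "speaking": 27, "writing": 30}  # total 117

print(meets_requirement(candidate_a))  # True
print(meets_requirement(candidate_b))  # False: speaking is below 28
```

Candidate B has the higher total but is rejected because of a single point in one skill, which shows how a fixed cut-off can produce the kind of all-or-nothing outcome discussed above.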

3.5. Cambridge English

In Europe, the Cambridge English exams are among the most frequently used standardised language tests, both in schools and in language centres. The tests are closely aligned to the CEFR, ranging from pre-A1 for very basic language skills, such as those of young children starting to learn an L2 in school (i.e. the Young Learners English series (YLE), the lowest of which is Starters), right up to C2 Proficiency level.

The range of Cambridge tests and their alignment to the CEFR can be seen in figure 4:

Figure 4. Cambridge English Assessment according to the CEFR.

For the Cambridge English language exams, each exam level has different tasks adapted specifically to that level. In tables 2 and 3 we can compare possible writing tasks from the B1 Preliminary test with those from the C1 Advanced.

Table 2. Summary of Cambridge Preliminary B1 Writing tasks.

Table 3. Summary of Cambridge Advanced C1 Writing tasks.

IELTS

While Cambridge is very well known for its range of exams catering to schools and higher education, Business Vantage and the International English Language Testing System (IELTS) also belong to Cambridge. The latter is an examination “designed to assess the English language ability of people whose first language is not English and who need to study, work or live where English is used as the language of communication” (IELTS, n. d.). Again, both exams are closely linked to the CEFR:

Figure 5. IELTS according to the CEFR. Source: oneuropean-framework

3.6. Aptis

Aptis is an EFL test designed by the British Council. It is aligned with the Common European Framework of Reference for Languages and reports levels A1, A2, B1, B2 and C.

It tests the four skills: speaking, listening, reading and writing. Below, we can see how these four skills are all connected with the core component of grammar and vocabulary. When a candidate's score falls in between two levels, Aptis uses the score obtained in the core grammar and vocabulary component to award either the higher or the lower level (Dunn, 2019, p. 8).

It is worth noting that a major difference between Aptis and some of the most well-known tests on offer is that test takers do not need to have all four skills assessed. Depending on the needs of the organisation or test taker, they can apply for different skills to be tested. The core component of grammar and vocabulary is always tested.

Figure 6. The Aptis core component feeds into CEFR level allocation for all four skill areas. /files/aptis scoring system layout final.pdf

Aptis includes the Aptis General exam, which is designed for general language certification, and variants such as Aptis for teachers and Aptis Advanced. Below we can see the different tasks for Aptis General.

Skills and Tasks in Aptis EFL test

TEST COMPONENTS | LENGTH (minutes) | SCORING | TASKS
Grammar and vocabulary | 25 | 0 - 50 | 60
Speaking | 12 | CEFR A1 - C / 0 - 50 | 4
Writing | 50 | CEFR A1 - C / 0 - 50 | 3
Listening | 50 | CEFR A1 - C / 0 - 50 | 28 (+/- 4)
Reading | 30 | CEFR A1 - C / 0 - 50 | 4

Table 4. Aptis General. Adapted from:

Aptis variants

As mentioned, Aptis produces variants which have been designed to be used in more specific cases. These include Aptis for teachers and Aptis Advanced. In the table below we can see the different CEFR language levels each variant has been designed for.

Table 5. CEFR level coverage for major Aptis variants. /files/aptis scoring system layout final.pdf

Aptis for teachers

The variant Aptis for teachers was specifically designed for use in education. According to the Aptis website (n. d.), Aptis for teachers can be used by:
- English teachers working in schools, colleges or universities around the world.
- Teachers of other subjects in schools, colleges or universities around the world.
- English teachers in large-scale language programs, including commercially run programs.
- Students on teacher training or university programs.
- Other professionals working in educational contexts.

The structure of the exam replicates that of Aptis General (see table 4). For teachers in Spain, Aptis has been accepted by the educational authorities as proof of the language level required to teach in state and chartered schools (British Council, 2020).

Aptis Advanced

For higher levels, students can opt for Aptis Advanced. Aptis Advanced has its optimal point of discrimination at levels C1 and C2 (British Council, 2020).

Skills and Tasks in Aptis EFL test

TEST COMPONENTS | LENGTH (minutes) | SCORING | TASKS
Grammar and vocabulary | 25 | 0 - 50 | 50
Speaking | 10 | CEFR B1 - C2 / 0 - 50 | 3
Writing | 45 | CEFR B1 - C2 / 0 - 50 | 3
Listening | 30 | CEFR B1 - C2 / 0 - 50 | 28 (+/- 4)
Reading | 60 | CEFR B1 - C2 / 0 - 50 | 4

Table 6. Aptis Advanced. Adapted from: ed

3.7. Trinity

The exams commonly known as “Trinity” are administered by Trinity College London. Trinity has many different types of EFL exams for different ages and needs. These include (Trinity College, 2020):
- Trinity Stars: for children from ages three to twelve. English language learning through drama, music and performance.
- GESE: graded exams in speaking. Tests English speaking and listening skills. 12 grades (pre-A1 to C2) available for learners of all ages from beginners to advanced. These are frequently used by different education authorities in Spain in Bilingual Primary Education.
- ISE: integrated skills in English. Tests speaking, listening, reading and writing. Five levels (A2 to C2) available for people who need English skills for study and work.
- ESOL Skills for Life (UK): tests speaking, listening, reading and writing. Pre-entry certificates Entry Level 1 to Level 2 qualifications for adult learners in the UK.
- ESOL step 1 and step 2 (UK): tests speaking and listening skills of adults in the UK aged 16 or over who are not yet ready to take an Entry level ESOL Skills for Life qualification.
- SELTs: Secure English Language Tests (UK). Home Office UKVI-approved GESE A1, A2, B1 and ISE B1, B2, C1 SELTs for visa, citizenship, settlement and other SELT purposes.
- English Language Teaching Qualifications: CertTESOL, TYLEC and DipTESOL. Trinity Certificates and Diplomas in Teaching English to Speakers of Other Languages (TESOL).

Figure 7. Trinity Exams according to the CEFR. Source: https://www.trinitycollege.com/resource/?id 6733

3.8. TOEFL

The Test of English as a Foreign Language (TOEFL) iBT is designed to measure the student's ability to use and understand English. It was primarily designed for non-native applicants to US universities (Buck, 2001, p. 216). It evaluates how well test takers combine reading, listening, speaking and writing skills to perform academic tasks. The test was redesigned in August 2019.

Below we can see a summary of the new exam format:

TOEFL iBT Test Sections
- Reading: read 3 or 4 passages from academic texts and answer questions.
- Listening: listen to lectures, classroom discussions and conversations, then answer questions.
- Break: 10 minutes.
- Speaking: 17 minutes; 4 tasks. Express an opinion on a familiar topic; speak based on reading and listening tasks.
- Writing: 50 minutes; 2 tasks. Write essay responses based on reading and listening tasks; support an opinion in writing.

Table 7. TOEFL iBT test sections. Source: https://www.ets.org/toefl/ibt/about/content/

TOEFL uses performance descriptors for each skill which correlate to the CEFR levels A2 to C1. We can see an example of the performance descriptors used in speaking below (table 8).

Table 8. Performance descriptors for the TOEFL iBT test. Source:

TOEIC

The Test of English for International Communication (TOEIC), like TOEFL, is a standardised test that belongs to Educational Testing Service (ETS). Whereas TOEFL is aimed at students wanting to access international universities, TOEIC is aimed at professional users and the workplace, and has a huge testing programme, with 1.8 million candidates in 2000, mainly in Asia (Buck, 2001, p. 210).

The uniqueness of the TOEIC exams in comparison with other widely known standardised EFL tests is highlighted when we consider the sources, settings, situations and formats that are found in TOEIC exams (Educational Testing Service, 2017, p. 3):
- Corporate development: research, product development.
- Dining out: business and informal lunches, banquets, receptions, restaurant reservations.
- Entertainment: cinema, theatre, music, art, exhibitions, museums, media.
- Finance and budgeting: banking, investments, taxes, accounting, billing.
- General business: contracts, negotiations, mergers, marketing, sales, warranties, business planning, conferences, labour relations.
- Health: medical insurance, visiting doctors, dentists, clinics, hospitals.
- Housing/corporate property: construction, specifications, buying and renting, electric and gas services.
- Manufacturing: assembly lines, plant management, quality control.
- Offices: board meetings, committees, letters, memoranda, telephone, fax and email messages, office equipment and furniture, office procedures.
- Personnel: recruiting, hiring, retiring, salaries, promotions, job applications, job advertisements, pensions, awards.
- Purchasing: shopping, ordering supplies, shipping, invoices.
- Technical areas: electronics, technology, computers, laboratories and related equipment, technical specifications.
- Travel: trains, airplanes, taxis, buses, ships, ferries, tickets, schedules, station and airport announcements, car rentals, hotels, reservations, delays and cancellations.

The TOEIC tests each of the four skills separately, allowing test takers to choose which papers they wish to sit. The TOEIC was updated in 2017 with numerous changes to its format. Below we can see the format for the writing paper:

TOEIC Writing test
- Questions 1-5. Task: write a sentence based on a picture. Description: You will write one sentence that is based on a picture. With each picture, you will be given two words or phrases that you must use in your sentence. You can change the forms of the words and you can use the words in any order.
- Questions 6-7. Task: respond to a written request. Description: You will show how well you can write a response to an e-mail. You will have 10 minutes to read and answer each e-mail.
- Question 8. Task: write an opinion essay. Description: You will write an essay in response to a question that asks you to state, explain and support your opinion on an issue. Typically, an effective essay will contain a minimum of 300 words.

Table 9. TOEIC writing test. Source: ing/about/content-format

Standardised tests in EFL

Have you ever taken or administered any of these standardised tests? If so, what was your impression regarding fairness (i.e. ensuring equity and attending to students' needs)?
