Linking The Texas STAAR Assessments To NWEA MAP Growth Tests


* As of June 2017, Measures of Academic Progress (MAP) is known as MAP Growth.

March 2016

Introduction

Northwest Evaluation Association (NWEA) is committed to providing partners with useful tools to help make inferences from the Measures of Academic Progress (MAP) interim assessment scores. One important tool is the concordance table between MAP and state summative assessments. Concordance tables have been used for decades to relate scores on different tests measuring similar but distinct constructs. These tables, typically derived from statistical linking procedures, provide a direct link between scores on different tests and serve various purposes. Aside from describing how a score on one test relates to performance on another test, they can also be used to identify benchmark scores on one test corresponding to performance categories on another test, or to maintain continuity of scores on a test after the test is redesigned or changed. Concordance tables help educators, parents, administrators, researchers, and policy makers evaluate academic standing and growth.

Recently, NWEA completed a concordance study to connect the scales of the State of Texas Assessments of Academic Readiness (STAAR) reading and math with those of the MAP Reading and MAP for Mathematics assessments. In this report, we present the 3rd through 8th grade cut scores on MAP reading and mathematics scales that correspond to the benchmarks on the STAAR reading and math tests. Information about the consistency rate of classification based on the estimated MAP cut scores is also provided, along with a series of tables that predict the probability of receiving a Level II (i.e., “Satisfactory”) or higher performance designation on the STAAR assessments, based on the observed MAP scores taken during the same school year. A detailed description of the data and analysis method used in this study is provided in the Appendix.

Overview of Assessments

STAAR includes a series of vertically scaled achievement tests aligned to the Texas state curriculum, the Texas Essential Knowledge and Skills (TEKS), in math and reading for grades 3-8, writing for grades 4 and 7, science for grades 5 and 8, social science for grades 5 and 8, and end-of-course assessments for English I, English II, Algebra I, biology, and U.S. history. STAAR tests can be delivered online or in paper-and-pencil form. For each grade and subject, there are two cut scores that distinguish between three performance levels: Level I: Unsatisfactory Academic Performance, Level II: Satisfactory Academic Performance, and Level III: Advanced Academic Performance. The Level II cut score marks the minimum level of performance considered to be “Proficient” for accountability purposes.

MAP tests are vertically scaled interim assessments that are administered in the form of a computerized adaptive test (CAT). MAP tests are constructed to measure student achievement from Grades K to 12 in math, reading, language usage, and science, and are aligned to the TEKS standards. MAP scores are reported on a Rasch Unit (RIT) scale with a range from 100 to 350. Each subject has its own RIT scale. To aid interpretation of MAP scores, NWEA periodically conducts norming studies of student and school performance on MAP. For example, the 2015 RIT Scale norming study (Thum & Hauser, 2015) employed multi-level growth models on nearly 500,000 longitudinal test scores from over 100,000 students that were weighted to create large, nationally representative norms for math, reading, language usage, and general science.

Estimated MAP Cut Scores Associated with STAAR Readiness Levels

Tables 1 to 4 report the STAAR scaled scores associated with each of the three performance levels, as well as the estimated score range on the MAP tests associated with each STAAR performance level. Specifically, Tables 1 and 2 apply to MAP scores obtained during the spring testing season for reading and math, respectively. Tables 3 and 4 apply to MAP tests taken in a prior testing season (fall or winter) for reading and math, respectively. The tables also report the percentile rank (based on the NWEA 2015 MAP Norms) associated with each estimated MAP cut score. The MAP cut scores can be used to predict students’ most probable STAAR performance level, based on their observed MAP scores. For example, a 6th grade student who obtained a MAP math score of 240 in the spring testing season is likely to be at the very high end of Level II (Satisfactory) on the STAAR taken during that same testing season (see Table 2). Similarly, a 3rd grade student who obtained a MAP reading score of 210 in the fall testing season is likely to be at Level III (Advanced) on the STAAR taken in the spring of 3rd grade (see Table 3).
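The lookup described above, from an observed MAP score to the most probable STAAR performance level, can be sketched as follows. This is a minimal illustration rather than NWEA's tooling: the function name is hypothetical, and the two cut scores in the usage lines (218 and 254) come from the one MAP fall math row that survives in this copy of Table 4, whose grade label is not recoverable.

```python
from bisect import bisect_right

LEVELS = (
    "Level I (Unsatisfactory)",
    "Level II (Satisfactory)",
    "Level III (Advanced)",
)


def predict_staar_level(rit_score, level2_cut, level3_cut):
    """Return the most probable STAAR level for a MAP RIT score.

    level2_cut and level3_cut are the lowest RIT scores that fall in
    Level II and Level III for the grade, subject, and testing season.
    """
    if not 100 <= rit_score <= 350:
        raise ValueError("RIT scores are reported on a 100-350 scale")
    # bisect_right counts how many cut scores the observed score has reached.
    return LEVELS[bisect_right([level2_cut, level3_cut], rit_score)]


# Cut scores from the surviving MAP fall math row of Table 4 (grade unknown).
print(predict_staar_level(217, 218, 254))
print(predict_staar_level(240, 218, 254))
print(predict_staar_level(254, 218, 254))
```

Keeping the cut scores as arguments, rather than hard-coding them, reflects that each grade, subject, and testing season has its own pair of cuts.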

TABLE 1. CONCORDANCE OF PERFORMANCE LEVEL SCORE RANGES BETWEEN STAAR AND MAP READING (WHEN MAP IS TAKEN IN SPRING)

[Table body largely lost in extraction. Columns: Grade; Level I Unsatisfactory; Level II Satisfactory; Level III Advanced, reported once for STAAR scale scores and once for MAP. One STAAR row survives without its grade label: Level I ending at 1586; Level II 1587-1782; Level III 1783-2300.]

Notes. 1. %ile = percentile. 2. Bolded numbers indicate the cut scores considered to be at least “proficient” for accountability purposes.

TABLE 2. CONCORDANCE OF PERFORMANCE LEVEL SCORE RANGES BETWEEN STAAR AND MAP MATH (WHEN MAP IS TAKEN IN SPRING)

[Table body largely lost in extraction. Columns: Grade; Level I Unsatisfactory; Level II Satisfactory; Level III Advanced, reported once for STAAR scale scores and once for MAP. One STAAR row survives without its grade label: Level I ending at 1594; Level II 1595-1853; Level III 1854-2300.]

Notes. 1. %ile = percentile. 2. Bolded numbers indicate the cut scores considered to be at least “proficient” for accountability purposes.

TABLE 3. CONCORDANCE OF PERFORMANCE LEVEL SCORE RANGES BETWEEN STAAR AND MAP READING (WHEN MAP IS TAKEN IN FALL OR WINTER PRIOR TO SPRING STAAR TESTS)

[Table body largely lost in extraction. Columns: Grade; Level I Unsatisfactory; Level II Satisfactory; Level III Advanced, reported for STAAR scale scores and, with RIT and %ile sub-columns, for MAP FALL and MAP WINTER. One row of each block survives without its grade label:
STAAR: Level I ending at 1586; Level II 1587-1782; Level III 1783-2300.
MAP FALL: Level I RIT up to 206 (%ile 1-24); Level II RIT 207-232 (%ile 25-83); Level III RIT 233-350 (%ile 84-99).
MAP WINTER: Level I RIT up to 209 (%ile 1-26); Level II RIT 210-233 (%ile 27-82); Level III RIT 234-350 (%ile 83-99).]

Notes. 1. %ile = percentile. 2. Bolded numbers indicate the cut scores considered to be at least “proficient” for accountability purposes.

TABLE 4. CONCORDANCE OF PERFORMANCE LEVEL SCORE RANGES BETWEEN STAAR AND MAP MATH (WHEN MAP IS TAKEN IN FALL OR WINTER PRIOR TO SPRING STAAR TESTS)

[Table body largely lost in extraction. Columns: Grade; Level I Unsatisfactory; Level II Satisfactory; Level III Advanced, reported for STAAR scale scores and, with RIT and %ile sub-columns, for MAP FALL and MAP WINTER. Surviving rows, without grade labels:
STAAR: Level I ending at 1594; Level II 1595-1853; Level III 1854-2300.
MAP FALL: Level I RIT up to 217 (%ile 1-31); Level II RIT 218-253 (%ile 32-93); Level III RIT 254-350 (%ile 94-99).
MAP WINTER: no values survive.]

Notes. 1. %ile = percentile. 2. Bolded numbers indicate the cut scores considered to be at least “proficient” for accountability purposes.

Consistency Rate of Classification

Consistency rate of classification (Pommerich, Hanson, Harris, & Sconing, 2004), expressed as a rate between 0 and 1, provides a means to measure the departure from equity for concordances (Hanson et al., 2001). This index can also be used as an indicator of the predictive validity of the MAP tests, i.e., how accurately MAP scores can predict a student’s proficiency status on the STAAR test. For each pair of concordant scores, a classification is considered consistent if the examinee is classified into the same performance category regardless of the test used for making the decision. For the “proficient” concordant scores, the consistency rate reported here is the percentage of examinees who score at or above both concordant scores plus the percentage of examinees who score below both concordant scores. A higher consistency rate indicates stronger congruence between STAAR and MAP cut scores. The results in Table 5 demonstrate that, on average, MAP reading scores consistently classify students’ proficiency (Level II or higher) status on the STAAR reading test 87% of the time, and MAP math scores consistently classify students on the STAAR math test 87% of the time. These rates are high, suggesting that both MAP reading and math tests are strong predictors of students’ proficiency status on the STAAR tests.

TABLE 5. CONSISTENCY RATE OF CLASSIFICATION FOR MAP AND STAAR LEVEL II EQUIPERCENTILE

[Remainder of the caption and most of the table body lost in extraction. Surviving consistency-rate values, without row labels: .88, 0.88, 0.87, 0.86, 0.84. Surviving values next to the False Positives headers: .12, 0.09.]

Proficiency Projection

Proficiency projection tells how likely a student is to be classified as “proficient” on STAAR tests based on his/her observed MAP scores. The conditional growth norms provided in the 2015 MAP Norms were used to calculate this information (Thum & Hauser, 2015). The results of proficiency projection and the corresponding probability of achieving “proficient” on the STAAR tests are presented in Tables 6 to 8. These tables estimate the probability of scoring at Level II or above on the spring STAAR, given a MAP score from the spring or from the prior fall or winter testing season. For example, if a 3rd grade student obtained a MAP math score of 190 in the fall, the probability of obtaining a Level II or higher STAAR score in the spring of 3rd grade is 73%. Table 6 presents the estimated probability of meeting the Level II benchmark when MAP is taken in the spring, whereas Tables 7 and 8 present the estimated probability of meeting the Level II benchmark when MAP is taken in the fall or winter prior to taking the STAAR tests.
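One way to picture a proficiency projection is sketched below. The report states only that the conditional growth norms were used; the normal model, the function name, and every numeric input here are assumptions made for illustration, not the method or parameters of Thum and Hauser (2015).

```python
from statistics import NormalDist


def probability_of_meeting_cut(observed_rit, expected_growth, growth_sd, spring_cut):
    """Estimate P(spring MAP score >= spring_cut) from an earlier MAP score.

    Assumption: the spring score is normally distributed around
    observed_rit + expected_growth with standard deviation growth_sd.
    In practice these two growth inputs would come from norms tables.
    """
    projected = observed_rit + expected_growth
    return 1.0 - NormalDist(mu=projected, sigma=growth_sd).cdf(spring_cut)


# Hypothetical inputs: a fall score of 190, 10 RIT points of expected growth,
# a growth SD of 5, and a spring Level II cut of 199.
prob = probability_of_meeting_cut(190, 10.0, 5.0, 199)
print(f"Projected proficient: {'Yes' if prob >= 0.5 else 'No'} (prob. {prob:.2f})")
```

Under this model the projection flips from "No" to "Yes" at the point where the projected spring score equals the cut score, which is where the probability crosses 0.50.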

TABLE 6. PROFICIENCY PROJECTION AND PROBABILITY FOR PASSING STAAR LEVEL II (SATISFACTORY) WHEN MAP IS TAKEN IN THE SPRING

[Table body partly lost in extraction. Columns, for Reading and for Math: Grade; Start %ile; RIT (Spring); Projected Proficiency: Cut Score, Level II, Prob. The Start %ile, RIT, Cut Score, and Level II (Yes/No) entries are not recoverable. The Prob. columns survive as four runs of 19 values per page, listed below in the order they appear: two runs under Reading, then two runs under Math.]

Grades 3 and 4
Reading, first run: 0.01, 0.01, 0.01, 0.01, 0.03, 0.17, 0.38, 0.62, 0.83, 0.94, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99
Reading, second run: 0.01, 0.01, 0.01, 0.01, 0.03, 0.11, 0.27, 0.50, 0.73, 0.89, 0.97, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99
Math, first run: 0.01, 0.01, 0.01, 0.01, 0.04, 0.15, 0.37, 0.63, 0.85, 0.92, 0.98, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99
Math, second run: 0.01, 0.01, 0.01, 0.01, 0.01, 0.02, 0.08, 0.25, 0.50, 0.63, 0.85, 0.96, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99

Grades 5 and 6
Reading, first run: 0.01, 0.01, 0.01, 0.01, 0.06, 0.17, 0.38, 0.62, 0.83, 0.94, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99
Reading, second run: 0.01, 0.01, 0.01, 0.06, 0.27, 0.50, 0.73, 0.89, 0.97, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99
Math, first run: 0.01, 0.01, 0.01, 0.01, 0.04, 0.25, 0.50, 0.75, 0.92, 0.98, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99
Math, second run: 0.01, 0.01, 0.01, 0.01, 0.08, 0.37, 0.63, 0.85, 0.96, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99

Grades 7 and 8
Reading, first run: 0.01, 0.01, 0.01, 0.03, 0.17, 0.38, 0.62, 0.83, 0.94, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99
Reading, second run: 0.01, 0.01, 0.01, 0.11, 0.27, 0.62, 0.83, 0.94, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99
Math, first run: 0.01, 0.01, 0.01, 0.01, 0.04, 0.15, 0.50, 0.75, 0.92, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99
Math, second run: 0.01, 0.01, 0.01, 0.01, 0.04, 0.25, 0.63, 0.85, 0.98, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99, 0.99

Note. %ile = percentile.

TABLE 7. PROFICIENCY PROJECTION AND PROBABILITY FOR PASSING STAAR READING LEVEL II (SATISFACTORY) WHEN MAP IS TAKEN IN THE FALL OR WINTER PRIOR TO SPRING STAAR

[Table body lost in extraction across three pages. Recoverable column headers: RIT; Projected Proficiency: Cut Score, Level II (Yes/No), Prob. Cut scores appearing in the surviving text, in page order: 194, 202; 207, 208; 211, 211. Individual rows cannot be reassembled.]

Note. %ile = percentile.

TABLE 8. PROFICIENCY PROJECTION AND PROBABILITY FOR PASSING STAAR MATH LEVEL II (SATISFACTORY) WHEN MAP IS TAKEN IN THE FALL OR WINTER PRIOR TO SPRING STAAR

[Table body lost in extraction across three pages. Recoverable column headers: RIT; Projected Proficiency: Cut Score, Level II (Yes/No), Prob. Cut scores appearing in the surviving text, in page order: 199, 212; 215, 218; 222, 223. Individual rows cannot be reassembled.]

Note. %ile = percentile.

Summary and Discussion

This study produced a set of cut scores on MAP reading and math tests for Grades 3 to 8 that correspond to each STAAR performance level. By using matched score data from a sample of students from Texas, the study demonstrates that MAP scores can accurately predict whether a student could be proficient or above on the basis of his/her MAP scores. This study also used the 2015 NWEA norming study results to project a student’s probability of meeting proficiency based on that student’s prior MAP scores in fall and winter. These results will help educators predict student performance on STAAR tests as early as possible and identify those students who are at risk of failing to meet required standards so that they can receive the necessary resources and assistance to meet their goals.

While concordance tables can be helpful and informative, they have general limitations. First, the concordance tables provide information about score comparability on different tests, but the scores cannot be assumed to be interchangeable. In the case of the STAAR and MAP tests, as they are not parallel in content, scores from these two tests should not be directly compared. Second, while the sample data used in this study were collected from 147 schools in Texas, caution should be exercised when generalizing the results to test takers who differ significantly from this sample. Finally, caution should also be exercised if the concorded scores are used for a subpopulation. NWEA will continue to gather information about STAAR performance from other schools in Texas to enhance the quality and generalizability of the study.

References

Hanson, B. A., Harris, D. J., Pommerich, M., Sconing, J. A., & Yi, Q. (2001). Suggestions for the evaluation and use of concordance results (ACT Research Report No. 2001-1). Iowa City, IA: ACT, Inc.

Kolen, M. J., & Brennan, R. L. (2004). Test equating, scaling, and linking. New York: Springer.

Pommerich, M., Hanson, B., Harris, D., & Sconing, J. (2004). Issues in conducting linkage between distinct tests. Applied Psychological Measurement, 28(4), 247-273.

Thum, Y. M., & Hauser, C. H. (2015). NWEA 2015 MAP Norms for Student and School Achievement Status and Growth. NWEA Research Report. Portland, OR: NWEA.

Appendix

Data and Analysis

Data

Data used in this study were collected from 147 schools in Texas. The sample contained matched STAAR and MAP reading scores of 50,108 students in Grades 3 to 8 and matched STAAR and MAP math scores of 46,987 students in Grades 3 to 8 who completed both MAP and STAAR in the spring of 2015.

To understand the statistical characteristics of the test scores, descriptive statistics are provided in Table A1 below. As Table A1 indicates, the correlation coefficients between MAP and STAAR reading scores range from 0.75 to 0.82, and the correlation coefficients between MAP and STAAR math scores range from 0.76 to 0.87. In general, these correlations indicate a strong relationship between MAP and STAAR test scores.

TABLE A1. DESCRIPTIVE STATISTICS OF THE SAMPLE

[Table body largely lost in extraction; row labels (subject and grade) and most columns are not recoverable. Surviving STAAR SD / Min pairs, in order: 148.26 / 1026; 150.91 / 1047; 145.94 / 1110; 139.78 / 1178; 133.51 / 1295; 133.39 / 1178; 154.28 / 1033; 153.21 / 1191; 151.22 / 1189; 145.49 / 1024; 140.65 / 1015; 97.65 / (lost). Surviving Max values, in order: 248, 259, 270, 261, 264, 283, 283, 290, 298, 278, 312, 293.]

Equipercentile Linking Procedure

The equipercentile procedure (e.g., Kolen & Brennan, 2004) was used to establish the concordance relationship between STAAR and MAP scores for grades 3 to 8 in reading and math. This procedure matches scores on the two scales that have the same percentile rank (i.e., the proportion of scores at or below each score).

Suppose we need to establish the concorded scores between two tests. Let $x$ be a score on Test $X$ (e.g., STAAR). Its equipercentile equivalent score on Test $Y$ (e.g., MAP), $e_Y(x)$, can be obtained through a cumulative-distribution-based linking function defined in Equation (A1):

$$e_Y(x) = G^{-1}[P(x)] \tag{A1}$$

where $e_Y(x)$ is the equipercentile equivalent of a STAAR score on the MAP scale, $P(x)$ is the percentile rank of a given score on Test $X$, and $G^{-1}$ is the inverse of the percentile rank function for scores on Test $Y$, which gives the score on Test $Y$ corresponding to a given percentile. Polynomial loglinear pre-smoothing was applied to reduce irregularities in the frequency distributions as well as in the equipercentile linking curve.

Consistency Rate of Classification

Consistency rate of classification accuracy, expressed as a rate between 0 and 1, measures the extent to which MAP scores (and the estimated MAP cut scores) accurately predicted whether students in the sample would pass (i.e., Level II or higher) on STAAR tests.

To calculate the consistency rate of classification, sample students were designated “Below STAAR cut” or “At or above STAAR cut” based on their actual STAAR scores. Similarly, they were also designated as “Below MAP cut” or “At or above MAP cut” based on their actual MAP scores. A 2-way contingency table was then tabulated (see Table A2), classifying students as “Proficient” or “Not Proficient” on the basis of the STAAR cut score and the concordant MAP cut score. Students classified in the true positive (TP) category were those predicted to be Proficient based on the MAP cut scores and were also classified as Proficient based on the STAAR cut scores. Students classified in the true negative (TN) category were those predicted to be Not Proficient based on the MAP cut scores and were also classified as Not Proficient based on the STAAR cut scores. Students classified in the false positive (FP) category were those predicted to be Proficient based on the MAP cut scores but were classified as Not Proficient based on the STAAR cut scores. Students classified in the false negative (FN) category were those predicted to be Not Proficient based on the MAP cut scores but were classified as Proficient based on the STAAR cut scores.
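
The two procedures described in this appendix, equipercentile linking and the classification tabulation behind the consistency rate, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the study's implementation: it works on raw empirical distributions and omits the loglinear pre-smoothing, and the function names and the eight matched scores are hypothetical.

```python
from bisect import bisect_right


def equipercentile_equivalent(x_scores, y_scores, x):
    """Return the Test Y score whose percentile rank matches that of x on Test X.

    Percentile rank is the proportion of scores at or below a score, as in
    Equation (A1). No pre-smoothing is applied (a simplification).
    """
    xs, ys = sorted(x_scores), sorted(y_scores)
    at_or_below = bisect_right(xs, x)  # number of X scores <= x
    # Smallest Y score whose percentile rank is at least P(x). Integer
    # arithmetic computes ceil(at_or_below * len(ys) / len(xs)) exactly.
    rank_in_y = -(-at_or_below * len(ys) // len(xs))
    return ys[max(rank_in_y, 1) - 1]


def classification_counts(x_scores, y_scores, x_cut, y_cut):
    """Tabulate TP, TN, FP, FN for paired scores, with Test Y as the predictor."""
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for x, y in zip(x_scores, y_scores):
        predicted, actual = y >= y_cut, x >= x_cut
        if predicted and actual:
            counts["TP"] += 1
        elif not predicted and not actual:
            counts["TN"] += 1
        elif predicted:
            counts["FP"] += 1
        else:
            counts["FN"] += 1
    return counts


def consistency_rate(counts):
    """Share of examinees classified the same way by both tests."""
    return (counts["TP"] + counts["TN"]) / sum(counts.values())


# Hypothetical matched scores for eight students (not data from the study).
staar = [1300, 1350, 1400, 1450, 1500, 1550, 1600, 1650]
map_rit = [180, 185, 195, 190, 200, 205, 210, 215]
staar_cut = 1450  # hypothetical Level II cut score

map_cut = equipercentile_equivalent(staar, map_rit, staar_cut)
counts = classification_counts(staar, map_rit, staar_cut, map_cut)
print(map_cut, counts, consistency_rate(counts))
```

In this toy sample the two students whose scores are ordered differently on the two tests become the one false positive and the one false negative, which is the kind of disagreement the consistency rate is meant to summarize.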
