Friday, June 19, 2020

ACT Writing Scores Explained 2016

Why did ACT suddenly reverse course and ditch the 1-36 score for ACT Writing?

ACT announced on June 28, 2016 that, as of the September 2016 test date, ACT Writing scores will change once more. One of the most critical purposes of a test scale is to communicate information to the score user. In that regard, the 1-36 experiment with Writing was a failure. ACT has admitted that the scale caused confusion and created a perceptual problem. It is not yet clear whether the new change, piled onto the class of 2017's already large load of changes, will lessen the criticism that ACT has been receiving.

Is the test changing?

The essay task is not changing, and two readers will still be assigning 1-6 scores in four domains. ACT states, "Some language in the directions to the students has been modified to improve clarity." It has not yet clarified what the clarification will be.

How will scores be reported going forward?

The basic scoring of the essay will remain unchanged, but the reporting is being overhauled. Two readers score each essay from 1-6 in four domains: Ideas and Analysis, Development and Support, Organization, and Language Use and Conventions. A student can receive a total of 8 to 48 points from the readers. On the test administrations from September 2015 to June 2016, this raw score was converted to a 1-36 scale to match the scaling process used in the primary ACT subject tests. The mean, distribution, and reliability, however, were fundamentally different for Writing than for English, Math, Reading, and Science. ACT should not have used the same scale for scores that behaved so differently.

The proposed change is to go back to the 2-12 score range used prior to September 2015. In yet another confusing twist, though, the 2-12 score for 2016-2017 is very different from the one used in 2014-2015. Whereas the old ACT essay score was simply the sum of the two readers' holistic grades, the new 2-12 score is defined as the average domain score. The average is rounded to the nearest integer, with scores of .5 being rounded up. For example, a student who receives scores of {4, 4, 4, 5} from Reader 1 and {4, 4, 5, 4} from Reader 2 would receive domain scores of {8, 8, 9, 9}. The student's overall Writing score would be reported as a 9 (34/4 = 8.5, rounded up). A short sketch of this calculation appears further below.

Will the ELA score change?

In September 2015 ACT began reporting an ELA score that was the rounded average of English, Reading, and Writing scores. It also began reporting a STEM score that was the rounded average of Math and Science. Now that the Writing score is no longer on the 1-36 scale, it would seem that the ELA scoring would need to change. Except that ACT doesn't want it to change. In effect, ACT is preserving the 1-36 scaling of Writing buried within the ELA calculation. "If you can't see it, you can't be confused" seems to be the message from ACT. Needless to say, we feel that the confusion exists on the other end of the line.

What does this mean regarding my plans to re-test in September? Will my February scores be converted into the new 2-12 scores?

In general, students should not be making re-testing plans based solely around ACT Writing scores. If you were planning on repeating the ACT in the fall, the score reporting change should not change your mind. If you are satisfied with your scores, you should tune out any hubbub surrounding the new reporting. ACT has produced example student, high school, and college score reports corresponding to the September 2016 updates (some subscore categorizations are also changing!).
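To make the new calculation concrete, here is a minimal Python sketch of the averaging-and-rounding rule described above. It is our own illustration of the published rule, not code from ACT; the function name and input format are invented for the example.

```python
from math import floor

def writing_score_2_12(reader1, reader2):
    """Sketch of the 2016-2017 reported Writing score.

    reader1, reader2: lists of four 1-6 grades, one per domain
    (Ideas and Analysis, Development and Support, Organization,
    Language Use and Conventions).
    """
    domain_scores = [a + b for a, b in zip(reader1, reader2)]  # each domain: 2-12
    average = sum(domain_scores) / 4                           # 2.0 to 12.0
    return floor(average + 0.5)                                # .5 rounds up

# The example from the text: domain scores {8, 8, 9, 9}, average 8.5, reported as 9.
print(writing_score_2_12([4, 4, 4, 5], [4, 4, 5, 4]))  # -> 9
```

Note that the sketch uses floor(average + 0.5) rather than Python's round(), because round() uses banker's rounding (round(8.5) is 8) and would not honor the ".5 rounds up" rule.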
ACT will continue to report scores by individual test date (College Board, in contrast, will report all of a student's scores with each report). A student's old 1-36 scores will not be changed.

If this is just a reporting change, how could it impact my ranking in any way?

This is where things start to get really confusing. There are any number of intersecting issues: percentiles (new, old, 1-36, 2-12, ELA), concorded scores, the scaling of individual test dates, and rounding artifacts.

What sort of problems can occur with averaging and rounding domain scores?

With the 1-36 scale, there were obviously 36 potential scores (although not all test forms produced all scores). On the 2-12 reporting, only 11 scores are possible. The tight range of scores typically assigned by readers and the unpredictability of those readers mean that reader agreement is the exception rather than the norm. Even small grading differences can create large swings. For example, a student with scores of {4, 4, 3, 4} and {4, 4, 3, 3} would have a total of 29 points, or an average domain score of 7 (7.25 rounded down). Had the student received even a single additional point from a single reader on a single domain, her scores would have added to 30 points and averaged 8 (7.5 rounded up). This seemingly inconsequential difference in reader scoring is the difference between the 84th percentile and the 59th percentile. And her readers were in close agreement! (A short sketch of this swing appears further below.)

ACT Writing scores are clustered around the mid-range. Readers gravitate toward giving 3s, 4s, and 5s. According to ACT's percentile data for the 2-12 reporting (see below), 65% of students' 2-12 scores will be 6s, 7s, or 8s.

Average Domain Score | Cumulative Percent
 2 |   1
 3 |   2
 4 |   9
 5 |  18
 6 |  40
 7 |  59
 8 |  84
 9 |  93
10 |  98
11 |  99
12 | 100

Visually, you can see how compact the range is, as well. Less than 5% of test-takers receive a 2, 3, 11, or 12.

Isn't it a good thing that the essay is reported in gross terms rather than pretending to be overly precise?

Yes and no. The essay is a less reliable instrument than the other ACT tests and far less reliable than the Composite score. A criticism of the 1-36 scale for Writing was that it pretended to a level of accuracy that it could not deliver; it is, after all, only a single question. What the change cannot do, however, is remake the underlying fundamentals of the test. The bouncing around of scoring systems has led ACT to encourage the use of percentiles in gauging performance. Percentiles cannot improve the reliability of a test. Percentiles cannot improve the validity of a test. Percentiles, like scaled scores, can easily provide a false sense of precision and ranking. To understand how this would work, take the most extreme example: completely random scores from 2 to 12. Even though there would be no value behind those scores, someone receiving a score report would still see a 7 as the 55th percentile and a 9 as the 73rd percentile. The student with the 9 clearly did better, right? Except that we know the scores were just throws at the dartboard. The percentile difference seems meaningful, but it is just noise. ACT Writing scores are not random (although it may sometimes seem that way), but the test's reliability is well below that of other ACT subjects. Percentiles are not the silver bullet of score interpretation. In a display of misleading precision, ACT released cumulative percents tallied to the hundredths place for a test that will have only 11 score buckets and a standard error of measurement of 1.
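Extending the earlier sketch (again our own illustration; the cumulative percents are transcribed from the table above), a few lines show how one additional reader point in one domain moves a student from the 59th to the 84th percentile.

```python
from math import floor

# Cumulative percents for the 2-12 reporting, transcribed from the table above.
CUM_PERCENT = {2: 1, 3: 2, 4: 9, 5: 18, 6: 40, 7: 59, 8: 84,
               9: 93, 10: 98, 11: 99, 12: 100}

def writing_score_2_12(reader1, reader2):
    domain_scores = [a + b for a, b in zip(reader1, reader2)]
    return floor(sum(domain_scores) / 4 + 0.5)  # average domain score, .5 rounds up

# One extra point from one reader in one domain: raw total 29 -> 30, reported score 7 -> 8.
low = writing_score_2_12([4, 4, 3, 4], [4, 4, 3, 3])    # 29/4 = 7.25 -> 7
high = writing_score_2_12([4, 4, 3, 4], [4, 4, 3, 4])   # 30/4 = 7.5  -> 8
print(low, CUM_PERCENT[low])    # 7 59  (59th percentile)
print(high, CUM_PERCENT[high])  # 8 84  (84th percentile)
```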
College Board, by contrast, has opted to provide no norms for its new essay scores.

What percentiles should students and colleges use and believe?

Another problem with percentiles is that they are dependent upon the underlying pool of testers. It is interesting to note that students performed better on the ACT essay than ACT originally estimated. That may seem like a good thing, but it means that the newly released percentiles are more challenging. A 22, for example, was reported as 80th percentile when the 1-36 scale was introduced in 2015. The newly released data, however, shows a 22 as the 68th percentile. The cumulative percentiles differ across the entire range, as the table below shows. This difference is unrelated to the new scoring; it's the difference between the numbers ACT had been touting from its pilot study versus the actual results from the last year of testing.

It's clear on a number of counts that ACT misjudged student performance. It either set the mean below where it should have been (20-21, to match the other subjects), or it tried to set the mean in the right place and was waylaid by readers grading more harshly than expected. The most difficult option to believe is that ACT set the mean well below the other subject means without realizing the confusion that would be caused. It's unknown how the percentile figures may have evolved over time. Was it an advantage or a disadvantage to have more experienced readers by the February and April tests? Students making judgments about scores would have had no way of knowing the actual score distribution. In fact, they would have received the incorrect percentile tables with their reports. Presumably the new tables will be used when presenting scores to colleges, although ACT has not yet clarified that point. The table below shows new and old percentile figures and the difference.

ACT Writing Score (1-36 Scale) | Cumulative Percent (Stated Summer 2016) | Cumulative Percent (Stated Summer 2015) | Difference
 1 |   1 |  2 |  1
 2 |   1 |  2 |  1
 3 |   1 |  2 |  1
 4 |   1 |  3 |  2
 5 |   2 |  4 |  2
 6 |   2 |  6 |  4
 7 |   3 |  8 |  5
 8 |   3 |  8 |  5
 9 |   7 | 13 |  6
10 |   9 | 16 |  7
11 |  11 | 19 |  8
12 |  15 | 23 |  8
13 |  18 | 31 | 13
14 |  21 | 35 | 14
15 |  25 | 37 | 12
16 |  34 | 44 | 10
17 |  40 | 52 | 12
18 |  44 | 58 | 14
19 |  52 | 63 | 11
20 |  58 | 68 | 10
21 |  64 | 74 | 10
22 |  68 | 80 | 12
23 |  78 | 83 |  5
24 |  86 | 88 |  2
25 |  88 | 90 |  2
26 |  91 | 93 |  2
27 |  94 | 95 |  1
28 |  95 | 95 |  0
29 |  97 | 97 |  0
30 |  98 | 98 |  0
31 |  98 | 98 |  0
32 |  99 | 99 |  0
33 |  99 | 99 |  0
34 | 100 | 99 | -1
35 | 100 | 99 | -1
36 | 100 | 99 | -1

A visualization of the number of students at each score shows how haphazard things are. Because of how the test is scored and scaled, certain scores dominate the results. Among the various form codes, almost 1 in 10 test-takers ended up with a 23 and another 8% at a 24. These happen to be common scores when high-Composite-scoring students complain about low Writing scores. The low- to mid-20s is not that out of character for a 30+ student.

Will colleges use ACT's concordance or calculate the average domain score? Will the results always be the same?

By trying to give a variety of ways of thinking about Writing scores, ACT seems to be confusing matters more in its "5 Ways to Compare 2015-2016 and 2016-2017 ACT Writing Scores" white paper. If there are so many ways to compare scores, which one is right? Which one will colleges use? Why don't they all give the same result? Most students are familiar with the concept that different raw scores on the English, Math, Reading, or Science tests can produce different scaled scores. The equating of forms can smooth out any differences in difficulty from test date to test date. When ACT introduced scaling to the Writing test, it opened up the same opportunity.
In fact, we have seen that the same raw score (8-48) on one test can give a different result on another test. Not all prompts behave in the same way, just as not all multiple-choice items behave in the same way. This poses a problem, though, when things are reversed. Suddenly ACT is saying to ignore all that scaling nonsense and just trust its readers. Trusting the readers helped get ACT into this mess, and ignoring the scaling is hard to do when an estimated one million students have already provided scaled Writing scores to colleges. Because of the peculiarities of scaling and concordances, the comparison methods that ACT suggests (calculating a new 2-12 score from an old score report versus using a concordance table) can produce differing results.

On the April 2016 ACT, a student with reader scores of {4, 3, 4, 3} and {4, 4, 4, 3} would have a raw score of 29 and would have received a scaled score of 21. In order to compare that score to the new score range, we could simply take the rounded average of the domain scores and get 7 (29/4 = 7.25). An alternative provided by ACT is to use the concordance table (see below). We could look up the 21 scaled score the student received and find that it concords to a score of 8. Same student, same test, same reader scores, different result. (A sketch of this discrepancy appears further below.) Here is where percentiles can give false readings, again. The difference between a 7 and an 8 is the difference between the 59th percentile and the 84th percentile. That's a distressing change for a student who already thought she knew exactly where she had scored. It would seem as if directly calculating the new 2-12 average would be the superior route, but this neglects to account for the fact that some prompts are easier than others, which is the whole reason the April scaling was a little bit different than the scaling in September or December. There is no psychometrically perfect solution; reverting to a raw scale has certain trade-offs. We can't unring the bell curve.

Below is the concordance that ACT provides to translate from 1-36 scaled scores to 2-12 average domain scores.

Scaled 1-36 Score | Concorded 2-12 Score
 1-3  |  2
 4-7  |  3
 8-10 |  4
11-13 |  5
14-17 |  6
18-20 |  7
21-24 |  8
25-27 |  9
28-30 | 10
31-33 | 11
34-36 | 12

Will I still be able to superscore? What will colleges do?

Students with great Composite scores and weak essay scores have faced a re-testing dilemma. Many have hoped that more colleges would announce superscoring of Writing scores. Unfortunately, the scoring change does nothing to alleviate the dilemma. By making it even harder for colleges to have a uniform set of scores in their files, the new reporting decreases the likelihood that Writing scores from 2015-2016 will be superscored with those from 2016-2017. Would colleges superscore the rounded domain average? The concorded score? What ACT has effectively precluded them from doing is using the 1-36 scale as the benchmark for all scores. In the long run, the demise of the 1-36 essay score is a good thing. In the short run, it leaves the class of 2017 with even more headaches.

Will I still be able to get my test rescored?

ACT has not announced any changes to its rescoring policy. You can request a rescore of your essay within 3 months of your test date for a $50 fee. The fee is refunded if your score is changed. Scores will never be lowered due to a rescore.
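Returning for a moment to the concordance question above, here is a short sketch (our own illustration, using the concordance transcribed above and the same invented helper as earlier) of how the two comparison methods ACT suggests can disagree for the very same essay.

```python
from math import floor

# ACT's 1-36 to 2-12 concordance, transcribed from the table above.
CONCORDANCE = {}
for scaled_range, concorded in [(range(1, 4), 2), (range(4, 8), 3), (range(8, 11), 4),
                                (range(11, 14), 5), (range(14, 18), 6), (range(18, 21), 7),
                                (range(21, 25), 8), (range(25, 28), 9), (range(28, 31), 10),
                                (range(31, 34), 11), (range(34, 37), 12)]:
    for s in scaled_range:
        CONCORDANCE[s] = concorded

def average_domain_score(reader1, reader2):
    """Direct 2-12 calculation from the eight 1-6 reader grades."""
    domain_scores = [a + b for a, b in zip(reader1, reader2)]
    return floor(sum(domain_scores) / 4 + 0.5)

# April 2016 example: a raw score of 29 was scaled to a 21 on that form.
direct = average_domain_score([4, 3, 4, 3], [4, 4, 4, 3])  # 29/4 = 7.25 -> 7
via_concordance = CONCORDANCE[21]                          # a scaled 21 concords to 8
print(direct, via_concordance)  # 7 8: same essay, two different answers
```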
On the rescore question, ACT almost certainly took note of the increased requests it was receiving for rescores and the increased number of refunds it was issuing for changed scores. The shift to 2-12 scoring makes it somewhat less likely that a rescore will result in a change (fewer buckets). Students can work with their school counselor to obtain a copy of their essay and decide if a rescore is merited.

Is this even about college admission?

Not really, but ACT won't admit it. Less than 15% of colleges will require ACT Writing from the class of 2017; most put little weight on it; and the nature of Writing scores means that the distinctions between applicants rarely have meaning. The real target is the state and district consumer. The difference between a 7 and an 8 might not indicate much about an individual student, but if one large high school in your district averages 7.2 and another averages 7.8, the difference is significant. Domain scores may be able to tell state departments of education how their teachers and students are performing in different curricular areas. Increasingly, states and districts are paying for students to take the ACT (or SAT) in order to make all students college ready or to fulfill testing mandates. ACT and College Board view this as a growth opportunity that potentially extends across all of K-12. The school and district consumers care more about converting scores because the longitudinal data matters. They want to be able to compare performance over time and need a common measuring stick.

The sudden introduction of the new scaling in September 2015 and the sudden reversal for September 2016 have undercut the credibility of a test that colleges had already viewed dubiously. For two classes in a row, admission offices have had to interpret two different sets of scores from the same students. They will be facing a third type of Writing score before they have had a chance to adjust to the second. We expect colleges to trust the ACT Composite and test scores that they have used for decades and to take a wait-and-see attitude toward essay scores.
