After all the preceding, it might be interesting to look at some totals by subject, but I think the time has come to look at the numbers at the school grade level directly. This analysis of the number of tested students should resemble the eventual analysis of the average scores themselves, if all goes well.

This is another Tukey mean-difference plot, showing changes in the number of tested students in Math. The top panel randomly pairs records to show a sort of baseline maximal variation. Grades three to five reach less far at the top end because elementary grades tend to be smaller than middle school grades. The middle panel shows changes in numbers for the same grade at the same school, between years. So for example, if a school had 40 grade 4 math tests in 2008 and 50 grade 4 math tests in 2009, that’s a change of +10 for grade 4. The bottom panel shows changes in numbers for the same cohort at the same school, between grades (which is also between years, of course). For example, if a school had 40 grade 4 math tests in 2008 and 42 grade 5 math tests in 2009, that’s a change of +2, which will show up in the grade 5 sub-panel. Since testing starts in grade 3, there isn’t any change to be observed for the cohort into that grade.
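To make the two kinds of change concrete, here is a minimal Python sketch using the example numbers above. The school name "A", the `counts` dictionary layout, and both function names are made up for illustration; this is not the code behind the figure.

```python
# Hypothetical layout: (school, year, grade) -> number of math tests.
# School "A" and its numbers mirror the worked example in the text.
counts = {
    ("A", 2008, 4): 40,
    ("A", 2009, 4): 50,
    ("A", 2009, 5): 42,
}

def grade_changes(counts):
    """Change for the same grade at the same school, year over year."""
    changes = {}
    for (school, year, grade), n in counts.items():
        previous = counts.get((school, year - 1, grade))
        if previous is not None:
            changes[(school, year, grade)] = n - previous
    return changes

def cohort_changes(counts):
    """Change for the same cohort at the same school: grade g-1 last year
    compared with grade g this year. Keyed by the later grade, so the
    grade 4 to grade 5 change is filed under grade 5."""
    changes = {}
    for (school, year, grade), n in counts.items():
        previous = counts.get((school, year - 1, grade - 1))
        if previous is not None:
            changes[(school, year, grade)] = n - previous
    return changes

print(grade_changes(counts))   # change for grade 4 between 2008 and 2009
print(cohort_changes(counts))  # change for the cohort entering grade 5 in 2009
```

With real data, a grade 3 record never has a grade 2 record from the previous year to pair with, which is why there is no cohort change into grade 3.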

As expected, there is typically more variation in *grades* than in *cohorts*. This is true even for the cohort change into grade 6 versus the grade 6 year to year, though it’s closer (MAD 10 vs. 9). We know that schools tend to change more from grade 5 to grade 6, generally – students could come from an elementary school and enter a K-8 school, for example, and that would show up here. There are conspicuous outliers for cohort changes, especially going from fifth to sixth grade and from sixth to seventh. This is a little weird. They are almost all positive, meaning, for example, that a school seems to have started testing a lot of students in seventh grade who weren’t tested in sixth grade. Another possibility is that a school has a big influx of sixth graders from other schools. These outliers suggest that some care should be taken in later analysis of test scores – perhaps a check that the numbers tested are more or less similar.
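One possible form for such a check, sketched in Python: compare the two counts relative to their mean and flag pairs that differ too much. The function name and the 25% tolerance are arbitrary placeholders, not values derived from the data.

```python
def similar_counts(n_before, n_after, max_rel_change=0.25):
    """Return True if two tested-student counts are roughly comparable.

    The difference is measured relative to the mean of the two counts,
    the same mean that a mean-difference plot puts on its horizontal axis.
    max_rel_change is a hypothetical tolerance chosen for illustration.
    """
    mean = (n_before + n_after) / 2
    if mean == 0:
        return True  # nothing tested in either year; nothing to compare
    return abs(n_after - n_before) / mean <= max_rel_change

# The cohort example from earlier (40 tests, then 42) passes the check,
# while a cohort that triples in size (30 tests, then 90) does not.
print(similar_counts(40, 42))
print(similar_counts(30, 90))
```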

Because of the outliers, changes for grades and changes for cohorts actually have similar overall standard deviations (about 20), but the median absolute deviation, which is much less sensitive to a few extreme values, is twice as large for grade changes (10) as for cohort changes (5).
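To see why the two measures can disagree like this, here is a small Python sketch with made-up numbers (not the test data). It uses the raw MAD, the median of absolute deviations from the median; some implementations multiply this by a constant (R's `mad` defaults to 1.4826), so absolute values may differ from those quoted above.

```python
import statistics

def mad(values):
    """Raw median absolute deviation: median of |x - median(x)|."""
    values = list(values)
    med = statistics.median(values)
    return statistics.median(abs(v - med) for v in values)

# Made-up changes in tested counts, then the same list plus one big outlier.
typical = [-3, -2, -1, 0, 1, 2, 3]
with_outlier = typical + [60]

for label, data in [("typical", typical), ("with outlier", with_outlier)]:
    print(label, "SD:", statistics.pstdev(data), "MAD:", mad(data))
```

A single extreme change inflates the standard deviation many times over while leaving the MAD unchanged, which is the pattern described here: similar standard deviations driven by outliers, but clearly different MADs.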

This figure shows the same things for ELA testing. It looks very similar to the Math figure but is in fact different. (See code.) The extreme similarity, even for outliers, may reflect that Math and ELA testing go hand in hand, so what happens for one also happens for the other. Hopefully this can be taken as evidence that apparent cohort swings are due to student movement and not to changes in whether particular students were tested. It might be worth tracking down explanations for some of these strange patterns anyway, but I won’t pursue this for the moment.
