I was recently pointed to the 2010 USDOE/Mathematica paper that shares this post’s name. Rockoff, Kane, and their co-authors have presumably seen it, but nobody ever seems to mention it. From the abstract:
“Simulation results suggest that value-added estimates are likely to be noisy using the amount of data that are typically used in practice. Type I and II error rates for comparing a teacher’s performance to the average are likely to be about 25 percent with three years of data and 35 percent with one year of data.”
Those numbers are awful. What’s interesting to me is that three years of data barely helps: the error rates only drop from about 35 percent to about 25 percent. I had expected tripling the data to improve things a lot. Instead, the rates just stay bad.
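In hindsight, that modest improvement may be about what basic sampling arithmetic predicts. Here is a back-of-the-envelope sketch in Python. It is not the paper's method. I am assuming that the noise in a teacher's estimate is independent from year to year, so the standard error shrinks with the square root of the number of years, and that the cutoff is set so the Type I and Type II error rates are equal. The function name `rescale_error_rate` and both assumptions are mine.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def rescale_error_rate(rate_one_year: float, years: int) -> float:
    """Project a one-year error rate onto `years` years of data.

    Assumptions (illustrative only, not taken from the paper):
      * noise is independent across years, so SE scales as 1/sqrt(years)
      * the cutoff equalizes Type I and Type II error rates, so the
        common rate is 1 - Phi(gap / (2 * SE))
    """
    # z-score implied by the one-year rate, i.e. gap / (2 * SE_one_year)
    z_one_year = STD_NORMAL.inv_cdf(1.0 - rate_one_year)
    # a smaller SE stretches that z-score by sqrt(years)
    z_n_years = z_one_year * sqrt(years)
    return 1.0 - STD_NORMAL.cdf(z_n_years)


if __name__ == "__main__":
    for years in (1, 3, 10):
        rate = rescale_error_rate(0.35, years)
        print(f"{years:>2} year(s): error rate ~ {rate:.2f}")
```

Under those assumptions, a 35 percent one-year rate comes out at roughly 25 percent with three years, and it is still around 11 percent with ten. If some of the noise persists from year to year, the real improvement would be smaller than this.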
This study used a fairly simple VAM, similar to the one used in Tennessee, so that’s one possible critique. But this is the only research I’ve seen that seriously attempts to address the trustworthiness of VAM at the teacher and school level. Everybody else seems to be ignoring the issue, as if there were no cost to making arbitrarily incorrect judgements about teachers’ work. This paper is worth a look.