In the summer of 2011 I made this web site called naldaramjui, which means flying squirrel in Korean. It’s mostly an interface to the Korean government’s Test of Proficiency in Korean (TOPIK). Just lots of multiple-choice questions. I had always sort of intended to make it even more awesome and do something with the data recorded from people using it, but life happened and Google App Engine is kind of a pain anyway, so I never got the data out – UNTIL NOW.
I think it could be super fun to spend some time analyzing this data properly. The most naive analysis says that the hardest question is advanced listening question 18, and the easiest is beginner grammar question 17. In the case of the beginner question, I think it sits at a sweet spot: it isn’t terribly hard, and only people who are pretty good for beginners make it as far as question 17 – that’s the kind of thing that makes the simple analysis silly. Oh yes, this is going to be a fun data set to play with. Feel free to join in – the data is on GitHub!
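
For anyone who wants a starting point, here is roughly what I mean by the naive analysis: count the fraction of correct answers per question and call the lowest one "hardest". This is only a sketch. The `(question_id, was_correct)` record format, the question names, and the sample rows are all made up for illustration and are not the actual layout of the exported data.

```python
from collections import defaultdict


def naive_difficulty(responses):
    """Return {question_id: (fraction_correct, attempts)}.

    `responses` is an iterable of (question_id, was_correct) pairs.
    That format is an assumption, not the real export schema.
    """
    attempts = defaultdict(int)
    correct = defaultdict(int)
    for question_id, was_correct in responses:
        attempts[question_id] += 1
        if was_correct:
            correct[question_id] += 1
    return {q: (correct[q] / attempts[q], attempts[q]) for q in attempts}


# Made-up sample rows, NOT real data from the site.
sample = [
    ("beginner-grammar-1", True),
    ("beginner-grammar-1", False),
    ("beginner-grammar-1", True),
    ("beginner-grammar-1", False),
    ("beginner-grammar-17", True),
]

stats = naive_difficulty(sample)
easiest = max(stats, key=lambda q: stats[q][0])
hardest = min(stats, key=lambda q: stats[q][0])
```

Keeping the attempt count next to each fraction is the point: in the toy sample the later question looks "easiest", but only one person ever reached it, which is exactly the effect that makes the simple ranking silly.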