Quizz Quotes

I was exploring Google Papers the other day and came across Quizz: Targeted Crowdsourcing with a Billion (Potential) Users by Ipeirotis and Gabrilovich. Downside: occasionally reads like a Google ad. Upside: really interesting results from an experimental Q&A system which is still live. It’s very cool. Here are some quotes with my commentary:

… the strong self-selection of high-quality users to continue contributing, while low-quality users self-select to drop out.

… there is little incentive for unpaid users to continue participating when there is no monetary reward and they are not good at the task.

The goal of the system was not educational, so they celebrate the fact that it isn’t fun if you suck.

These results indicate that users may be more interested in learning about the topic rather than just knowing whether they answered correctly.

One result was that people answer more questions when the interface shows the correct answer as “feedback” rather than just showing “correct” or “incorrect.” This section of experimental results was particularly interesting, including commentary on possible failures of leaderboards.

… as more and more users participate, the achievements of the top users are difficult to match, effectively discouraging users from trying harder.

They did say that a leaderboard including only the last week’s worth of results was more effective.

I’m less interested in the application of this kind of system for crowd-sourcing information, more interested in educational applications, but there is some clear overlap, and cited papers such as The multidimensional wisdom of crowds seem very interesting. Also through Ipeirotis’ blog I found out about Smarterer, which is interesting as well. There’s some sort of spectrum, or multi-dimensional thing going on, with education, crowdsourcing, and evaluation all in the mix.

The authors’ application of information gain and of a Markov Decision Process is also interesting.

Writing to think: Questions on the web

I have made some things online that involve “asking and answering questions” in the traditional multiple-choice-test way. I built the software to do that (with Python on Google App Engine, again differently with node.js on Heroku) both times.

Is there any “built in” web element for questions and answers of the types I’m thinking of? There are HTML forms. HTML forms provide a fair amount of flexibility, and even start to have some functionality for different question structures – radio buttons for a single choice vs. checkboxes for multiple selections. But HTML forms, being just HTML, have pretty clear limits. Javascript can add some more functionality, and then eventually you need a web server backend of some kind to support more.

There are web services like Google Forms and SurveyMonkey, and the very task-specific Doodle, which take all of HTML/Javascript/backend and run it all for you. This means that the available functionality is whatever they provide, everything is hosted by them, and as far as I know there is little or no mechanism for creating things outside of their web GUIs.

The popular services just mentioned mostly collect information without any feedback; when you want to have a “correct” answer there isn’t much functionality. Where is a good existing solution? There’s internet detritus like MakeaQuiz.net. There’s Quizlet, which seems pretty neat but also isolated perhaps by its attempt to chase education spending. (It also supports, like most education sites, an unhealthy distinction between student and teacher.)

The desire for profit seems to poison projects that could otherwise have a broader positive effect. Projects affiliated with the very cool JiTT methodology disappeared into companies. I’m not even sure what sort of thinking led to the closing of the Khan Academy source.

But it isn’t just the profit motive that keeps question-and-answer technology balkanized; there’s no real standard, and I don’t think it’s very easy to come up with one. The systems I built aren’t easily transferred anywhere for use by others, for example. This is my fault, but I also don’t think it’s a very easy thing to design.

There are some attempts at standards for questions, at least. BlackBoard has a way to load questions from some tab-delimited formats. Moodle has something called GIFT. There’s the Question and Test Interoperability spec, which is such a huge mess you need to employ a stapler guy to support it. And there’s something called QUOX. Oh my.

And these are all purely for assessment, where earlier there were some purely for survey/data collection. It seems to me that they shouldn’t be so different. Fundamentally isn’t it all just questions?

Another take on this, I suppose, is sites like Stack Overflow, which represent a different sort of questioning. And there is OSQA, “the Open Source Q&A system”, which is cool. You could run that on your server, or for that matter run Moodle, or some survey platform, most likely. So that’s also another delivery model: the run-your-own-server-with-pre-built-software model. A lot of setup/maintenance overhead, and still not a lot of interoperability as far as I can tell. (OSQA is also available hosted.)

Just one more: There are also frameworks for building assessments, which try to generalize while still providing some structure. I was happy to find out about the one linked, for Rails; I don’t know if there are others or if any are widely used.

Markdown is pretty much the best thing ever. (Note to self: get off wordpress…) Can we come up with a markdown solution to the question problem? Something super light-weight, that blends easily into text files that humans would actually write…

The kramdown (etc.) markdown extension for definition lists seems like a candidate. Here’s how it works:

This is the "term".
: This is the "definition".

Gets rendered something like this, using the standard HTML definition list tags:

This is the “term”.
This is the “definition”.

So let’s say the term is the question, and the (possibly many) definitions are answer choices. Of course we could have a blank definition represent a text box (or text area):

What do you think?
:

A multiple-choice survey could be as easy as this then:

What's your favorite color?
: red
: blue
: green

To add correctness functionality, a little more syntax could be added:

Sugar is sweet.
: true*
: false

The idea here is that these text files would be rendered into interactive HTML/Javascript such that you wouldn’t see which was the correct answer – you would select an answer, possibly have a submit button of some kind, and get feedback on whether your answer agreed with the one in the text. I do think that teacherly paranoia about “test security” is one thing that prevents good functionality from spreading much on the web. Nobody wants to share their oh-so-secret correct answers, lest the horrible children cheat. I think this perspective is a disease on society.
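To make the proposed syntax a bit more concrete, here is a minimal sketch of a parser for it in Python. Everything in it is an assumption made for illustration: the `Question` class, the `parse_questions` function, the rule that a blank line ends a question, and the rule that a line following a blank `:` is the expected short answer. It is not an existing kramdown feature.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Question:
    prompt: str
    choices: List[str] = field(default_factory=list)
    correct: Set[int] = field(default_factory=set)  # indices into choices
    free_text: bool = False          # True when a blank ":" line was seen
    expected: Optional[str] = None   # expected text for a short answer


def parse_questions(text: str) -> List[Question]:
    """Parse term/definition style questions (hypothetical syntax)."""
    questions: List[Question] = []
    current: Optional[Question] = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            current = None  # a blank line ends the current question
            continue
        if line.startswith(":"):
            if current is None:
                continue  # a definition with no term; ignored in this sketch
            body = line[1:].strip()
            if not body:
                current.free_text = True
            elif body.endswith("*"):
                # a trailing "*" marks the correct choice
                current.correct.add(len(current.choices))
                current.choices.append(body[:-1].rstrip())
            else:
                current.choices.append(body)
        elif current is not None and current.free_text and not current.choices:
            # the line after a blank ":" is read as the expected answer
            current.expected = line
        else:
            current = Question(prompt=line)
            questions.append(current)
    return questions
```

A renderer could then decide what to show and what to hide: the `correct` indices would stay out of the visible page and only be used to give feedback after a choice is made.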

Maybe this could be a short answer question:

What is the capital city of Wisconsin?
:
Madison

Of course you have the problems of evaluating text answers (Is “Madison, WI” also correct? etc.). Generally, there is of course an awful lot of functionality that you want from questions, and it may be hard to reduce it all down. Some things should be obvious: true and false is a special case of multiple choice. But other things like scoring, when/whether to show the correct answer, etc. seem difficult to abstract very far.
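One simple (and admittedly crude) way to approach the text-answer problem is to normalize both the response and a list of accepted answers before comparing them. The sketch below is only that, a sketch: `normalize` and `is_accepted` are made-up names, and the normalization rules (lowercase, strip ASCII punctuation, collapse whitespace) are just one possible choice.

```python
import string


def normalize(answer: str) -> str:
    """Lowercase, drop ASCII punctuation, and collapse runs of whitespace."""
    no_punct = answer.translate(str.maketrans("", "", string.punctuation))
    return " ".join(no_punct.lower().split())


def is_accepted(response: str, accepted: list) -> bool:
    """True if the normalized response matches any normalized accepted answer."""
    wanted = {normalize(a) for a in accepted}
    return normalize(response) in wanted
```

With this, “madison” matches “Madison”, but “Madison, WI” only counts if the question author lists something like “Madison WI” as an accepted answer too. That pushes the hard decision back onto whoever writes the question, which may or may not be what you want.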

The text questions could be rendered as stand-alone HTML/Javascript, or to connect with (or even be hosted on) some sort of web system. More details would have to be worked out.
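As a rough idea of the stand-alone rendering step, here is a small Python sketch that turns one question into an HTML fragment: radio buttons when there are choices, a text area when there are none. The function name `render_question` and the default field name `q1` are made up for illustration, and the answer-checking Javascript is deliberately left out.

```python
from html import escape


def render_question(prompt: str, choices: list, name: str = "q1") -> str:
    """Return an HTML fragment: radio buttons for choices, a text box if none."""
    parts = ["<fieldset>", f"  <legend>{escape(prompt)}</legend>"]
    if choices:
        for i, choice in enumerate(choices):
            parts.append(
                f'  <label><input type="radio" name="{escape(name)}" '
                f'value="{i}"> {escape(choice)}</label>'
            )
    else:
        parts.append(f'  <textarea name="{escape(name)}"></textarea>')
    parts.append("</fieldset>")
    return "\n".join(parts)
```

Swapping `radio` for `checkbox` would cover questions with multiple selections; anything beyond that (submit buttons, feedback, scoring) is exactly the part that still has to be worked out.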

The illustrious Ramnath, who always seems to be doing cool things several years before I know about them, has thought about this markdown question idea to some degree. I want to find out more about what he’s done.

doge coding: much wow

I have recently come across two more or less doge-titled educational resources for coding. This definitely constitutes a trend.

First up is Learn You a Haskell for Great Good!. I’m pretty sure the title includes the exclamation point. It’s a free book about Haskell, of course. (You can also buy it if you want.)

Last up is Learn You The Node.js For Much Win!. Same deal with the exclamatory title. This one is a command-line interactive tutorial about node.js that runs on workshopper. I found out about this after first hearing about a similar thing for git called git-it.

I, for one, would love to see these somehow form the basis for an entire line of amusingly titled “Learn you” books (and so on).

Data from naldaramjui.com

In the summer of 2011 I made this web site called naldaramjui, which means flying squirrel in Korean. It’s mostly an interface to the Korean government’s test of proficiency in Korean, TOPIK. Just lots of multiple-choice questions. I had always sort of intended to make it even more awesome and do something with the data recorded from people using it, but life happened and Google App Engine is kind of a pain anyway, so I never got the data out – UNTIL NOW.

I think it could be super fun to spend some time analyzing this data properly. Doing the most naive analysis, I can find that the hardest question is advanced listening question 18, and the easiest question is beginner grammar question 17. In the case of the beginner question, I think it might be at the sweet spot where it isn’t terribly hard and where only people who are pretty good for beginners make it as far as question 17 – that’s the kind of thing that makes the simple analysis silly. Oh yes, this is going to be a fun data set to play with. Feel free to join in – data is on github!
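For the curious, the naive analysis amounts to something like the following Python sketch. The record format (pairs of question id and whether the answer was correct), the question ids, and the responses themselves are all made up for illustration; the real data on github may well be shaped differently.

```python
from collections import defaultdict


def proportion_correct(responses):
    """Map each question id to (fraction answered correctly, number of attempts)."""
    attempts = defaultdict(int)
    correct = defaultdict(int)
    for question_id, was_correct in responses:
        attempts[question_id] += 1
        correct[question_id] += int(was_correct)
    return {q: (correct[q] / attempts[q], attempts[q]) for q in attempts}


# Made-up responses for illustration; not the real naldaramjui data.
responses = [
    ("beginner-grammar-17", True),
    ("beginner-grammar-17", True),
    ("advanced-listening-18", False),
    ("advanced-listening-18", True),
    ("advanced-listening-18", False),
]
stats = proportion_correct(responses)
hardest = min(stats, key=lambda q: stats[q][0])  # lowest fraction correct
easiest = max(stats, key=lambda q: stats[q][0])  # highest fraction correct
```

Keeping the number of attempts next to each fraction is a small nod to the problem described above: a question that only a few persistent people ever reach can look easy for reasons that have nothing to do with the question.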

Aims of Education

Bret Victor put up his links for 2013 and one of them particularly caught my eye: Why education is so difficult and contentious, by Kieran Egan, which says that there are three conflicting goals of education. This reminded me of a paper I wrote back when I was getting my master of arts in teaching. I was able to find the paper, and sure enough I had been responding to related writing also by Egan. The difficulty with identifying the goal of education, I think, is that it’s ultimately a pretty big question. What is the meaning of life? Why are we here? In my paper I tried to move toward articulating a coherent goal of education. I think the paper did not really succeed completely, but it was a step. Here it is, exactly as it was in 2006:

The Aim of Education

Man is a tame or civilized animal ; nevertheless he requires proper instruction and a fortunate nature, and then of all animals he becomes the most divine and most civilized ; but if insufficiently or ill educated, he becomes the savagest of earthly creatures.  Wherefore the legislator ought not to allow the education of children to become a secondary or accidental matter.

(Plato, quoted from Laws in Frank, 1947, p. 291)

The ideal aim of education is creation of power of self-control.

(Dewey, 1938/1997, p. 64)

The safest general characterization of the [Western] philosophical tradition is that it consists of a series of footnotes to Plato.

(Whitehead, 1929/1978, p. 39)

Education as an institution, and particularly our typically requisite kindergarten through twelfth grade, is largely accepted as a necessary and good thing.  The mission of the U.S. Department of Education includes “assuring access to equal educational opportunity for every individual” and promoting “improvements in the quality and usefulness of education” (Department of Education Organization Act, 1979).  It does not specify, however, what use it is that education should have.  This ambiguity of purpose can cloud discussions about education.  To evaluate how well a system achieves its goals, it must first be agreed what the goals are.

Several considerations should inform a discussion of education’s purpose.  First there is the question of how many such purposes there may be.  If there are multiple objectives then in order for education to succeed they must be compatible.  Second, philosophy and pedagogy must remain clearly differentiated.  An aim of education is not the same thing as a pedagogy.

The underlying question of education’s fundamental aim has unfortunately been neglected and misunderstood.  Students ask why they have to go to school, and teachers should have a good answer.  This paper considers the work of educational theorists in order to identify a consistent philosophy of education.

Egan (1997) considers the modern school as developing contemporaneously with the modern hospital and prison.  Concerning prisons, he asserts that they have in the West “two aims – to punish and to rehabilitate,” and that the incompatibility of these aims leads to difficulties with the system’s implementation (p. 10).  However, it could also be said that the single aim of the modern prison is to reduce crime, in which case there is no conflict in the ends but only in the means.  Alternatively, it could be said that the aim of the prison is simply to house certain individuals away from the general population, for reasons and durations determined by other systems.  It is important to articulate goals at an appropriate level of specificity.

Another modification to Egan’s aims for prisons would be to classify rehabilitation as punishment, in which case there is one goal and it is punishment, or alternatively that punishments contribute to the one goal of rehabilitation.  Either argument, if accepted, collapses Egan’s two aims to just one and eliminates the alleged conflict.  If two goals are truly distinct, it is important to establish this clearly.

After prisons, Egan goes on to construct his dilemma for education, saying that the modern school has not two but three conflicting aims.  The first of these is socialization, “the homogenization of children” and the production of “a skilled workforce of good citizens” (p. 11).  The second aim is labeled Platonic and focuses on “learning those forms of knowledge that would give students a privileged, rational view of reality” (p. 13).  Egan attributes the third aim to Rousseau, with “focus on fulfilling the individual potential of each student” in accordance with “the nature of students’ development, learning, and motivation” (pp. 15-16).  After arguing the incompatibility of these three, Egan suggests a fourth of his own, sketching a sort of Vygotskyan recapitulation theory.

There are several problems with Egan’s taxonomy.  Prime among these is a forced mischaracterization of Plato, with whom Egan groups E. D. Hirsch.  Even in Egan’s own words, Plato hoped for students to acquire “the ability to reflect on ideas, to pull them this way and that,” not only to acquire an inert body of knowledge (p. 13).  Plato and Hirsch may be similar in as much as they recommend a curriculum, but they have very different aims.  Hirsch’s ideas clearly fall in Egan’s first category of socialization – his concern is with transmitting a homogenizing “cultural literacy” in order to ensure, for example, a “common reader” (Hirsch, 1987).

Plato’s goals for education are being confused with his pedagogy in Egan’s interpretation.  The aims should be considered independent of the nature or effectiveness of the curriculum.  Regarding his underlying philosophy of education, scholars of The Republic have noted that Plato was intensely aware of the importance of the environment in the development of individuals, and that since the uncultivated environment is often imperfect, it “can be counteracted only by creating a power for good as penetrating, as unconscious, and as universal; and to do this is the true function of a public system of education” (McClintock, 1880/1968, p. 7).  Bosanquet offers that Plato’s school is “the space and atmosphere needed for the human plant to throw out its branches and flowers in their proper shape,” which sounds very much like Egan’s third category (1900, p. 12).

Indeed, Egan’s second, “Platonic” category is a phantom, on the one hand aggrandizing socialization and on the other trivializing the developmental goals shared by Plato and Rousseau alike.  Egan shares these goals as well, and the “new idea” he offers is pedagogical, a different route but not a different destination.  Egan notes that Rousseau saw his work as “a kind of supplement to The Republic,” and upon examination it is clear that there is, beyond pedagogy, no underlying conflict between Plato and the theorists of Rousseau’s tradition (p. 15).

After clarification of Egan’s four aims there are only two.  The first is the expansion of the socialization category to include the transmittance of any inert knowledge, social or otherwise.  The important note here is that transmittance of knowledge as a goal is different from the use of transmittance of knowledge to achieve other goals.  This goal of transmittance is very specific.  It is the goal of Hirsch, the philosophy of Thorndike, the Lockean model of “the child as empty vessel or blank slate” (Lillard, 2005, p. 9).

The second category of aim is that of Rousseau, Plato, and Egan, which has yet to be fully articulated.  The lack of explicit language about this alternative fundamental aim means that in some sense “educators must establish new goals for learning” (Grabinger, p. 667).  This process can begin with a review of the major relevant educational theorists starting, as Egan indicates, with the work of John Dewey.

Dewey, “the man acknowledged to be the pre-eminent educational theorist of the twentieth century,” has conspicuously little to say about fundamental aims in his Experience and Education (1938/1997).  It reflects the characteristic neglect of the issue that there is so little in this work explicitly addressing it.  One might think the reason for this is that it is taken for granted, as a closed question, but it is then a bit surprising when Dewey explicitly claims that “The ideal aim of education is creation of power of self-control” (p. 64).

This unexpected announcement comes in a discussion of Dewey’s notions of social control and the true nature of freedom as freedom of thought, not freedom of action.  He argues that thinking is “a postponement of immediate action” (p. 64).  Much more recent evolutionary theory such as that espoused by Devlin suggests that “off-line thinking,” thought not connected to physical action, is a key factor differentiating humans from other animals (2000).

Other more recent educational theorists have echoed Dewey more or less subtly, as illustrated for example in the title of Bandura’s 1997 Self-Efficacy: The Exercise of Control.  It was also Bandura who said that “self-reflection is the most uniquely human characteristic,” (qtd. in Pajares) aligning with Rousseau’s naturalism in the modern theory of Devlin and others.  This focus on self-control as the product of education, connected as it is to evolutionary psychology, can then be viewed as a result of following Rousseau’s advice to “fix your eye on nature [and] follow the path traced by her” (qtd. in Egan, p. 16).  Piaget said that scientific knowledge is “drawn in large part from common sense,” referring to his fundamental hypothesis of genetic epistemology, “that there is a parallelism between the progress made in the logical and rational organization of knowledge and the corresponding formative psychological processes.” (1968)  This view of knowledge implies that anything one might learn is in some sense a result of the nature of humanity – an interesting formalization of the view that education makes us “more human.”

The two goals then can be contrasted as external versus internal.  The first goal, previously called socialization, is to transmit external knowledge into the student, enforcing social rules, for example, from the outside.  The second goal, of Rousseau, Dewey, and the rest, is to develop the student’s natural internal mental faculties.  One might question the extent to which these goals are incompatible, or, if one were Hirsch, whether the second goal is realistically possible, meaningful, or adequate.

Dr. Maria Montessori experimentally developed a system of education that focuses on the work of students, with minimal transmittance in the usual sense of moving information from teacher or textbook into the student.  She observed that in the proper environment discipline “sprang up spontaneously” and that from a student body of children judged passive or domineering, good or bad, “there remain[ed] only one kind of child” (1967, p. 202).  She called this phenomenon “normalization,” and characterized its discovery as “the most important single result of [her] whole work” (p. 204).  In as much as the goal of Hirsch is a behaviorist goal, concerned with instilling good (socially acceptable, beneficial, hard-working) behavior in students, Montessori provides empirical observations that such behavior can arise from students themselves, if the school setting is appropriately constructed.  Plato’s interpreters would agree that in the “proper space and atmosphere” emerges “this divine humanity which is in the truest sense the self” (Bosanquet, 1900, p. 12; McClintock, 1880/1968, p. 22).

Our faith in human nature can extend beyond behavior of this kind, and in particular to language.  Human children learn to speak and understand their native language without being explicitly taught – an often neglected marvel.  Montessori herself noticed “the many wonders of the language mechanism,” anticipating in some ways the modern linguistics often identified with Noam Chomsky, that “there is something special about the human mind that equips it to acquire language” (Montessori, 1967, p. 116; O’Grady, Archibald, Aronoff, & Rees-Miller, 2005, p. 390).  In this view, language is not learned only from external sources, but is to a degree inborn in all humans as a “Universal Grammar.”  This theory is supported by a number of arguments, as well as the apparent existence of a critical period for language learning (O’Grady, Archibald, Aronoff, & Rees-Miller, 2005).

One’s native language is typically learned before schooling begins, but the nativist idea extends beyond language.  For example, the principal claim of Devlin’s The Math Gene is that “the feature of our brain that enables us to use language is the same feature that makes it possible for us to do mathematics” (p. 70).  Since everyone has the capacity for language, everyone has the capacity for mathematics – a traditional bastion of education.  The reason people appear to vary so much more in mathematical fluency than in linguistic fluency is then a combination of the differences in environments that people are exposed to and the unnatural forms that are imposed by traditional education on mathematics.

At least in these three areas – behavior, language, and mathematics – theory and research support the view that they are natural human abilities and so are definitely developed in accordance with the second or naturalist goal for education.  Generalizing broadly, it can be said that anything humans do, no matter how far removed from the struggle for survival, is by definition a naturally human behavior.  In this way anything one might want to transmit to a student was produced by someone, and could also be produced by the student.  There is no real conflict between “natural” and “school” capabilities; the second goal can achieve the aims of the first as well.

The converse, that inculcation of knowledge should develop the mental faculties of the student, is not so clear, and in fact it seems not to be true in general.  Research indicates that “knowledge acquired in abstract circumstances without direct relevance to the needs of learners is not readily available for application or transfer to novel situations” (Grabinger).  It seems that learning many facts or examples does not always lead to generalized mental concepts or what Beers and others call metacognitive skills.  As Beers asked a teacher who had just spent a class explaining one short story to her class, “what did you show them that they could use with another story?” (Beers, 2003, p. 51)

Standardized tests provide a good illustration of this point.  Although of course such tests vary in quality, test-takers with advanced cognitive skills provided even meager content exposure can generally be expected to test well.  Those with extensive externally collected content knowledge may have positive results on the relevant tests, but will not necessarily develop cognitive skills in the process of internalizing this data.  Their knowledge is not transferable or applicable, and hence not useful.  So while well-written standardized tests may provide a rough measure of cognitive ability, they encourage or at least allow the test-specific alternative.

Important mental skills are not served well by this first goal.  As Grabinger reports, there is a difference between intentional and incidental learning (p. 672).  Thorndike’s own 1898 dissertation research showed that a teacher’s instruction in one specific task did not help students perform other related tasks.  Thorndike concluded that the problem was with the human mind rather than with the nature of the instruction or evaluation.  The Deweyan goal for education provides a necessary alternative to this bleak outlook.

Jerome Bruner “take[s] as perhaps the most general objective of education that it cultivate excellence” and explains that this goal means “helping each student achieve his [or her] optimum intellectual development” (1960/1977, p. 9).  What is meant by “excellence” or “optimum intellectual development” is not necessarily well-defined.  Firstly, “optimum” may imply a stopping point.  This language can however be alternatively understood to not contradict the common sentiment that one should always “desire to go on learning,” so that it is this process of learning that proceeds optimally rather than merely reaching an optimal level (Dewey, 1938/1997, p. 48).  Secondly, specifying each student’s optimum begs the question of how much students might vary and how this should be taken into consideration.

Since education cannot very well affect the genetic nature of students, one would hope that the natural endowments of students would not vary too wildly, and this appears to be the case.  The three examples above, most notably the cognitive capability for mathematics, are human universals.  Humans are genetically more similar than different, and the degree to which internal characteristics appear depends largely on the outside environment (Ridley, 2003).  Across many fields, “the preponderance of psychological evidence indicates that experts are made, not born” (Ross, 2006).

Having analyzed the developmental aim of education, some consideration of how it may affect the substance of education is in order.  Neither category of goal specifies a particular classroom, content, pedagogy, and so forth, but the second goal does not even specify that there might be a fixed curriculum at all.  However, neither does it preclude fixed content or systems of instruction – in fact all the major theorists demand structure of some kind in education (e.g., Dewey, Montessori).  While it is not specified that schools should include a fixed list of disciplines for study, the second goal certainly allows for them, and provides an overarching principle for organizing and motivating their instruction.  Considering the example of language, we see that children learn to speak and understand through immersion in speech communities in which they are free to participate fully – this can provide a model for what we call “learning communities.”  Schools provide communities that might not be available elsewhere, identified by rigorous, disciplined thought, where teachers should provide examples of the appropriate fluencies.

The educational goal of developing students’ intellectual abilities is well-founded in this analysis.  Students possess natural capabilities that may be brought out by education, and in fact anything that might be taught can be so developed naturally.  It is a more robust goal as compared to the transmittance of knowledge alone, since it can effectively encourage the other while the reverse is questionable.  It is a goal that can benefit all students, and it is the goal advocated, explicitly or not, by nearly all modern philosophers of education.  Arriving at this goal provides a solid foundation from which to build educational programs and evaluate existing systems, but it is clearly a starting and not a finishing point.  Important questions of what intellectual faculties are and how they can be developed remain, only hinted at herein.  As with all things, it is important to identify and understand the questions before one can reasonably hope to find answers.

We are all aware, probably, that the word “school” is derived from a Greek word meaning “leisure.”  This conception of “leisure” is one of the greatest ideas that the Greeks have left us.  It is not that of amusement or holiday-making.  It is opposed both to this and to the pressure of bread-winning industry, and indicates, as it were, the space and atmosphere needed for the human plant to throw out its branches and flowers in their proper shape.  “To have leisure for” any occupation, was to devote yourself to it freely, because your mind demanded it ; to make it, as it were, your hobby. It does not imply useless work, but it implies work done for the love of it.  In the modern world leisure is a hard thing to get ; and yet, wherever a mind is really and truly growing, the spirit of leisure is there.  It is worth thinking of, how far in education the idea of the growth of a mind can be made the central point, so that the things which are considered worth teaching may really have time to sink into and to nourish the whole human being, morally and intellectually alike.

(Bosanquet, 1900, pp. 11-12)

References

Beers, K. (2003). When kids can’t read: What teachers can do. Portsmouth, NH: Heinemann.

Bosanquet, B. (1900). Introduction. In The education of the young in The Republic of Plato (pp. 1-23) [Introduction]. London: Cambridge University Press.

Bruner, J. S. (1977). The process of education. Cambridge, MA: Harvard University Press. (Original work published 1960)

Department of Education Organization Act, 20 U.S.C. § 3402 (1979), http://www4.law.cornell.edu/uscode/html/uscode20/usc_sec_20_00003402----000-.html.

Devlin, K. (2000). The math gene: How mathematical thinking evolved and why numbers are like gossip. Basic Books.

Dewey, J. (1997). Experience and education (Touchstone ed.). Kappa Delta Pi lecture series. New York: Simon & Schuster. (Original work published 1938)

Egan, K. (1997). Three old ideas and a new one. In The educated mind: How cognitive tools shape our understanding (pp. 9-32). Chicago: The University of Chicago Press.

Frank, S. (1947). Education of women according to Plato. In Plato’s theory of education (pp. 287-308). New York: Harcourt, Brace and Company.

Grabinger, R. S. (n.d.). Rich environments for active learning.

Hirsch, E. D., Jr. (1987). The practical outlook. In Cultural literacy: What every American needs to know (pp. 134-145). Boston: Houghton Mifflin.

Lillard, A. S. (2005). Montessori: The science behind the genius. New York: Oxford University Press.

McClintock, R. L. (1968). The theory of education in the Republic of Plato. New York: Teachers College Press. (Original work published 1880)

Montessori, M. (1967). The absorbent mind (C. A. Claremont, Trans.). New York: Holt, Rinehart and Winston.

O’Grady, W., Archibald, J., Aronoff, M., & Rees-Miller, J. (2005). Contemporary linguistics: An introduction (5th ed.). Boston: Bedford/St. Martin’s.

Pajares, F. (n.d.). Self-efficacy beliefs in academic contexts: An outline.

Piaget, J. (1968). Genetic Epistemology (E. Duckworth, Trans.). New York: Columbia University Press.

Ridley, M. (2003). The agile gene: How nature turns on nurture. New York: HarperCollins.

Ross, P. E. (2006, August). The expert mind. Scientific American. Retrieved August 20, 2006, from http://www.sciam.com/‌article.cfm?articleID=00010347-101C-14C1-8F9E83414B7F4945&ref=sciam&chanID=sa006

Whitehead, A. N. (1978). Part II, Chapter I, Section I. In D. R. Griffin & D. W. Sherburne (Eds.), Process and reality: An essay in cosmology (Corrected ed., pp. 39-42). New York: The Free Press. (Original work published 1929)

NYC Test Data

A series of posts analyzing publicly available New York City Math and English Language Arts (ELA) standardized test results. There are a lot of graphs. Code is on GitHub.

  1. Putting the data together and looking at it
  2. Checking out the number of students tested in Math and ELA
  3. Checking out the number of students tested in Math and ELA again
  4. The total number of students and tests
  5. The total number of students and tests by grade
  6. Considering District 75 schools
  7. The total number of tests by grade viewed by cohort
  8. Number of students tested at the school grade subject level
  9. Normalizing the distributions of average scores
  10. Schools fight the Law of Large Numbers
  11. Changes in average scores for school grades and cohorts
  12. Changes in scores by year – where is the Common Core shake-up?

 

Building Predictive Models for NYC High Schools

I don’t have anything to say about building predictive models today myself, but I did want to share this paper from Alec Hubel, which demonstrates some of the really interesting things that you can do even with just the publicly available education data out there on the net.

Common warnings about correlation vs. causation in education definitely apply, but I think this provides a really neat look at some of the data that NYC makes available. I like it even just for the second figure, which is a simple but rarely seen visualization that makes it easy to explore some of the patterns across districts just by looking.

Thanks Alec!

NYC standardized test results: Changes in average scores by year – where is the Common Core shake-up?

New York City changed its Math and ELA tests in 2013, aligning them to the Common Core. This was billed as a big shift, testing deeper concepts and so on. We’ve seen that the distribution of scores shifted down dramatically in 2013, and results aren’t being reported for District 75 schools any more. Percent proficient is also down a lot, but that’s like saying more people are now shorter than a stick you raised in the air, so I’m not paying it any mind. The overall position of the distribution of scores is also pretty much arbitrary. It would be interesting if school grades changed position within the distribution more from 2012 to 2013 than between other years, which would indicate that the tests changed more from 2012 to 2013 than between other years. Do we see this?

Figure 12a. Changes in average Math and ELA test scores for same grade year to year and same cohort grade to grade, 2006-2013

Nope. What if we use this other technique to give our eyes some help?

Figure 12b. Density of changes in average Math and ELA test scores for same grade year to year and same cohort grade to grade, 2006-2013

Still nope. If anything, the 2013 results resembled the 2012 results more than other years have. Any way you look at it, the 2013 tests don’t seem to have shuffled NYC schools any more than other years’ tests. Of course, this would be more meaningful if there were in general less shuffling with every new test administration. I’m still not over how little stability there is in the school average scores.

[table of contents for this series]

NYC standardized test results: Changes in average scores for school grades and cohorts

I’ve normalized and re-normalized these average scores so that I can compare the scores across years and across grades. Now, there are good reasons not to do this. The tests aren’t vertically aligned. The fourth-grade test in math could be on different material than the third-grade test, etc. I don’t have student-level information, so I don’t know which students are really in these averages. Tests are evil. And so on, etc., etc. I’m going to do it anyway, and hopefully I’ll be sufficiently critical of any results.

I suspect that, as was mostly the case for the number of students tested, test performance will vary more at the same school and same grade from one year to the next (4th grade 2008 to 4th grade 2009) than at the same school and same cohort from one year to the next (4th grade 2008 to 5th grade 2009). This would indicate that the test measurement error (at the school grade level, for this data) is smaller than the variability of classes at schools. If this is the case then there’s some hope for using these averages to try to get a sense of how well schools are educating their students. If not, then we’ll have less confidence about many things.
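To make the two kinds of comparison concrete, here is a minimal sketch of how the changes might be computed. It is not the code behind the figures in this post: the function name, the dictionary layout, and the example values are all made up for illustration, and the "random records" baseline shown in the figures is left out.

```python
def score_changes(averages):
    """Split one-year changes into same-grade and same-cohort changes.

    `averages` maps (school, grade, year) to a normalized average score.
    This layout is an assumption for the sketch, not the real data format.
    """
    same_grade, same_cohort = [], []
    for (school, grade, year), score in averages.items():
        # Same grade, next year: e.g. 4th grade 2008 to 4th grade 2009.
        next_grade_score = averages.get((school, grade, year + 1))
        if next_grade_score is not None:
            same_grade.append(next_grade_score - score)
        # Same cohort, next year: e.g. 4th grade 2008 to 5th grade 2009.
        next_cohort_score = averages.get((school, grade + 1, year + 1))
        if next_cohort_score is not None:
            same_cohort.append(next_cohort_score - score)
    return same_grade, same_cohort


# Hypothetical normalized averages for one made-up school, not real NYC data.
example = {
    ("X001", 4, 2008): 0.5,
    ("X001", 4, 2009): -0.25,
    ("X001", 5, 2009): 0.75,
}
grade_changes, cohort_changes = score_changes(example)
```

If cohorts really were stabler than grades, the values in `cohort_changes` would cluster more tightly around zero than the values in `grade_changes`.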

Well, here’s the result:

Figure 11-1a. Changes in average Math test scores for random records, same grade year to year, and same cohort grade to grade

That’s a little disappointing. Cohort doesn’t look appreciably stabler than grade, if at all. Going into sixth grade, the distribution isn’t even centered at zero! Of course we only see schools that have fifth and sixth grades there, which isn’t as many schools. But what’s causing that? Do those schools have an influx of smart middle-schoolers who weren’t in their fifth grades? That seems unlikely. Is it a result of how students move around system-wide between fifth and sixth grades? Is it an artifact of my chosen normalizing method? Curious. Here’s the same graph for ELA.

Figure 11-1b. Changes in average ELA test scores for random records, same grade year to year, and same cohort grade to grade

Okay actually, before moving on, here’s one more look at changes by grade. I really wanted them to be stabler for cohorts. The density plots below could help deal with overplotting issues, but the conclusion is about the same. Cohorts are particularly strange going from fifth to sixth grade. For ELA, cohort scores might be a little stabler going into seventh and eighth grades, but it doesn’t make me particularly thrilled.

Figure 11-2. Density of changes in average Math and ELA test scores for same grade year to year and same cohort grade to grade

I can’t think of a way for the above to be a good result for anybody, aside from finding something interestingly weird about fifth-to-sixth-grade cohorts.

[table of contents for this series]

NYC standardized test results: Schools fight the Law of Large Numbers

After all the hemming and hawing over choices that make no visible difference in the plots of this post, I’ve decided I like the bottom right option from the last post the most – subtracting out the estimated student-level median and then dividing by median absolute deviation. So be it!
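For reference, the normalization chosen above can be sketched roughly like this. It is a simplified stand-in, not the code from the series: by default it centers on the plain median of the values passed in, whereas the version described above uses an estimated student-level median (which could be supplied through the hypothetical `center` argument). The example numbers are invented.

```python
from statistics import median


def normalize_by_median_mad(scores, center=None):
    """Subtract a center (default: median of scores) and divide by the MAD."""
    if center is None:
        center = median(scores)
    # Median absolute deviation: the median distance from the center.
    mad = median(abs(s - center) for s in scores)
    if mad == 0:
        raise ValueError("MAD is zero; scores cannot be scaled")
    return [(s - center) / mad for s in scores]


# Hypothetical school grade average scale scores, not real NYC data.
averages = [650.0, 660.0, 670.0, 680.0, 700.0]
normalized = normalize_by_median_mad(averages)
```

After this step a value of 1 means "one median absolute deviation above the center", which is what lets averages from different years and grades sit on the same axis.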

The question of this post, however, is how the variability of school grade average scores changes with the number of tested students. Smaller samples generally produce more extreme results. There’s less chance for regression to the mean, as it were. The Law of Large Numbers hasn’t had a chance to take effect. If we’re drawing randomly, then the variance of the sample mean gets small when the sample size is big. Does that happen for school grade averages?
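For independent random draws with variance σ², the variance of the mean of n draws is σ²/n. A small simulation can illustrate that baseline; it is only a sketch under the assumption of independent standard normal "scores", which real students in real schools certainly are not, and the function name and parameters are made up here.

```python
import random
from statistics import pvariance


def variance_of_sample_means(sample_size, trials=2000, seed=0):
    """Estimate the variance of the mean of `sample_size` random draws."""
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        # Each "school grade" is a random sample of standard normal scores.
        sample = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        means.append(sum(sample) / sample_size)
    return pvariance(means)


small = variance_of_sample_means(10)    # theory says about 1/10
large = variance_of_sample_means(200)   # theory says about 1/200
```

Under random assignment, a plot of averages against number tested would narrow like a cone as the number of students grows. Departures from that shape are where the non-random sorting of students into schools shows up.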

Figure 10a. Normalized average Math scores vs. number of tested students for non-D75 NYC public schools (charter and non-charter) grades 3-8, 2006-2013

It does, to some extent. The graphs are vaguely cone-shaped. (ELA is a little more cone-shaped; see below.) Of course, students are not randomly assigned to schools, which is what makes these graphs particularly interesting. The elementary grades (including 3 to 5) are generally smaller, yet it looks like the elementary grade averages approach the student average even while the middle school grades (6 to 8) vary widely. This seems consistent with the idea of smaller local elementary schools that educate mostly whoever is nearby, with more of a filtering effect starting with middle school – a parent might be more likely to support a longer commute to a better school, which also means the better school draws good students from a wider area.

It is good news that the school scores do mostly seem to converge toward average when there are more students. If there were perfect segregation of better and worse-performing students we could see all these averages avoiding the center red lines entirely.

If you squint a little, it looks like there somehow aren’t any middle schools with around 200 students in a grade that perform much above average in ELA. You can almost see something like that for math too. Weird. (Code.)

Figure 10b. Normalized average ELA scores vs. number of tested students for non-D75 NYC public schools (charter and non-charter) grades 3-8, 2006-2013

[table of contents for this series]