NYC goes back to school: Attendance numbers show overall ramp-up

I’ve been archiving (and tweeting) NYCDOE attendance numbers since last spring, so this is the first fall I’ve archived. I was interested to see what the start of the school year looks like. As I sort of guessed, there is a clear “ramp-up” in the system overall, in addition to the usual weekly pattern:

[Figure: overall attendance]

But there’s also huge variability; schools are all over the place when it comes to attendance:

[Figure: attendance for individual schools]

And schools behave very differently – for example, Stuyvesant doesn’t seem to follow the overall trend at all:

[Figure: Stuyvesant attendance]

I suspect that attendance patterns carry information about school performance beyond a school’s raw average attendance. As a first guess, perhaps a more prominent “ramp-up” is associated with lower academic performance as measured by standardized tests. I think this could be interesting to investigate.

It would be great if more people took a look at these attendance numbers, and also if we could get more similar data from other school systems!

Polya’s How to solve it: Quotes and comments

The first rule of style is to have something to say.

[Image: How to Solve It]

I wish I had read How to Solve It when I was a fairly young student of mathematics. I wish also that I had read it when I was becoming a teacher of mathematics. It has been recommended many times, and now I will recommend it also. Really first-rate stuff. Everyone’s excited these days about Thinking, Fast and Slow, but if you’re moving into real problem-solving that necessitates “slow” mental work, it’s Polya who really has something to say. His comments extend beyond the core problem-solving theses as well. I include here some quotes of particular note.

Mathematics is interesting in so far as it occupies our reasoning and inventive powers.

Can our knowledge in mathematics be based on formal proofs alone? … It is certain that your knowledge, or my knowledge, or your students’ knowledge in mathematics is not based on formal proofs alone. If there is any solid knowledge at all, it has a broad experimental basis, and this basis is broadened by each problem whose result is successfully tested.

Definitions in dictionaries are not very much different from mathematical definitions in the outward form but they are written in a different spirit.

The writer of a dictionary is concerned with the current meaning of the words. He accepts, of course, the current meaning and states it as neatly as he can in form of a definition.

The mathematician is not concerned with the current meaning of his technical terms, at least not primarily concerned with that. What “circle” or “parabola” or other technical terms of this kind may or may not denote in ordinary speech matters little to him. The mathematical definition creates the mathematical meaning.

Teaching to solve problems is education of the will.

Another “problem to prove” is to “prove the theorem of Pythagoras.” We do not say: “Prove or disprove the theorem of Pythagoras.” It would be better in some respects to include in the statement of the problem the possibility of disproving, but we may neglect it, because we know that the chances for disproving the theorem of Pythagoras are rather slight.

If the student failed to get acquainted with this or that particular geometric fact, he did not miss so much; he may have little use for such facts in later life. But if he failed to get acquainted with geometric proofs, he missed the best and simplest examples of true evidence and he missed the best opportunity to acquire the idea of strict reasoning. Without this idea, he lacks a true standard with which to compare alleged evidence of all sorts aimed at him in modern life.

[Not all mathematical theorems can be split naturally into hypothesis and conclusion. Thus, it is scarcely possible to split so the theorem: “There are an infinity of prime numbers.”]

That last is one place where I may disagree with Polya. The split is easy and natural: “Assuming our usual definitions (hypothesis) there are an infinity of prime numbers (conclusion).” It’s easy to forget that what we consider natural, such as “the natural numbers,” is just a made-up mathematical system when considered formally. Where Mathematics Comes From is a good exposition on this sort of thing, I think. And here, by way of contrast, is perhaps my favorite quasi-philosophical quote from Polya’s book:

Do not believe anything but doubt only what is worth doubting.

Perplexity: what it is, and what yours is

[Test your perplexity!]

Perplexity is an interesting measure of how well you’re predicting something. It has a nice mathy definition; I’ll try to describe it quite simply here.

Say the real thing you want to predict is the sequence of numbers from one to six. If you predict each number in turn with a six-sided die, you will be right about one sixth of the time. The die’s perplexity with this data (and really with any data it could possibly predict – dice are dumb) is six.

The perplexity of whatever you’re evaluating, on the data you’re evaluating it on, sort of tells you “this thing is right about as often as an x-sided die would be.”

Note that a human, after seeing that the first number is 1 and the second is 2, might guess that the third is 3, and so on. So a human is a better predictor than a die, and would have a lower perplexity. So would a linear model trained on the data seen so far, in this case. That is, whatever you’re evaluating (a model, or whatever) is allowed to see the data so far before it predicts what the next thing is, and that is frequently helpful.
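For the curious, here is a minimal sketch of the “mathy” definition in Python: perplexity is the exponential of the average negative log probability that the predictor assigned to what actually happened. The function name `perplexity` and the `learner` probabilities are made up for illustration; only the die comes from the example above.

```python
import math

def perplexity(probs_of_actual):
    """Perplexity from the probabilities a predictor gave to what actually happened."""
    avg_log = sum(math.log(p) for p in probs_of_actual) / len(probs_of_actual)
    return math.exp(-avg_log)

# A fair six-sided die always gives probability 1/6 to the true number.
die = [1 / 6] * 6

# Hypothetical pattern-spotting predictor on the sequence 1..6 (assumed numbers):
# clueless on the first two values, then 90% confident in the right answer.
learner = [1 / 6, 1 / 6, 0.9, 0.9, 0.9, 0.9]

print(perplexity(die))      # about 6
print(perplexity(learner))  # lower than 6
```

The second predictor scores better only because it is allowed to look at the data so far before each guess, which is exactly the point made above.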

This is interesting and easily calculable for predicting categorical things like letters, or words. Maybe you’re a computational linguist and you want to measure how well your model can predict the next English word in a book. Humans are pretty good at this, by the way. (“The early bird gets the ___?”) Computers can achieve a perplexity of 247, apparently – like having a 247-sided die predict each subsequent word. (The computer “makes a new die” for each prediction, based on the words so far.)

You can also predict one letter at a time, and you’ll get a much lower (better) perplexity, if only because there are far fewer distinct letters than distinct words. This has even been done with human guessers, as described in this interesting article about Andrei Kolmogorov:

To measure the artistic merit of texts, Kolmogorov also employed a letter-guessing method to evaluate the entropy of natural language. … His group conducted a series of experiments, showing volunteers a fragment of Russian prose or poetry and asking them to guess the next letter, then the next, and so on.

I know you’re itching to participate in such an experiment – and you can! On the linked page you’ll be able to try your hand. It’s interesting to consider whether you’re evaluating the artistic merit of the text, or your own knowledge of English, or some combination, or what.

Computers can predict letters pretty well – a perplexity of about 3.4. And that’s predicting from the range of all printable characters (well, ASCII characters). This human test only asks you to predict from a range of about 30 characters, which is a bit easier. So I hope you can beat 3.4!
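Since the Kolmogorov quote talks about entropy while this post talks about perplexity, it may help to see that they are the same quantity on different scales: perplexity is 2 raised to the entropy in bits, so entropy in bits is the base-2 log of perplexity. A small sketch, using the perplexity figures quoted in this post (the helper name `bits_per_symbol` is just for illustration):

```python
import math

def bits_per_symbol(perplexity):
    """Convert a perplexity into the equivalent entropy in bits per symbol."""
    return math.log2(perplexity)

# Perplexity figures as quoted in the text above, plus the die for reference.
for label, ppl in [("word model", 247), ("character model", 3.4), ("six-sided die", 6)]:
    print(f"{label}: perplexity {ppl} is {bits_per_symbol(ppl):.2f} bits per symbol")
```

So a character-level perplexity of 3.4 corresponds to a bit under two bits per character.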

[Test your perplexity!]

Thanks to David for originally leading me to think about perplexity!

A sensible characterization of mode, median, and mean

Often “types of data” are introduced all together, and then “measures of central tendency” are introduced all together. By “types of data” I mean nominal, ordinal, and numeric (leaving aside interval vs. ratio). By “measures of central tendency” I mean mode, median, and mean.

A common response to this exposition, even if median is justified with reference to skew, is that mode is a stupid thing and its inclusion in the list is almost insulting.

A much nicer exposition would introduce each type of data together with the “measure of central tendency” that is in some sense the best you can do for that type of data.

With nominal data the best you can do is frequency counts. Mode reports the most common thing. The mode of an election is (often) the winner. This is useful.

With ordinal data you can do better by putting everything in order. Now even if there are 8 A’s, 6 B’s, and 7 C’s, so that A is the mode, B is still more representative for its middle-ness: it is the median.

Finally with numeric data you can take the mean, and you may want to or you may not want to, but people often do and they may well be right to.

Described this way, the types of data and the ways of measuring them have a pleasant pattern – an organized relationship that gives every part more meaning.
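The pairing can be sketched with Python’s standard `statistics` module, using the grade counts from the example above. Two assumptions to flag: sorting the letters alphabetically happens to match the grade order here, and the numeric `scores` list is invented purely for illustration.

```python
import statistics

# 8 A's, 6 B's, and 7 C's, as in the example above.
grades = ["A"] * 8 + ["B"] * 6 + ["C"] * 7

# Nominal view: only counting is meaningful, so report the most common value.
print(statistics.mode(grades))        # A

# Ordinal view: order is meaningful, so take the middle item.
# median_low always returns an actual data point, so it never averages labels.
print(statistics.median_low(grades))  # B

# Numeric view: arithmetic is meaningful, so a mean is available.
scores = [70, 85, 90]  # hypothetical numeric data
print(statistics.mean(scores))
```

Each step up in data type makes one more summary available without taking away the earlier ones.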

Intro statistics textbook authors, you may update your texts! (Do any already point this out? I don’t recall ever seeing it written.) (Also, I don’t particularly like the term “measure of central tendency” so I keep it in scare quotes throughout.)

Information Dashboard Design by Stephen Few (Second Edition)

The second edition of Stephen Few’s book on making dashboards came out about two weeks ago. I read it, not having read the first edition. In some ways Few is just Tufte for MBAs, but he does have bullet graphs in addition to sparklines, and he focuses on and provides examples of dashboards.

[Image: book cover, second edition]

The cover of the second edition (above) is way better than the cover of the first edition (below) because it features a complete example of the style of dashboard that Few advocates. If you understand everything on the cover above, you understand the entire book. It’s a really neat sort of meta-visualization. This old cover is rather awful:

[Image: book cover, first edition]

The other great thing, distinct to the new edition, is that it features multiple examples of dashboards for teachers, displaying student data. Neat! Mr. Few facilitated a dashboard design competition in 2012, which I had been unaware of. The new edition features the two best submissions, some more examples that weren’t as good, and Few’s own creation, which I think would look a lot better with different color choices. (You can see a lot of this content on Few’s blog, as linked.) The education use case is very interesting to me. I’d like to see the principles of this book applied to, say, NYC’s ARIS data system. I wonder how other education data system vendors’ products stack up!

I think Stephen Few is fundamentally right about dashboard design. The only thing I would add or discuss further is the primacy of analysis. Said another way, the dashboard should focus on communicating reality, not on communicating metrics. People who think they know what metrics they’re interested in are very often wrong. There may be better metrics, or (more likely) it may be that finding a way to present more of the data without reducing it to metrics allows communication of a more complete picture. This could require the use of perhaps more expressive, possibly less conventional means even than sparklines and bullet graphs. But you should certainly know what’s in this book.