This first article in a series of eight is derived from Jonathan Bartlett’s new concept of Generalized Information. The series addresses teaching machines to generalize instead of memorize, with many accessible examples.
Why do we think that teachers should not “teach to the test,” that is, focus on getting students to pass a standardized test? Passing the test, after all, gets the student good grades, and getting good grades makes the student employable. Not only does teaching to the test make the teacher look good, but surely it also is best for the student. After all, if the student is not ready for the test, he may fail. He could end up on the streets. If the test means the difference between having a job and being homeless, we might as well just give the student the answer sheet…
Now, of course, we see the problem. Teachers are meant to teach knowledge, not answers to a test. The test is merely an indicator of knowledge, and indicators are not the things they indicate. Employers value good test scores not because they care about the test but because they care about the student’s ability to acquire and use knowledge effectively. If the teacher just gives the student the answer sheet, then the test loses all of its significance as a knowledge indicator.
This is an example of a principle known as Goodhart’s Law. The law states that whenever a metric becomes a goal, it loses its value as a metric. It is often illustrated with the USSR (1922–1991), where the central government would legislate the quotas of goods for factories to produce. If the quota was expressed in terms of quantity, say X number of nails, then the factories would create the skinniest, smallest nails possible to meet it. If the quota was expressed in terms of weight, then the factories would plop out a few oversized nails. Industry faced a constant shortage of needed materials, because producing them did not help meet the quota. Goodhart’s Law summarizes this general trend: Once we substitute a measure of progress toward a goal for progress itself, then we begin optimizing for the metric and cease making progress.
For example, if we replace the goal of acquiring knowledge with a measure of progress toward the goal (the test), then we churn out students with high scores and no understanding of the subject matter. Students can cram for the exam in a caffeine-fueled study binge, and then purge all the answers the next morning. They may make it into Ivy League schools with their superior cramming abilities, but that provides no guarantee that they can actually apply the subjects they crammed.
This brings us to the subject of machine learning. Just as, with human learning, we want to ensure that students learn the material and not the answer key, so it is with machine learning. We want the machine learning algorithms to learn general principles from the data we provide, and not merely little tricks and nonessential features that can generate high accuracy scores on the training data but do not apply to the problem domain in general.
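To make the contrast concrete, here is a minimal Python sketch. It is a toy setup of our own, not something taken from the series: the hidden rule `y = 2x + 1`, the class names `Memorizer` and `LineFitter`, and the `accuracy` helper are all hypothetical. One "student" stores the answer key verbatim, the other fits a straight line by ordinary least squares, and both are then scored on questions they have seen and on questions they have not.

```python
# Toy illustration only: the rule, the class names, and the helper
# below are assumptions made for this sketch, not from the article.

def true_rule(x):
    """Hidden rule the 'student' is supposed to learn."""
    return 2 * x + 1


train = [(x, true_rule(x)) for x in range(10)]          # the "answer key"
held_out = [(x, true_rule(x)) for x in range(10, 20)]   # unseen questions


class Memorizer:
    """Stores the answer key verbatim; guesses 0 for anything unseen."""

    def fit(self, pairs):
        self.table = dict(pairs)
        return self

    def predict(self, x):
        return self.table.get(x, 0)


class LineFitter:
    """Learns a slope and an intercept by ordinary least squares."""

    def fit(self, pairs):
        n = len(pairs)
        mean_x = sum(x for x, _ in pairs) / n
        mean_y = sum(y for _, y in pairs) / n
        cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
        var = sum((x - mean_x) ** 2 for x, _ in pairs)
        self.slope = cov / var
        self.intercept = mean_y - self.slope * mean_x
        return self

    def predict(self, x):
        return self.slope * x + self.intercept


def accuracy(model, pairs, tol=1e-6):
    """Fraction of questions answered correctly, within a small tolerance."""
    hits = sum(abs(model.predict(x) - y) <= tol for x, y in pairs)
    return hits / len(pairs)


memo = Memorizer().fit(train)
line = LineFitter().fit(train)

for name, model in [("memorizer", memo), ("line fitter", line)]:
    print(name, "train:", accuracy(model, train),
          "held out:", accuracy(model, held_out))
```

Both students ace the questions they were trained on, so the training score alone cannot tell them apart. Only the held-out questions expose the memorizer. Real data is noisier than this toy rule, so the gap is rarely this stark, but the pattern is the same.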
But how can we guarantee that an algorithm, or a student, has truly learned a general model, and not just memorized an answer key? Learn how we can eliminate cheaters in our next installment!
You might also enjoy some of Eric Holloway’s thoughts on AI:
“Friendly” artificial intelligence would kill us. Is that a shocking idea? Let’s follow the logic
If you are worried about things like that happening, check out Eric Holloway’s Could AI think like a human, given infinite resources? Given that the human mind is a halting oracle, the answer is no.
Also: Generalized Information. (Jonathan Bartlett) In machine learning, the Solomonoff induction helps us decide how successful a generalization is.
Why it’s so hard to reform peer review. Robert J. Marks: Reformers are battling numerical laws that govern how incentives work. Know your enemy! (a discussion of Goodhart’s Law in higher education)