Now that the hue and cry over the government’s proposed
changes to the way schools are funded has subsided, it may be a good time to
undertake a sober reflection on the issues, because they are bound to resurface
in the future.
First, common sense suggests that increasing class sizes is
detrimental to student performance. More students per class mean more harried
teachers and therefore less attention per student. The causality here is
clear.
But teacher quality certainly plays a crucial role as well.
So what the government is essentially proposing is a trade-off: larger classes
(bad) but better teachers (good).
The important question, however, is whether the additional
value created by good teachers counteracts the bad consequences of larger class
sizes. At this point the answer is not clear.
It is most likely that in supporting this change in policy
the government is drawing inspiration from a recently released study undertaken
by Raj Chetty and John Friedman of Harvard and Jonah Rockoff of Columbia. In
this large and comprehensive study, which has been cited approvingly by many,
including Barack Obama, the authors claim that students with high value-added
(HVA) teachers who raise their standardized test scores are “more likely to
attend college, earn higher salaries, live in better neighborhoods and save
more for retirement.”
But how reliable is this work? As Gary Gutting, professor of
philosophy at Notre Dame, writing for the New York Times, asks: how does this
compare with the work of biochemists on the effects of light on plant growth?
No one, for example, questions the validity of the physics on which our space
programs are based. But even the best-developed social sciences like economics
have nothing like this status. Since humans are much more complex than plants
and biochemists have far more refined techniques for studying plants, we may
well expect the biochemical work to be of greater validity.
Furthermore, when it comes to generating reliable scientific
knowledge, the most important issue is the ability to predict future events. And
the only way we can make such informed predictions about teacher impact would
be to run randomized controlled experiments. Suppose that you could randomly
assign some students to a teacher rated HVA, and other students to a teacher
rated low value-added (LVA). If the students are identical on average, you
could compare the test scores of each group. If the kids with the HVA teacher
do better, then we can draw reasonable conclusions about the teachers’ value
added.
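The logic of that comparison can be sketched in a few lines of code. This is a hedged illustration only: the score distributions, the 5-point gap, and the group size of 500 are invented assumptions, not figures from any study.

```python
import random

random.seed(42)

# Hypothetical test-score distributions: assume the HVA teacher's class
# averages 5 points higher; all numbers here are invented for illustration.
hva_scores = [random.gauss(75, 10) for _ in range(500)]
lva_scores = [random.gauss(70, 10) for _ in range(500)]

def mean(xs):
    return sum(xs) / len(xs)

# Because assignment is random, the two groups are identical on average,
# so the difference in mean scores estimates the teacher's value added.
value_added = mean(hva_scores) - mean(lva_scores)
print(round(value_added, 1))  # close to the assumed 5-point gap
```

The key point is the one in the comment: it is random assignment that licenses reading the difference in means as the teachers’ effect, rather than as a difference in the students.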
But such controlled experiments are nearly impossible for
obvious reasons. So instead Chetty and his colleagues looked at data from grades
three to eight for 2.5 million children in one of the largest school districts
in the USA over a 20-year period (1989-2009). They then used other public
records to track students after high school. This massive data set of nearly
20 million observations allowed them to find situations that were virtually
identical to the controlled experiment that I have described above.
But here is the next problem. Chetty and his colleagues do
all of this while holding class size constant. What happens when the class size
changes? Now the required controlled experiment becomes more complex. For one
thing, you would have to measure the impact of HVA and LVA teachers separately
for large classes and for small classes, and then measure whether the higher
value added by HVA teachers counteracts the drop in performance in larger
classes.
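The trade-off the government is proposing can be made concrete with a toy calculation. Every number below is an invented assumption for illustration, not an estimate from the Chetty study:

```python
# Hypothetical effect sizes in test-score points (all invented):
teacher_effect = {"HVA": 5.0, "LVA": 0.0}           # value added by teacher quality
class_size_penalty = {"small": 0.0, "large": -3.0}  # cost of a larger class

def expected_gain(teacher, size):
    # Assume the two effects simply add up; in reality they may interact,
    # which is exactly why the experiment would need all four combinations
    # (HVA/LVA crossed with large/small) to be measured separately.
    return teacher_effect[teacher] + class_size_penalty[size]

# The policy bet: an HVA teacher in a large class vs an LVA teacher in a small one.
print(expected_gain("HVA", "large"))  # 2.0
print(expected_gain("LVA", "small"))  # 0.0
```

Under these made-up numbers the trade is worth it; with a larger class-size penalty or a smaller teacher effect it would not be, and nothing in the existing evidence pins those numbers down.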
And, underlining the difficulty of making meaningful
predictions, the authors point out that while the impact of HVA teachers may be
significant, there are a whole host of other factors that affect performance
including relationships with parents and peers.
Furthermore, how much faith can or should we have in the
government’s ability to identify good teachers or a set of best practices in
the classroom? Teaching is a multi-faceted and complex activity that defies easy
quantification. If we measure teacher performance by improvements in students’
test scores, as the Chetty et al. study proposes, then does that not leave us
open to the possibility of teachers teaching to the test? Researchers at the
University of Chicago have also shown that when student test scores are the only
metric, then teachers are not immune to cheating by changing student answers on
standardized tests ex post.
In New Zealand we already have experience with the
government’s attempts to measure the quality of university staff. This is
called the Performance-Based Research Fund (PBRF) exercise. One would be hard
pressed to find a single academic in the whole country who thinks that the PBRF
actually does a good job of measuring quality, both because quality is an
extremely elusive concept and because university staff engage in a diverse
array of activities, including research, teaching and service. The PBRF is a
cumbersome and
costly (both in terms of money and time) process that has significantly
increased the administrative burden at universities. The pool of money
available for division remains unchanged; the universities are simply spending
ever more resources chasing that fixed sum.
Given the inherent constraints in social science research
and the difficulties in making meaningful predictions, it seems rash to
use the findings of such research to usher in sweeping changes to existing
policy. Certainly we should expect world-class research to inform our policy
decisions, but it cannot be a substitute for practical experience, empathy and
common sense.