June was the busiest month of the academic year for New York State high school teachers and their students. In addition to getting their students to hand in any last-minute assignments, NYS high school teachers had to make sure that their students were fully prepared to take required standardized tests, called Regents examinations, commissioned by the NYS Education Department.

These exams are critical for most students who entered a New York high school in 2008 or beyond, as graduation is contingent upon passing (65% or above) all five of the required Regents exams.   However, now, due to recent legislation passed by Governor Cuomo, students won’t be the only ones who will suffer the negative repercussions of failing a Regents exam.  

Starting in September 2011, student performance on these tests will heavily affect the job security of many teachers, ultimately resulting in strict adherence to a curriculum that leaves little room for creative exploration and the development of critical thinking. I don't want to be misunderstood: I do think teachers should be evaluated rigorously. However, we must take care not to use faulty methods when doing so, as the consequences can be dire.

For the sake of full disclosure, I would like to make it known that I am married to a NYS public high school teacher who will be directly affected by this legislation. However, my views, which are based on various concepts in educational philosophy and student learning, would remain the same independent of my personal relationship. Furthermore, I am a mother who is relying on the NYS education system to provide her children with an excellent education. As such, I feel it necessary to become personally involved in how New York intends to ensure that outstanding teachers do not slip through the cracks disguised by red tape.

Here, I will analyze the recent changes in the NYS legislation affecting teacher evaluations and present a case for why Regents exams and standardized test scores in general are not good indicators of teacher performance.

Leading up to the new teacher evaluation system in NY State

You see, we know what's possible from our children when reform isn't just a top-down mandate, but the work of local teachers and principals, school boards and communities. Take a school like Bruce Randolph in Denver. Three years ago, it was rated one of the worst schools in Colorado — located on turf between two rival gangs. But last May, 97 percent of the seniors received their diploma. Most will be the first in their families to go to college. And after the first year of the school's transformation, the principal who made it possible wiped away tears when a student said, "Thank you, Ms. Waters, for showing that we are smart and we can make it." That's what good schools can do, and we want good schools all across the country.

These were the words spoken by President Obama in his January 25th, 2011 State of the Union Address in support of his educational progress program, Race to the Top. A major part of this initiative was a competition whereby several states, upon successfully proposing "ambitious yet achievable plans for implementing coherent, compelling, and comprehensive education reform," were granted a substantial award to help revamp their education system.  

After submitting plans based around four key components, including putting a greater emphasis on standardized test scores when evaluating teacher performance and increasing the number of charter schools allowed, New York placed second in the national competition (behind Massachusetts), capturing nearly $700 million for education reform. It is thought that the securing of these funds was, at least in part, a result of an unprecedented collaboration between the NY State Education Department and New York State United Teachers (NYSUT), and resulted in the formation of a 63-member Regents Task Force, composed of teachers, administrators, union leaders and school board members.

In their April 4th, 2011 report (PDF) presented to the Board of Regents, the Regents Task Force states:

As representatives of our stakeholder groups, our one unifying purpose is – to develop a comprehensive teacher and principal evaluation system that will improve teaching practice to advance learning for all students… This new system will be a comprehensive restructuring of how teachers and principals are evaluated and New York State is leading the way. It is all new, and there is no existing, comparable system that can provide a blue-print for us to follow. What we do know, and all stakeholders share, is the understanding that the new system must be fair, transparent and result in meaningful evaluations for teachers and principals. It must be comprehensible to those being evaluated and also to the public.

The fact that almost all teachers receive a "satisfactory" grade or greater has made it clear that the current NYS teacher evaluation system is ineffective. After several rounds of deliberation, it was agreed that the metrics for evaluation would be broken down as follows: 20% on standardized tests, including Regents exams; 20% on locally selected measures of student achievement; and 60% on other measures of teacher and principal effectiveness. However, the rights for collective bargaining were still maintained.

Many individual NYSUT members have openly expressed their dissatisfaction with the inclusion of standardized test scores in evaluations, as they have always been strictly a measure of student assessment. An even bigger surprise was the recent eleventh-hour addition to the legislation by Governor Andrew Cuomo, passed by a margin of 14-3, which would allow standardized test scores to account for up to 40% of teacher evaluations.  

This has obviously sparked outrage among teachers and members of NYSUT, which calls it a "Breach of Trust," and has resulted in a NYSUT-mediated suspension of collaboration with the NYS Education Department. Upping the ante when it comes to the Regents scores and teacher evaluation encompassed in this new legislation is highly charged and incredibly controversial. There are many factors outside of the classroom that can affect test scores and many argue that it is impossible to accurately quantify an individual teacher’s contribution to student learning through the use of exam grades. But desperate times call for desperate measures, right? In the end, it seems like this move will do more harm than good.

The case against value-added models and why Regents exam scores are not a clear indicator of teacher performance

The 2002 implementation of the No Child Left Behind Act (NCLBA) by the George W. Bush administration, which put a greater emphasis on student performance as a measure of school success, has essentially set the stage for adopting what many believe to be the gold standard for teacher evaluations: value-added models (VAM). A hot topic in education, VAMs take the NCLBA a step further by transferring accountability to teachers and principals. In short, VAMs are thought to provide an estimation of individual teacher contribution to student success (or lack thereof).

While this sounds good in theory, we should be very cautious when it comes to relying heavily on VAMs, especially those based on test scores, for teacher assessment. For instance, the general behavior, level of parental involvement, sleep habits, nutritional status, and other components relating to socioeconomic status of students in a class – factors that are difficult or even impossible to quantify – can vary from year to year, and could affect overall test scores. Furthermore, there is no single accepted calculation for determining the contribution of teachers using student test scores, and different statistical models will yield different results – how do we know which one is right and which one is less right?

In a recent report written by Sean Corcoran, Associate Professor of Educational Economics at NYU, VAMs are scrutinized. Similar to the controversial social study performed by the LA Times to evaluate teacher performance in the LA area, Corcoran used VAM methods to assess the effectiveness of teachers in NYC and Houston. Yes, he was able to generate data; however, the margin of error was so large that it was impossible to tell if the differences were actually real. He concludes that "value-added assessment of teacher effectiveness has great potential to improve instruction and, ultimately, student achievement… However, the promise that value-added systems can provide such a precise, meaningful, and comprehensive picture is not supported by the data."

Corcoran wasn’t the only one to claim that VAM falls short of providing an accurate measure of teacher effectiveness. In 2009, Jesse Rothstein, an economist at Princeton University’s Woodrow Wilson School of Public and International Affairs, published a report in the Quarterly Journal of Economics (summary), which examines the disparity between teachers who generally work with "gifted" students and teachers who specialize in working with children with special needs. He suggests that VAMs will reward those working with advanced students while penalizing those who work with children who struggle in the classroom.

However, a study by Cory Koedel and Julian R. Betts from UCSD's National Bureau of Economic Research suggests that more complex VAMs can help to overcome these obvious biases. Yet even more recently, Rothstein reviewed the Bill & Melinda Gates Foundation’s Measures of Effective Teaching (MET) Project, which supports VAM for evaluating teachers, for the University of Colorado at Boulder think tank National Education Policy Center (NEPC). Again, Rothstein finds fault with the VAM system, stating that “the correlations between value-added scores on state and alternative assessments are so small that they cast serious doubt on the entire value-added enterprise.”

As just discussed, VAMs – which are generally based on student performance over time – have obvious drawbacks. So why are Regents exams, which are merely a snapshot of student progress, thought to be a clear indicator of teacher effectiveness? Furthermore, how can legislators be so sure that their proposed statistical modeling will drown out potential biases?  

According to Roger Tilles, a member of the New York State Board of Regents, the new legislation in NY that allows up to 40% of teacher evaluations to be based on Regents exam scores is neither fair nor effective. In his commentary published on the Washington Post education blog "The Answer Sheet," he illustrates this unfairness by imagining the same use of VAM in other professions:

The proposal under consideration applies these test-based value-added techniques to teacher evaluations. If these value-added techniques were applied to other professions as they are being applied to teachers, it would mean that dentists would be evaluated not on their skills but only on how many cavities a dentist’s patients gets in a year or with a doctor on how many times his patients get sick in a year. Similarly, police are not evaluated on the number of crimes committed on their beat, nor fire personnel on number of fires in their jurisdiction. We would all acknowledge that such rating systems are at best incomplete.

And what about the effort shown by the students? While there are many children who are inherently dedicated to school, there are a great number who couldn't care less. Why should a teacher be penalized, despite intense effort to engage, for students who do not prepare for their Regents exam (and get a grade that reflects their level of preparedness)?

If you think that education is a one-way street, I will have to disagree with you. Learning does not occur through osmosis; students need to do their part and it is unfortunate that unmotivated kids will be armed with the potential to negatively affect their teacher’s job security. Also, there are many students who just don’t do well on tests, but have learned nonetheless.

Yes, there are teachers whose methods are less effective than others and, like in other professions, those who are good at what they do should advance, and those who display subpar performance should not. However, relying on the Regents exam as a proxy for teaching ability will do more than hurt our teachers. We must also consider what will happen to the students.  

The implications for students

The high-stakes consequences associated with giving significant weight to Regents exam scores when evaluating teachers will quickly trickle down into the classroom, in more ways than one. This will no doubt turn the classroom into a pressure cooker and some teachers may respond by focusing on the students who are likely to yield the highest results on the Regents exam, leaving borderline and struggling students to fall by the wayside. Also, some feel that increasing the competitive nature of the teaching profession will significantly reduce collaboration among teachers within the same department.  

But perhaps the biggest criticism of these measures is that they will force teachers to "teach to the test." It is already difficult for NYS high school teachers to introduce subject areas that do not fall within the Regents curriculum, let alone approach certain topics in a more hands-on but more time-consuming fashion. By putting a greater emphasis on standardized tests, we are completely snuffing out creative thinking and the ability to problem solve. And that is not a good thing.

The ultimate goal of putting a child through school is to prepare him or her for entering the "real world." Yes, knowing basic concepts in math, science, history, and English is important when laying an educational foundation (as are foreign language, art, and music, but there are no Regents exams for those subjects). But let’s be honest with ourselves – being able to recite passages from the Federalist Papers is probably not the most useful piece of knowledge; however, being able to interpret these documents – or any other body of intelligent text for that matter – in a way that is both meaningful and relevant is an absolute requirement to be successful.

By emphasizing Regents exams and standardized tests in general, we will be creating a workforce of robots.   Although Heather Wilson was speaking about American universities in her article "Our Superficial Scholars," her sentiments are no less applicable here:

...high-achieving students seem less able to grapple with issues that require them to think across disciplines or reflect on difficult questions about what matters and why.

There is no denying that our educational system is broken and it is our future citizens who are left to suffer the consequences. Why would we want to perpetuate this situation by pushing teachers, principals, and all others involved to do whatever it takes to get the highest possible number on tests? This tactic does nothing but add fuel to the fires of corruption.

Just as there is no magic pill that will make us lose those dreaded extra pounds, fixing our schools so that the curricula are designed with our children in mind instead of satisfying some political agenda will take a great deal of work. It’s high time we stop using teachers as the scapegoats and explore other models of education. As Roger Tilles puts it, "Adding teacher evaluation to this spiral, will accentuate the decline of an already reeling system."

Knowing the correct answer to a question shouldn’t always be more important than understanding why the answer matters and how we arrived at it. Arming our children with the ability to ask questions and come up with novel solutions is as important an investment as our 401(k)s (if not more). Heavily weighting Regents examination scores during teacher evaluation is not the answer to our educational problems in NY and will do nothing short of derail any progress that has been made to fix these grave issues.

If there was ever a need for creative thinking, it is now. If our children are to move forward, we need to remove these useless obstacles from the path of learning – this move will not only help those in NY but will also help to contribute to the ever-growing need for global economic growth. I don’t want my kids to be bogged down by test scores. Instead, I want teachers to influence them in ways that drive them to explore their passions. Anything less would be unjust.

About the Author: Jeanne Garbarino is a mother of two young girls, aged 2 and 4. In her other (easier) gig, Jeanne is a postdoc at Rockefeller University in the Laboratory of Biochemical Genetics and Metabolism. There, she studies how cholesterol moves inside of our cells and relates this information to human health and the development of cardiovascular disease. In addition to being a scientific researcher, Jeanne is a self-proclaimed scientist-communicator, often blogging about relevant scientific issues on her blog The Mother Geek, as well as co-organizing a monthly science discussion series, Science Online NYC (#SoNYC), which is open to anyone who is interested in how science is conducted. You can find her tweeting as @JeanneGarb or can follow The Mother Geek on Facebook.

The views expressed are those of the author and are not necessarily those of Scientific American.