Psychology Itself Is Under Scrutiny
Posted July 16, 2018 12:05 p.m. EDT
The urge to pull down statues extends well beyond the public squares of nations in turmoil. Lately it has been stirring the air in some corners of science, particularly psychology.
In recent months, researchers and some journalists have strung cables around the necks of at least three monuments of the modern psychological canon:
— The famous Stanford Prison Experiment, which found that people playacting as guards quickly exhibited uncharacteristic cruelty.
— The landmark marshmallow test, which found that young children who could delay gratification showed greater educational achievement years later than those who could not.
— And the lesser known but influential concept of ego depletion — the idea that willpower is like a muscle that can be built up but also tires.
The assaults on these studies are not all new. Each is a story in its own right, involving debates over methodology and statistical bias that have surfaced before in some form.
But since 2011, the psychology field has been giving itself an intensive background check, redoing more than 100 well-known studies. Often the original results cannot be reproduced, and the entire contentious process has been colored, inevitably, by generational change and charges of patriarchy.
“This is a phase of cleaning house and we’re finding that many things aren’t as robust as we thought,” said Brian Nosek, a professor of psychology at the University of Virginia, who has led the replication drive. “This is a reformation moment — to say let’s self-correct, and build on knowledge that we know is solid.”
Still, the study of human behavior will never be as clean as physics or cardiology — how could it be? — and psychology’s elaborate simulations are just that. At the same time, its findings are far more accessible and personally relevant to the public than those in most other scientific fields.
Psychology has millions of amateur theorists who test the findings against their own experience. The public’s judgments matter to the field, too.
It is one thing to frisk the studies appearing almost daily in journals that form the current back-and-forth of behavior research. It is somewhat different to call out experiments that became classics — and world-famous outside of psychology — because they dramatized something people recognized in themselves and in others.
They live in the common culture as powerful metaphors, explanations for aspects of our behavior that we sense are true and that are captured somehow in a laboratory mini-drama constructed by an inventive researcher, or research team.
The Stanford prison experiment is a case in point.
In the summer of 1971, Philip Zimbardo, a midcareer psychologist, recruited 24 college students through newspaper ads and randomly cast half of them as “prisoners” and half as “guards,” setting them up in a mock prison, complete with cells and uniforms. He had the simulation filmed.
After six days, Zimbardo called the experiment off, reporting that the “guards” began to assume their roles too well. They became abusive, some of them shockingly so.
Zimbardo published dispatches about the experiment in a couple of obscure journals. He provided a more complete report in an article he wrote in The New York Times, describing how cruel instincts could emerge spontaneously in ordinary people as a result of situational pressures and expectations.
That article and “Quiet Rage,” a documentary about the experiment, helped make Zimbardo a star in the field and a media favorite, most recently in the wake of the Abu Ghraib prison scandal in the early 2000s.
Perhaps the central challenge to the study’s claims is that its author coached the “guards” to be hard cases.
Is this coaching “not an overt invitation to be abusive in all sorts of psychological ways?” wrote Peter Gray, a psychologist at Boston College who decided to exclude any mention of the simulation from his popular introductory textbook.
“And, when the guards did behave in these ways and escalated that behavior, with Zimbardo watching and apparently (by his silence) approving, would that not have confirmed in the subjects’ minds that they were behaving as they should?”
Recent challenges have echoed Gray’s, and earlier this month Zimbardo was moved to post a response online.
“My instructions to the guards, as documented by recordings of guard orientation, were that they could not hit the prisoners but could create feelings of boredom, frustration, fear and a sense of powerlessness — that is, ‘we have total power of the situation and they have none,’” he wrote. “We did not give any formal or detailed instructions about how to be an effective guard.”
In an interview, Zimbardo said that the simulation was a “demonstration of what could happen” to some people influenced by powerful social roles and outside pressures, and that his critics had missed this point.
Which argument is more persuasive depends to some extent on where you sit and what you may think of Zimbardo. Is it better to describe his experiment, questions and all — or to ignore it entirely as not real psychology?
One psychologist who does not have to choose is David Baker, executive director of the Center for the History of Psychology at the University of Akron, which hosts the National Museum of Psychology.
“We put everything in that’s an important part of our history, including the controversy,” Baker said.
“To me, the target question of an experiment should be considered,” he added. “In this case, do social context and expectations significantly change behavior? And if so, when and how so?”
The issues surrounding the marshmallow studies and the ego depletion work are different, but land researchers in the same fundamental bind: Is this something, or is it nothing? Even younger psychologists who are eloquent partisans on the side of self-correction can be conflicted.
“With ego depletion especially, it seems like there’s some truth there — we have a subjective feeling of cognitive fatigue” after exercising self-control, said Katie Corker, an assistant professor of psychology at Grand Valley State University in Michigan.
A recent replication, rigorously done by one of the original authors, found evidence of an effect, but it was a small one, Corker said.
“Maybe we’re not studying it right, I don’t know. The better question may be, what does it take to kill off a big finding like this? Or, what should it take?”
Given modern ethics restrictions, mounting precise replications of old experiments is not always possible. The prison experiment would likely have to be seriously modified to pass institutional review.
The marshmallow test and ego depletion studies are fair game for further examination, and in those cases modifications may in fact clarify the picture. Some children do exhibit a streak of self-restraint early that seems to become central to their developing personality. What is the best way to measure that ability, or trait? What are its rewards over time, and its costs?
A more careful investigation of the “subjective cognitive fatigue” resulting from exercising self-control might help answer the latter question. It may also save ego depletion from being discarded prematurely as a useful scientific concept.
When Nosek published his first major replication paper in 2015, finding that about 60 percent of prominent studies did not pan out on a second try, it was a gift to skeptics eager to dismiss the entire field (and maybe all of social science) as a joke, a congregation of poorly anchored findings that shift in the wind, like nutrition advice.
It is not. On the contrary.
Housecleaning is a crucial corrective in science, and psychology has led by example. But in science, as in life, there is reason for care before dragging the big items to the curb.