
The Trolley Problem in Real Life

Vsauce · 5 min read

Based on Vsauce's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

Most participants did not pull the lever in a real-life trolley-problem simulation, contradicting the common survey answer of sacrificing one to save five.

Briefing

A real-world version of the trolley problem produced a result that clashes with the classic survey answer: when people believed they alone controlled a switch that could save five workers or sacrifice one, most did not pull the lever. Instead, many froze—often explaining afterward that they deferred responsibility to others, assumed the situation would be handled, or worried that acting would make things worse. The finding matters because it suggests moral “preferences” expressed in theory may not predict behavior under time pressure, fear, and uncertainty—exactly the conditions that autonomous-vehicle ethics and safety design must confront.

The experiment was built around Philippa Foot’s 1967 trolley problem: a runaway train threatens five people on one track, and a lever can divert it to a second track where one person would die. In polls, most people say they would pull the lever to save five. But the project aimed to test what happens when the dilemma is experienced as immediate, emotionally charged, and personally consequential rather than hypothetical.

To do that, the organizers sought ethical approval through an institutional review board, explicitly weighing psychological harm against potential social benefit. The team consulted behavioral neuroscience and emphasized safeguards: screening out participants vulnerable to traumatic reactions (for example, those showing signs of suicidal ideation or a history of acting out), providing an on-site trauma counselor, and conducting a structured debrief. The ethical groundwork was informed by the legacy of controversial experiments such as Stanley Milgram’s obedience studies at Yale, where participants experienced real distress even though no one was physically harmed.

The staged railroad scenario used an abandoned line, a hired freight train, and a remote switching station with monitors showing “live” feeds. Participants were recruited under a cover story (a focus group about California high-speed rail) and then trained on how the switch worked by an experienced railroad actor. During the crisis phase, the operator left the participant alone in the station while actors stood on the tracks—five on one side and one on the other—while repeated warnings signaled that an object or train was approaching. Participants believed they were controlling the outcome in real time; phones were collected to prevent outside help.

When the warning sequence played and the train approached, most participants did not pull the lever. Some moved toward action but stopped; others looked around for someone else to take charge, showed self-soothing behaviors, or treated the situation as something that would be resolved by technology or other workers. Elsa, for example, described feeling pressure and quickly deciding to switch the track to save more lives, but her choice was the exception rather than the rule. J.R. and others articulated a different logic: they “suspended responsibility” because they didn’t know what to do, assumed others would intervene, or feared making an irreversible mistake.

Afterward, participants were shown that no one was actually in danger—“everyone is safe” was displayed before any real impact could occur—and they met the actors during the debrief. Follow-up indicated most were doing well. The project concluded that behavior under moral stress can diverge sharply from what people claim they would do, and that freezing—common across animals—may be a default response when responsibility feels ambiguous. The central takeaway is not just that people hesitate, but that their explanations reveal how quickly “greater good” reasoning can collapse into uncertainty, attribution, and fear—conditions that any real deployment of moral decision-making systems will need to account for.

Cornell Notes

The classic trolley problem asks whether people would sacrifice one person to save five. In surveys, most say they would pull the lever. A real-life, ethics-approved experiment tested that choice under fear and time pressure by putting participants in a remote switching station with staged “live” train footage and actors on the tracks. Despite believing they alone controlled the switch, most participants did not pull the lever; many froze or looked for someone else to take responsibility. The results highlight a gap between moral intentions and actual behavior, with major implications for how autonomous systems and human training should handle high-stakes dilemmas.

Why did the experiment focus on “behavior under pressure” rather than just asking what people would do?

Classic trolley-problem surveys measure stated preferences, but the project targeted the moment of decision—when fear, uncertainty, and personal responsibility are present. Participants were led to believe they were controlling a real switch during an approaching-train crisis, so the study could observe whether people act on their stated “save five” preference or default to inaction (freezing).

How did the organizers handle ethics and psychological risk?

An institutional review board was central to approval. The team consulted experts about potential harms such as guilt-rumination and trauma-like re-experiencing. Safeguards included prescreening for vulnerability to traumatic reactions, providing an on-site trauma counselor, and running a debrief that clarified the staged nature of the scenario. The approach was shaped by awareness of past controversies in psychology, including Milgram’s obedience work at Yale, where participants experienced real distress.

What did participants actually experience during the crisis phase?

Participants learned how to switch tracks using a lever in the remote station, then were left alone when the crisis began. Monitors showed staged “live” feeds, and warnings like “Attention, train approaching” and “Attention, object on track” repeated as actors stood on the tracks (five on one side, one on the other). Phones were collected, and the participant believed they had to decide whether to divert the train.

What were the main reasons participants gave for not pulling the lever?

Many described freezing or deferring responsibility. Some assumed others would notice and act, or believed the train/technology would handle the situation. Others worried that touching the equipment could make things worse or that they didn’t know who should live and who should die. In several accounts, participants “suspended responsibility” because the correct action felt unclear in the moment.

How did the study’s debrief change what participants understood afterward?

Before any harm could occur, the setup displayed an “everyone is safe” card. Afterward, participants were told the scenario was an experiment and met the actors involved. This reinforced that the crisis was not real, and follow-up indicated participants were doing well—supporting the claim that the risk-management plan worked.

What does “freezing” mean in this context, and why is it central to the conclusion?

Freezing refers to paralysis or inaction when danger is imminent and responsibility is ambiguous. The study found that most participants froze rather than making the utilitarian choice. The broader implication is that moral reasoning may not override instinctive threat responses, so stated ethical preferences may not predict real actions in high-stakes emergencies.

Review Questions

  1. How did the experiment’s design (cover story, remote station, warnings, phone collection) aim to make the dilemma feel personally consequential?
  2. What specific psychological mechanisms did participants describe when explaining inaction (e.g., responsibility attribution, fear of error)?
  3. Why might the classic “save five” survey result fail to predict behavior in the real scenario?

Key Points

  1. Most participants did not pull the lever in a real-life trolley-problem simulation, contradicting the common survey answer of sacrificing one to save five.

  2. Ethical approval hinged on minimizing psychological harm through prescreening, on-site counseling, and a structured debrief that clarified the staged nature of the scenario.

  3. Participants believed they alone controlled the switch, creating time pressure and fear—conditions that exposed how quickly moral intentions can collapse into uncertainty.

  4. Many explanations for inaction involved deferring responsibility to others or assuming the situation would be handled by technology or other workers.

  5. The study found that freezing is a dominant response under threat when the “right” action is unclear, raising questions for how autonomous systems and human training should be designed.

  6. The experiment’s safeguards were informed by past controversies in psychology, including Milgram’s obedience research, where participant distress was real even without physical harm.

Highlights

Most participants did not pull the lever, even though they believed the choice was between saving five and sacrificing one.
Participants frequently described “suspending responsibility”—waiting for others to act or assuming the train/technology would intervene.
The project treated ethics as a design constraint, using prescreening, trauma counseling, and a debrief to manage psychological risk.
Meeting the actors afterward reinforced that the crisis was staged, and follow-up suggested participants were doing well.

Topics

  • Trolley Problem
  • Autonomous Vehicles
  • Moral Decision-Making
  • Psychological Ethics
  • Freezing Response
