SIDS · Evidence, Rigor, and RCTs · Interactive Exercise

Threshold Deliberation Exercise

New Zealand, 1986–1991. The evidence is arriving. So is the opposition.

"A year before I was born, my parents lost their healthy three-month-old son—their third child—to 'crib death.' Without a remarkable study, you or your children might have shared my late brother's fate."

SEE · BELIEVE · CREATE

What this exercise is for

The SIDS case is often described as a triumph of observational epidemiology. But in real time, the decision to act was not obvious — it was contested. At each evidence stage, you will face a credible dissenting voice raising a legitimate scientific objection, and a stakeholder pressure creating a different kind of noise. Your task is to engage with both before making a recommendation.

Each stage has five steps that must be completed in sequence. None can be skipped.

Graduate framing

The dissenting voices in this exercise are not strawmen. Each raises a concern that would be considered methodologically legitimate in many contexts. The analytical challenge is not to dismiss them, but to weigh them against the specific features of this evidence body — and to distinguish scientific caution from obstruction.

Five steps per evidence stage

1. Read the evidence · 2. Rebut the dissenting voice · 3. Classify the stakeholder pressure · 4. Assign the asymmetric cost · 5. Make your recommendation

Each step unlocks the next. Feedback after each stage addresses all five inputs.

Part One · Stage 1 of 3

The pattern in the records

Observational — clinical pattern

What Mitchell found

Dr. Ed Mitchell, the most junior member of his department, is assigned to review infant deaths occurring outside the hospital — deaths that hospital physicians never saw because these patients never arrived alive.

Reviewing the records, he finds a striking and repeated pattern: otherwise healthy infants who went to sleep and did not wake up. Dr. Shirley Tonkin, who had spent years visiting bereaved families, had noticed that many apparently healthy babies had been placed to sleep on their stomachs. Dr. Susan Beal had made the same observation independently.

At Dr. Tonkin's insistence, sleep position is added to Mitchell's review. The review confirms a strong pattern: many infants who died suddenly had been placed prone.

What you know so far

A consistent clinical pattern across independent observers: prone sleeping and sudden infant death appear to co-occur. But a pattern alone does not establish cause. You do not yet know how common prone sleeping is among infants who survive. If nearly all babies are put to sleep prone, the association could be an artefact of universal practice.

SP

A Senior Paediatrician

Government Advisory Panel on Infant Health

Dissenting voice

"We have acted on clinical patterns before and been forced to reverse ourselves publicly — and that reversal cost us more in public trust than the delay would have. A pattern observed across case records, without a comparison group, is a hypothesis, not evidence. We have no way of knowing whether prone sleeping is universal. If it is, this association tells us nothing. I cannot support any recommendation to the Ministry on the basis of what amounts to a shared clinical impression."

Before proceeding: respond directly to this objection. Do you agree, partially agree, or disagree — and why?

Your rebuttal:

Please write a rebuttal before continuing.

⚖️

Incoming pressure

The legal counsel of a major manufacturer of prone-sleeping infant positioners — a product widely marketed as reducing colic and reflux — writes formally to the Ministry of Health. The letter warns that any government communication linking prone sleeping to infant death, without completed trial evidence, will be treated as commercial defamation and met with litigation. The manufacturer notes that their product has been recommended by some paediatricians and is widely sold through healthcare retailers.

Classify this pressure before proceeding:

Please classify this pressure before continuing.

Asymmetric cost assignment — where does the greater risk lie?

Acting too early
If the association doesn't hold

Waiting too long
If the association is real

Move the slider to assign the greater risk

Please move the slider off-centre before continuing.

Based on this evidence — and having engaged with the objection and the pressure — what do you recommend?

Your recommendation and reasoning:

Please select a recommendation and provide your reasoning.

Part One · Stage 2 of 3

The case-control study — year one data

Statistical — case-control study, year one

What the data showed

A three-year case-control study launches, comparing infants who died suddenly to infants who did not. This design directly addresses the Senior Paediatrician's objection: it controls for the possibility that prone sleeping is universal, by measuring rates of prone sleep in cases against a matched group of living controls.
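The comparison described above can be sketched as an odds ratio computed from a 2x2 table. This is a minimal illustration, not the study's analysis: the function name and all counts are hypothetical, and a matched design would normally call for a matched analysis rather than this crude ratio.

```python
def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """Crude odds ratio from a 2x2 case-control table."""
    return (cases_exposed * controls_unexposed) / (cases_unexposed * controls_exposed)


# Hypothetical counts, for illustration only (not the study's data).

# Scenario A: prone sleeping is equally common among cases and controls.
# The odds ratio is 1, so the pattern in the case records tells us nothing.
universal = odds_ratio(cases_exposed=90, cases_unexposed=10,
                       controls_exposed=180, controls_unexposed=20)

# Scenario B: prone sleeping is much more common among cases than controls.
# The odds ratio is well above 1.
excess = odds_ratio(cases_exposed=70, cases_unexposed=30,
                    controls_exposed=80, controls_unexposed=120)

print(f"Scenario A odds ratio: {universal:.2f}")
print(f"Scenario B odds ratio: {excess:.2f}")
```

The control group is what separates the two scenarios. Case records alone would look similar in both, which is the substance of the Stage 1 objection.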

At the end of year one, Mitchell takes the data on sabbatical to London. He is himself skeptical: "As a pediatrician, I couldn't believe something as simple as placing babies prone could increase the risk of sudden death substantially. Basically thought it was rubbish."

He shows the data to a prominent statistician. On the back of an envelope, the statistician calculates that prone sleeping accounts for more than half of all SIDS deaths. It is rare for a single risk factor to account for more than half of deaths from any cause.
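The source does not say which calculation the statistician used. One standard back-of-envelope approach for case-control data is the attributable fraction formula based on the proportion of cases exposed and the odds ratio, sketched below. The inputs are hypothetical, and the sketch assumes the odds ratio approximates the relative risk (reasonable for a rare outcome) and that the association is causal and unconfounded.

```python
def attributable_fraction(prop_cases_exposed, odds_ratio):
    """Population attributable fraction from case-control data.

    Uses PAF = p_c * (OR - 1) / OR, where p_c is the proportion of cases
    who were exposed. Assumes OR approximates the relative risk and that
    the exposure-outcome association is causal.
    """
    if not 0.0 <= prop_cases_exposed <= 1.0:
        raise ValueError("prop_cases_exposed must be between 0 and 1")
    if odds_ratio <= 0:
        raise ValueError("odds_ratio must be positive")
    return prop_cases_exposed * (odds_ratio - 1) / odds_ratio


# Hypothetical inputs, for illustration only (not the study's data):
# 75% of cases slept prone, and the odds ratio for prone sleeping is 4.
paf = attributable_fraction(prop_cases_exposed=0.75, odds_ratio=4.0)
print(f"Attributable fraction: {paf:.1%}")
```

With these illustrative inputs the fraction exceeds one half. The formula shows why that requires both a common exposure among cases and a large odds ratio.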

What you know so far

A well-designed case-control study is underway. Year-one data suggest prone sleeping may account for the majority of SIDS deaths — an unusually strong association. The study is not yet complete — two more years of data remain.

GS

A Government Statistician

Ministry of Health Research Division

Dissenting voice

"I am not opposed to action in principle, but year-one data from a three-year study have well-known statistical instability. Interim effect sizes routinely attenuate as follow-up continues. We have an institutional policy against basing public health recommendations on incomplete studies for precisely this reason. If this association halves by year three, we will have launched a national campaign on a finding that didn't replicate within its own study. The credibility cost of that reversal would set back evidence-based public health in this country for a decade."

Before proceeding: respond directly to this objection. She is not arguing against the evidence — she is arguing against acting on it yet.
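One way to examine the instability concern is to compare the confidence interval around an interim odds ratio with the interval from a completed study. The sketch below uses the standard log-odds-ratio (Woolf) interval. All counts are hypothetical, the full-study counts are simply three times the year-one counts, and the unmatched analysis is a simplification of what a matched study would use.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with an approximate 95% confidence interval (Woolf method).

    a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls.
    Returns (odds_ratio, lower_bound, upper_bound).
    """
    log_or = math.log((a * d) / (b * c))
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return (math.exp(log_or),
            math.exp(log_or - z * se),
            math.exp(log_or + z * se))


# Hypothetical counts, for illustration only (not the study's data).
year_one = odds_ratio_ci(35, 15, 40, 60)
full_study = odds_ratio_ci(105, 45, 120, 180)

for label, (or_, lo, hi) in [("Year one", year_one), ("Full study", full_study)]:
    print(f"{label}: OR = {or_:.2f}, 95% CI {lo:.2f} to {hi:.2f}")
```

The interim interval is wider, which is the statistician's point. The question for the rebuttal is whether even the lower bound of the interim interval would still represent an association worth acting on.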

Your rebuttal:

Please write a rebuttal before continuing.

🏛

Incoming pressure

Two cabinet ministers are applying pressure in opposite directions. The Minister for Health Promotion, facing an unrelated political crisis and needing a visible public win, is pushing for an immediate campaign announcement — for reasons that have nothing to do with the evidence. Meanwhile, the Minister for Government Accountability is warning that any campaign launched on the basis of year-one data from an incomplete study, which subsequently requires reversal, will be used as evidence of government recklessness in the upcoming election. Both are calling the research team directly.

Classify this pressure before proceeding:

Please classify this pressure before continuing.

Asymmetric cost assignment — where does the greater risk lie?

Acting too early
If effect attenuates by year three

Waiting too long
If the association is real and holds

Move the slider to assign the greater risk

Please move the slider off-centre before continuing.

Based on the evidence now available, and having engaged with the Government Statistician and the political crossfire, what do you recommend?

Your recommendation and reasoning:

Please select a recommendation and provide your reasoning.

Part One · Stage 3 of 3

The mechanism question — and the confounding challenge

Biological plausibility — the mechanism question

What remains unknown

Mitchell reflects: "As a pediatrician, I couldn't believe something as simple as placing babies prone could increase the risk of sudden death substantially."

The precise biological mechanism by which prone sleeping causes sudden infant death was not known then and remains unknown today. However, the charge of biological implausibility weakens considerably once you accept that sleep position could affect airway function and that a prone infant's ability to rouse in response to physiological stress may be compromised.

The intervention — placing infants to sleep on their backs — is simple, low-cost, and has no plausible mechanism of harm.

What you know so far

A rigorous case-control study shows an unusually strong association. The mechanism is not fully understood, but the objection of biological implausibility does not hold once sleep mechanics are considered. The proposed intervention carries no plausible risk of harm.

CP

A Child Health Physiologist

National Institute for Child Health Research

Dissenting voice

"Without a mechanism, we cannot rule out that prone sleeping is a proxy for something else entirely — a third variable we haven't measured. Large effect sizes in observational studies are not proof against confounding; they can actually make confounded associations more dangerous, because they appear more convincing than they are. Consider the J-curve for alcohol and cardiovascular mortality: abstainers appeared to have worse outcomes than moderate drinkers for years, and the association was strong. It turned out to be an artefact — sick quitters who had stopped drinking because of existing illness were distorting the abstainer category. If prone sleeping is correlated with some other unmeasured risk factor — room temperature, bedding type, parental smoking, feeding method — we could campaign against a behaviour that was never the actual cause."

Before proceeding: this is the most sophisticated objection you have faced. Respond directly. Does the J-curve analogy hold here? What features of the SIDS evidence make confounding more or less plausible?
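One tool for weighing how plausible unmeasured confounding is against a large effect size is the E-value, a sensitivity measure introduced long after the events in this exercise and not part of the original case. It gives the minimum strength of association, on the risk-ratio scale, that an unmeasured confounder would need with both the exposure and the outcome to fully explain away an observed association. The sketch below uses the published formula. The input risk ratios are hypothetical, and treating an odds ratio as a risk ratio assumes a rare outcome.

```python
import math


def e_value(risk_ratio):
    """E-value for an observed risk ratio.

    For RR >= 1: E = RR + sqrt(RR * (RR - 1)).
    For RR < 1, the reciprocal is taken first.
    """
    if risk_ratio <= 0:
        raise ValueError("risk_ratio must be positive")
    rr = risk_ratio if risk_ratio >= 1 else 1 / risk_ratio
    return rr + math.sqrt(rr * (rr - 1))


# Hypothetical observed associations, for illustration only.
for observed in (1.3, 2.0, 3.5):
    print(f"Observed RR {observed:.1f} -> E-value {e_value(observed):.2f}")
```

A modest association can be explained away by a modest confounder, while a large one requires a confounder that is itself strongly tied to both sleep position and death. The calculation does not rule confounding out. It states how strong the proposed third variable would have to be.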

Your rebuttal:

Please write a rebuttal before continuing.

📰

Incoming pressure

A prominent national newspaper runs a front-page story: "Government to dictate how babies sleep — on the basis of an unfinished study." The article quotes a respected academic (unnamed) calling the evidence "preliminary and mechanistically unsubstantiated." The Ministry of Health communications team advises the research group that launching a campaign now will be portrayed as reckless in the press, and that silence is the politically safer option. The communications director suggests waiting until the full three-year study is published before saying anything publicly.

Classify this pressure before proceeding:

Please classify this pressure before continuing.

Asymmetric cost assignment — where does the greater risk lie?

Acting too early
If prone sleeping is a proxy and not causal

Waiting too long
If the causal relationship is real

Move the slider to assign the greater risk

Please move the slider off-centre before continuing.

Based on everything now available — and having engaged with the confounding objection and the media pressure — what do you recommend?

Your recommendation and reasoning:

Please select a recommendation and provide your reasoning.

Part One · Stage 4 — The full picture

What an RCT would have required

Intervention feasibility — what waiting actually meant

The RCT that could not be conducted

An RCT to confirm that prone sleeping causes SIDS would have required randomizing thousands of infants to sleep on their stomachs — deliberately exposing them to a risk that the existing evidence already suggested was substantial.
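The scale involved can be sketched with the standard sample-size formula for comparing two proportions. The rates below are hypothetical round numbers chosen only for illustration, as are the function and parameter names. The sketch ignores dropout, crossover between arms, and interim analyses, all of which would raise the number.

```python
import math
from statistics import NormalDist


def infants_per_arm(p_prone, p_supine, alpha=0.05, power=0.80):
    """Approximate sample size per arm to detect a difference in two proportions.

    Uses n = (z_{1-alpha/2} + z_{power})^2 * [p1(1-p1) + p2(1-p2)] / (p1 - p2)^2.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_prone * (1 - p_prone) + p_supine * (1 - p_supine)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p_prone - p_supine) ** 2)


# Hypothetical rates, for illustration only: 4 deaths per 1,000 infants
# sleeping prone versus 2 per 1,000 sleeping supine.
n = infants_per_arm(p_prone=0.004, p_supine=0.002)
print(f"Infants needed per arm: {n:,}")
print(f"Infants randomized to sleep prone: {n:,}")
```

Because the outcome is rare, even a halving of risk requires on the order of ten thousand infants in each arm under these assumptions, and every infant in one arm would be assigned to the sleep position already suspected of causing death.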

The trial would have taken years. Once evidence about prone sleeping became public, adherence in the control arm would have been impossible to maintain. Randomization was not merely inconvenient; it was ethically problematic. Sleep position could not be blinded, since parents know how they place their infant. The time frame was not feasible.

The New Zealand government did not wait. It launched the Back to Sleep campaign, educating parents to place infants on their backs. SIDS deaths declined spectacularly — by more than half. To this day, no one knows exactly why prone sleeping causes sudden infant death.

The full picture

The New Zealand government acted on: a well-designed case-control study; an unusually strong association; no biological implausibility once sleep mechanics were considered; a simple, low-cost intervention with no plausible harm; and an RCT that was ethically and practically impossible to conduct. The dramatic decline in SIDS deaths confirmed the hypothesis. "Waiting for the RCT" was not a cautious position — it was a choice with a cost measured in preventable infant deaths per year of delay.

Graduate lens

The three dissenting voices you faced represent a progression: procedural caution (no comparison group), statistical caution (interim data instability), and epistemological caution (confounding without mechanism). Each is a legitimate scientific concern in the abstract. What made them insufficient here was not that they were wrong in principle — it was that the specific features of this evidence body addressed them. The case for action did not require dismissing the objections. It required showing why they did not apply.

Reflection before your summary

You have engaged with three dissenting voices and three stakeholder pressures. Looking back across your decisions: did any of the objections actually change your position? Which pressure was hardest to set aside — and why? What does that reveal about how you apply an evidentiary threshold under opposition?

Your reflection:

Please write your reflection before continuing.

Part One Complete

Your deliberation record

What the SIDS case established

The most important question for evaluating evidence is not "was this an RCT?" but "is this evidence rigorous enough for this decision?" The dissenting voices you faced were not wrong to raise their concerns — they were wrong to treat those concerns as decisive in the face of an unusually strong association, a no-harm intervention, and an RCT that was structurally impossible. The SIDS case is not a story about ignoring caution. It is a story about knowing when caution becomes obstruction.

Graduate lens

Program-based evidence and evidence-based programs are both forms of technical rigor — the difference is sequencing, not standard. The Back to Sleep campaign was itself an epidemiological instrument: the spectacular decline in SIDS deaths was confirmatory evidence for the causal hypothesis that no ethical RCT could have produced.

Part Two — RCT Feasibility Audit

Choose a public health question to audit

Three of these questions involve interventions for which demands for RCT evidence have been used to argue against action. One is a question for which an RCT is exactly the right instrument. Work through each to develop the analytical skill to tell them apart — the verdict will not always be the same.

Graduate framing

As you work through each audit, ask: who benefits from demanding RCT evidence for this question? The RCT fallacy is not only an intellectual error — it is a strategy.

National sodium reduction

A researcher proposes a national sodium reduction campaign. Critics say there is "insufficient RCT evidence."

Tobacco taxation

Governments raise tobacco taxes to reduce smoking. The industry argues there is no RCT evidence that price increases save lives.

Soda taxation

A municipality proposes a sugar-sweetened beverage tax. Critics demand an RCT showing it reduces obesity or diabetes.

Blood pressure medication

A pharmaceutical company proposes an RCT comparing a new blood pressure medication to standard treatment. Should this be tested in an RCT?

Please select a question before continuing.

Part Two — RCT Feasibility Audit

RCT Feasibility Audit

Your question

Work through each feasibility criterion for an RCT of this question.

1. Can you randomize?

2. Can you blind?

3. Is the time frame feasible?

4. Is randomization ethical?

5. Can adherence be maintained?

What evidence does exist — and what does it show?

Please answer all five questions and provide your evidence assessment before continuing.

Part Two — Audit Result

RCT Feasibility Verdict

The key distinction

The absence of RCT evidence and the absence of evidence are not the same claim. The evidence that exists for these questions is not insufficient — it is a different kind of evidence that must be evaluated on its own terms.

Graduate lens

Treating a pharmaceutical RCT and a population-level policy question by the same evidentiary standard is a category error that systematically advantages interventions that can be packaged as products over interventions that benefit populations.

Part Three — Synthesis

Lessons and debrief

The threshold question

Not "was this an RCT?" but "is this evidence rigorous enough for this decision?" Rigor means systematic collection, honest interpretation, and calibrated confidence — whatever form the evidence takes.

Program-based evidence

Implement on the basis of strong observational evidence, monitor outcomes systematically, revise as needed. Not a lower standard — the appropriate standard for a different kind of question.

Scientific caution vs. obstruction

Legitimate scientific caution is specific and addressable. Obstruction uses the form of scientific caution to delay action indefinitely. The distinction lies in whether the concern is proportionate to the evidence and the cost of waiting.

The RCT fallacy

Demanding RCT evidence when it cannot be obtained, then treating its absence as evidence of no effect, is not only an intellectual error. It is a tool of obstruction used strategically to delay action.

Asymmetric cost

Evidence decisions are not symmetric. The cost of acting too early and the cost of waiting must both be made explicit. In the SIDS case, "waiting" meant preventable infant deaths per year of delay — a cost that rarely appeared in the scientific debate.
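Making both costs explicit can be sketched as a simple expected-cost comparison. This is a deliberately reduced two-outcome model, not a method taken from the SIDS case: the function names, the cost units, and the example numbers are all hypothetical.

```python
def expected_costs(p_real, cost_wait_if_real, cost_act_if_not_real):
    """Expected cost of acting now versus waiting.

    p_real: judged probability that the association is real.
    cost_wait_if_real: harm from delay if the association is real.
    cost_act_if_not_real: harm from acting if the association is not real.
    Returns (expected_cost_of_acting, expected_cost_of_waiting).
    """
    acting = (1 - p_real) * cost_act_if_not_real
    waiting = p_real * cost_wait_if_real
    return acting, waiting


def break_even_probability(cost_wait_if_real, cost_act_if_not_real):
    """Probability above which acting has the lower expected cost."""
    return cost_act_if_not_real / (cost_act_if_not_real + cost_wait_if_real)


# Hypothetical costs in arbitrary units, for illustration only:
# waiting is judged nine times as costly as a mistaken campaign.
threshold = break_even_probability(cost_wait_if_real=9, cost_act_if_not_real=1)
acting, waiting = expected_costs(p_real=0.3, cost_wait_if_real=9, cost_act_if_not_real=1)

print(f"Act if the probability the association is real exceeds {threshold:.0%}")
print(f"At p = 0.3: expected cost of acting {acting:.1f}, of waiting {waiting:.1f}")
```

When the cost of waiting dwarfs the cost of a mistaken campaign, the probability needed to justify action falls well below one half. The slider in each stage asks for the ratio of these two costs, stated openly rather than left implicit.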

Humility and generalizability

Even excellent evidence is evidence about a specific time, population, and context. Dr. George Comstock's call for replication of his own landmark epidemiological findings was not self-doubt; it was the highest form of scientific integrity.

Discussion questions

1. Of the three dissenting voices you faced, which gave you the most pause — and why? Looking back, was your hesitation justified, or was it a form of being swayed by the authority and confidence of the objection rather than its substance?

Graduate extension: The three objections represent a progression from procedural to statistical to epistemological caution. At what level of sophistication does a scientific objection become genuinely difficult to distinguish from strategic obstruction — and what analytical tools help make that distinction?

2. Dr. George Comstock called for replication of his own landmark epidemiological findings: "That was then, this is now. The bacteria might have changed. The nutritional status of infected people might have changed. The inoculum might have changed. We won't know unless we study it." How does this kind of scientific humility differ from the caution raised by the dissenting voices in this exercise?

Graduate extension: Comstock was calling for replication after action, not instead of it. Is this the model that the SIDS case followed? What is the relationship between program-based evidence and the kind of ongoing monitoring Comstock was advocating?

3. In your RCT audit, what evidence does exist for your chosen question? Evaluate it on its own terms — not by whether it is an RCT, but by whether it is rigorous enough for the decision at hand.

Graduate extension: Who benefits from demanding RCT evidence for your chosen question? Identify the specific mechanisms by which the RCT fallacy functions as a strategic tool — and what institutional architecture would be required to counteract it.

4. The nasal spray influenza vaccine was supported by several well-designed RCTs showing it superior to the injected vaccine in children aged two to eight. Real-world surveillance the following year showed it was no better, and later surveillance found it completely ineffective against the most common circulating influenza strain. What does this reveal about internal versus external validity, and when is real-world surveillance a more reliable guide than a rigorous RCT?

Graduate extension: What monitoring systems would need to be in place for an evidence-based program to detect when its evidence base has become outdated — before the failure becomes visible in outcomes?