ZeldaLabs

Lab Note | March 2026 | 7 min read

What happens when you give 1,000 agents moral foundations

Emergent ethical behavior in large-scale agent populations grounded in Haidt's Moral Foundations Theory, tested with ZeldaLabs' PersonaGen pipeline using 1,000 synthetic personas across five English-speaking countries.

The Experiment

Jonathan Haidt's Moral Foundations Theory identifies six dimensions along which human moral reasoning varies: Care/Harm, Fairness/Cheating, Loyalty/Betrayal, Authority/Subversion, Sanctity/Degradation, and Liberty/Oppression. These six foundations explain a remarkable amount of variance in political attitudes, policy preferences, and intergroup conflict. We asked a simple question: if we embed these foundations as weighted parameters in synthetic personas and scale to 1,000 concurrent agents, what emergent social structures appear?

ZeldaLabs generated 1,000 personas using the PersonaGen engine, drawing from demographic distributions across five English-speaking countries (UK, USA, Australia, Canada, New Zealand). Each persona received a full psychometric profile: Big Five personality traits, Schwartz value dimensions, attachment style, political orientation, religiosity, education level, and, crucially, a unique moral foundations weighting. These weightings were calibrated against published MFT survey norms to ensure population-level distributions matched real-world data.
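The per-persona weighting step can be sketched as independent draws per foundation, clipped to a fixed range. This is a minimal illustration, not the PersonaGen pipeline itself: the means and standard deviations below are hypothetical placeholders, not the published MFT survey norms the study calibrated against.

```python
import random

# Six MFT foundations; (mean, sd) values here are illustrative placeholders.
FOUNDATIONS = ["care", "fairness", "loyalty", "authority", "sanctity", "liberty"]
NORM_PARAMS = {f: (0.5, 0.15) for f in FOUNDATIONS}

def sample_foundation_weights(rng: random.Random) -> dict[str, float]:
    """Draw one persona's foundation weights, clipped to [0, 1]."""
    weights = {}
    for f in FOUNDATIONS:
        mean, sd = NORM_PARAMS[f]
        weights[f] = min(1.0, max(0.0, rng.gauss(mean, sd)))
    return weights

rng = random.Random(42)  # fixed seed for reproducibility
population = [sample_foundation_weights(rng) for _ in range(1000)]
```

In a calibrated pipeline the parameters would instead be fitted per country and demographic stratum so that population-level distributions match survey data.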

What Emerged

Without any explicit programming of group behavior, agents self-organized into moral communities. This was the first and most striking finding. Care-dominant agents formed cooperative clusters oriented around harm prevention and mutual aid. Authority-weighted agents created hierarchical structures with clear deference patterns. Liberty-focused agents resisted both, forming loosely connected networks that actively opposed coordination attempts from other clusters.

The clustering was not binary. Agents with mixed foundation profiles (high Care and high Authority, for example) formed bridge communities that mediated between clusters. These bridge agents played an outsized role in information propagation, a pattern consistent with Granovetter's weak-ties theory and documented in real social network research.
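One simple way to see how both core communities and bridge agents can fall out of the weight profiles is to assign each agent by its dominant foundation, treating near-ties as bridges. This is an illustrative assignment rule, not the community-detection method the study used; the `margin` parameter is an assumption.

```python
def dominant_foundation(weights: dict[str, float], margin: float = 0.1):
    """Assign an agent to a moral community by its strongest foundation.

    Agents whose top two foundations are within `margin` of each other are
    labeled 'bridge' agents between those two communities; otherwise they
    are 'core' members of their dominant foundation's cluster.
    """
    ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
    (top, v1), (second, v2) = ranked[0], ranked[1]
    if v1 - v2 < margin:
        return ("bridge", top, second)
    return ("core", top, None)
```

A graph-based alternative would cluster agents on pairwise similarity of full weight vectors, which would also surface mixed-profile agents as high-betweenness nodes.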

Moral Outrage as Emergent Phenomenon

We introduced policy scenarios that violated specific moral foundations. A scenario involving perceived unfairness triggered intense response from Fairness-weighted agents but was largely ignored by Authority-weighted agents. A scenario involving perceived disrespect to institutional norms triggered the reverse pattern. The intensity of response scaled with the degree of foundation violation, not linearly, but with a threshold effect. Below a certain violation intensity, agents engaged rationally. Above it, they shifted to emotional, coalition-mobilizing behavior.

This threshold effect has been documented in social psychology literature but has never been reproduced in synthetic populations. The fact that it emerged without explicit programming suggests that moral foundation weightings, combined with realistic cognitive architectures, are sufficient to generate the nonlinear dynamics observed in real moral outrage.
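The threshold dynamic described above can be sketched as a logistic gate between a low-gain "rational" regime and a high-gain "outrage" regime. All parameter values here (threshold location, steepness, baseline gain) are illustrative assumptions, not values fitted to the simulation data.

```python
import math

def response_intensity(violation: float, weight: float,
                       threshold: float = 0.6, steepness: float = 12.0) -> float:
    """Map a foundation-violation intensity in [0, 1] to a response level.

    Below `threshold`, response grows slowly (measured engagement); a
    logistic gate then switches the agent into a high-gain regime,
    producing the nonlinear jump rather than a linear ramp.
    """
    gate = 1.0 / (1.0 + math.exp(-steepness * (violation - threshold)))
    baseline = 0.2 * violation * weight   # sub-threshold, low-gain band
    outrage = violation * weight          # supra-threshold amplification
    return baseline + gate * (outrage - baseline)
```

With these placeholder parameters, tripling the violation intensity from 0.3 to 0.9 multiplies the response by far more than three, which is the qualitative signature of the threshold effect.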

Intensity Asymmetry

One of the most policy-relevant findings was intensity asymmetry. Agents whose moral foundations were violated held their positions with significantly higher conviction than agents who merely agreed with a policy. This mirrors real-world dynamics where opposition is often louder and more organized than support. In our simulation, foundation-violating scenarios generated 2.7x more inter-agent messages than foundation-affirming scenarios, even when the absolute number of agents on each side was roughly equal.

This has direct implications for policy simulation. If you only measure the mean opinion of a synthetic population, you miss the asymmetry in engagement intensity. A policy that 60% of agents support and 40% oppose can still fail in practice if the opposing 40% are 3x more motivated to act.
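The 60/40 example above can be made concrete with a motivation-weighted support share. This is a minimal sketch of the arithmetic, assuming a simple (supports, motivation) representation per agent rather than anything from the actual simulation.

```python
def engagement_weighted_support(agents: list[tuple[bool, float]]) -> tuple[float, float]:
    """Return (raw support share, motivation-weighted support share).

    Each agent is a (supports, motivation) pair; weighting by motivation
    shows how an intensity asymmetry can flip the effective balance.
    """
    raw = sum(1 for s, _ in agents if s) / len(agents)
    total = sum(m for _, m in agents)
    weighted = sum(m for s, m in agents if s) / total
    return raw, weighted

# 60 supporters at motivation 1.0 vs. 40 opponents at 3x motivation
agents = [(True, 1.0)] * 60 + [(False, 3.0)] * 40
raw, weighted = engagement_weighted_support(agents)
# raw = 0.60, but weighted = 60 / (60 + 120) = 0.33: the opposition dominates
```

A mean-opinion readout would report 60% support; the weighted share shows why that policy can still fail in practice.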

Internal Foundation Conflicts

Perhaps the most interesting finding was intra-agent conflict. Personas with high weightings on competing foundations (Liberty and Authority, or Care and Fairness in zero-sum scenarios) exhibited measurable internal tension. They took longer to form opinions, changed positions more frequently, and produced more nuanced reasoning. This mirrors the psychological concept of moral dilemmas: situations where deeply held values point in opposing directions.

In a population of 1,000 agents, 23% exhibited significant internal foundation conflicts on at least one policy scenario. These conflicted agents were disproportionately likely to be swing votes in group deliberation, consistent with the median voter theorem and real world observations about undecided voters.
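A crude way to flag such agents is to score internal tension as the strongest co-activation of a competing foundation pair. Both the pair list and the min() co-activation measure below are illustrative choices, not the metric used in the study.

```python
# Foundation pairs treated as competing, per the examples in the text.
COMPETING_PAIRS = [("liberty", "authority"), ("care", "fairness")]

def conflict_score(weights: dict[str, float],
                   pairs: list[tuple[str, str]] = COMPETING_PAIRS) -> float:
    """Internal tension as the strongest co-activation of a competing pair.

    min(a, b) is high only when BOTH competing foundations are heavily
    weighted; taking the max over pairs reports the worst tension.
    """
    return max(min(weights[a], weights[b]) for a, b in pairs)
```

Thresholding such a score over the population would be one way to operationalize the 23% "conflicted agent" figure, with scenario-specific pairs substituted for the fixed list here.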

Implications

Moral diversity alone, properly grounded in validated psychometric frameworks, is sufficient to generate realistic social complexity. You do not need to explicitly program polarization, outrage, or coalition dynamics. They emerge from the interaction of agents whose moral reasoning is genuinely heterogeneous. This validates ZeldaLabs' approach to synthetic human intelligence: ground the individual in real psychology, and the collective behavior takes care of itself.