ZeldaLabs

Lab Note | March 2026 | 7 min read

Synthetic consensus is not real consensus

Why averaging agent opinions creates a dangerous illusion of agreement, and how ZeldaLabs designs its simulation outputs to preserve the structure of disagreement.

The Aggregation Illusion

When you average the outputs of a thousand agents, you get a number. That number feels authoritative. It has the weight of aggregation behind it. A thousand voices distilled into a single metric. But averaging opinions is not the same as building consensus, and the difference matters enormously for anyone using simulation outputs to make real decisions.

Consider a concrete example. You simulate 1,000 personas responding to a proposed healthcare policy. The mean approval score is 6.2 out of 10. This looks like moderate support. But the underlying distribution is bimodal: 45% of agents scored 8 or above, 40% scored 3 or below, and only 15% fell in the middle range. The mean describes a position that almost no individual agent actually holds. Reporting it as 'the result' would be actively misleading.
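The effect is easy to demonstrate. Below is a minimal sketch in Python that builds a synthetic score distribution roughly matching the one described above (the exact score values are illustrative, chosen to approximate the 45/40/15 split) and shows how little of the population the mean actually describes:

```python
import statistics

# Illustrative scores approximating the distribution above:
# 45% of 1,000 agents score high (9), 40% score low (3), 15% in the middle (6).
scores = [9] * 450 + [3] * 400 + [6] * 150

mean = statistics.mean(scores)
high = sum(s >= 8 for s in scores) / len(scores)
low = sum(s <= 3 for s in scores) / len(scores)
near_mean = sum(abs(s - mean) <= 1 for s in scores) / len(scores)

print(f"mean approval: {mean:.2f}")                 # reads as moderate support
print(f"scored 8+: {high:.0%}   scored <=3: {low:.0%}")
print(f"within 1 point of the mean: {near_mean:.0%}")
```

Only the small middle group sits anywhere near the mean; the two dominant clusters are nowhere close to it.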

What Real Consensus Requires

Real consensus is a social process, not a mathematical operation. It involves negotiation, compromise, persuasion, and often the active suppression or accommodation of dissent. It takes time. It requires agents to actually engage with opposing arguments, update their positions, and arrive at agreements that account for minority concerns. The philosopher Jürgen Habermas described ideal consensus as the outcome of 'uncoerced discourse' in which all participants have equal voice.

Synthetic consensus, the mean of agent outputs, skips this process entirely. It takes a snapshot of independent opinions, averages them, and presents the result as if agreement had been reached. No negotiation occurred. No compromises were made. No minority concerns were addressed. The number exists in a social vacuum.

Three Properties of Real Consensus

1. Specificity

Real consensus is specific. It is not 'we generally agree on the direction.' It is 'we agree on these exact terms, with these exact conditions, and these exact carve-outs for these exact concerns.' An average score of 6.2 has no specificity. It tells you nothing about what people actually agree on.

2. Residual Disagreement

Real consensus coexists with documented residual disagreement. The people who did not get their preferred outcome know exactly what they conceded and why. An average erases this information entirely. From a mean score, you cannot recover which groups compromised, which held firm, or which issues were traded against each other.

3. Cost

Real consensus has a cost. It takes time, requires concessions, and sometimes fails entirely. The ease of computing an average is itself a warning sign. If consensus was easy to reach, it probably was not real consensus. In our simulations, scenarios where agents converge quickly and smoothly are more likely to reflect model-level biases than genuine agreement.

How LLM Biases Manufacture False Consensus

Language models have a well-documented tendency toward sycophancy and resolution. They want to agree. They want to find common ground. They want to resolve tension. This is a feature for a helpful assistant. It is a catastrophic bias for a social simulation. When LLM-backed agents debate, the models' resolution bias pushes agents toward convergence even when their assigned psychometric profiles would predict sustained disagreement.

At ZeldaLabs, we measured this directly. In a controlled experiment, we ran the same policy debate with and without anti-convergence mechanisms. Without intervention, 78% of agent pairs reached apparent agreement within three rounds. With our divergence preservation protocols (which penalize premature convergence and reward position maintenance consistent with psychometric profiles), only 34% converged. The difference represents the model's resolution bias, not genuine agreement.
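To make the idea of a divergence-preservation penalty concrete, here is a minimal sketch. Everything in it is a simplifying assumption for illustration, not the ZeldaLabs implementation: the `Agent` structure, the threshold values, and the penalty arithmetic are all hypothetical. The sketch penalizes two things the text describes: agreement that arrives before a minimum number of debate rounds, and stated positions that drift too far from what an agent's psychometric profile predicts.

```python
from dataclasses import dataclass

@dataclass
class Agent:
    profile_stance: float  # stance predicted by the psychometric profile, 0-10
    stated_stance: float   # stance expressed in the current debate round

def convergence_penalty(a: Agent, b: Agent, round_num: int,
                        min_rounds: int = 3, drift_tol: float = 1.5) -> float:
    """Return a penalty for agreement that arrives too early, or for
    stances inconsistent with the agents' assigned profiles."""
    gap = abs(a.stated_stance - b.stated_stance)
    penalty = 0.0
    # Premature convergence: near-identical stances before min_rounds.
    if round_num < min_rounds and gap < 1.0:
        penalty += (min_rounds - round_num) * (1.0 - gap)
    # Profile drift: each agent pays for straying from its profile.
    for agent in (a, b):
        drift = abs(agent.stated_stance - agent.profile_stance)
        if drift > drift_tol:
            penalty += drift - drift_tol
    return penalty
```

In a training or scoring loop, a penalty like this would be subtracted from each pair's reward, so that quick, profile-inconsistent agreement is discouraged while sustained, profile-consistent disagreement is not.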

ZeldaLabs' Approach to Reporting

Our platform explicitly resists the aggregation illusion. We report distributions, not averages. Every simulation output includes the full opinion distribution with identified clusters, not a single summary statistic. We surface minority positions explicitly, including their intensity and the reasoning behind them. We measure the structure of disagreement: where fault lines exist, which groups are furthest apart, and which issues are most divisive.
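As a toy illustration of distribution-first reporting, the sketch below groups integer approval scores into contiguous clusters and reports each cluster's range and share instead of a single summary statistic. The gap-based clustering heuristic is an assumption made for brevity, not ZeldaLabs' actual clustering method:

```python
from collections import Counter

def report(scores: list[int]) -> dict:
    """Report the full histogram plus contiguous score clusters,
    rather than collapsing everything into one mean."""
    hist = Counter(scores)
    occupied = sorted(hist)
    # Split occupied score bins wherever there is a gap of more than 1.
    clusters, current = [], [occupied[0]]
    for s in occupied[1:]:
        if s - current[-1] <= 1:
            current.append(s)
        else:
            clusters.append(current)
            current = [s]
    clusters.append(current)
    n = len(scores)
    return {
        "distribution": dict(sorted(hist.items())),
        "clusters": [
            {"range": (c[0], c[-1]),
             "share": sum(hist[s] for s in c) / n}
            for c in clusters
        ],
    }
```

Run on a bimodal sample, the output makes the two opposing blocs visible where a mean would hide them.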

We distinguish between scenarios where consensus could plausibly form (overlapping distributions with bridgeable differences) and scenarios where it cannot (bimodal or multimodal distributions with no common ground). We report the conditions under which agreement might emerge: what concessions would be required, which groups would need to compromise, and what the cost of that compromise would be.
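The bridgeable-versus-unbridgeable distinction can be sketched as a simple range-overlap check. This heuristic and its tolerance value are illustrative assumptions, not the production logic: it treats two opinion clusters as potentially bridgeable if their score ranges overlap or sit within a small tolerance of each other.

```python
def bridgeable(cluster_a: tuple[float, float],
               cluster_b: tuple[float, float],
               tolerance: float = 2.0) -> bool:
    """Each cluster is a (low, high) score range. Returns True if the
    ranges overlap or lie within `tolerance` points of each other."""
    lo_a, hi_a = cluster_a
    lo_b, hi_b = cluster_b
    gap = max(lo_a, lo_b) - min(hi_a, hi_b)  # negative when ranges overlap
    return gap <= tolerance
```

A bimodal electorate with clusters at (2, 3) and (8, 9) fails this check; overlapping clusters at (4, 6) and (5, 8) pass it, signaling that consensus could plausibly form given some concessions.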

The most dangerous output of a simulation is a false sense of certainty. A single number that suggests agreement where none exists. A mean that masks a bimodal distribution. A convergence that reflects model bias rather than genuine persuasion. ZeldaLabs builds its reporting infrastructure around one principle: show the user the full landscape of opinion, including its conflicts, its extremes, and its unresolved tensions. Because the shape of disagreement is almost always more informative than the illusion of consensus.