The negation game is an experiment in a new type of collective intelligence that we call epistocracy. If you’d like to play an early prototype, you can find it here: play.negationgame.com
In this post I’m going to try to communicate the key ideas of how we’re trying to coax collective intelligence from individual contributions.
As brief motivation: we know that it’s possible to create systems of collective intelligence using individual incentives. Markets are one such mechanism, aggregating individual moves to achieve collective outcomes. We also know that markets aren’t sufficient on their own, as they don’t self-regulate (cf. externalities).
Perhaps we can do better.
In order to approach collective intelligence we’re going to steal a key idea from the philosophy of science: making statements about how you could find out that you’re wrong improves the credibility of your ideas.
The development of the General Theory of Relativity is frequently offered as a prototypical example of the way in which the falsifiability of a theory is an important part of evaluating it.
As the story goes, Einstein’s theory made some surprising predictions. Among them was the prediction that the curvature of spacetime around massive objects would cause gravitational lensing: light would literally appear to be coming from the “wrong spot” compared to the usual map of the locations of stars in our night sky.
According to the theory, even our sun should be capable of producing this sort of lensing effect. But it was too hard to tell just by looking at the sun, because its luminosity drowns out the stars around it (scientists call this effect “daytime”). However, it was speculated that during a solar eclipse it should be possible to photograph this lensing effect. So, in 1919 an astronomer named Arthur Eddington travelled to the island of Príncipe, off the west coast of Africa, to observe a solar eclipse and do just that.
Later that year the Royal Society of London accepted the observations as evidence of the veracity of Einstein’s theories.
This story has become a prototype of the scientific process: publish a theory, see what observations it predicts, and then check those predictions. It’s the story told by the philosopher Karl Popper to motivate his theory of what is and what is not science (the Problem of Demarcation). And it also manifests in a theory of intelligence and consciousness called the Bayesian Brain hypothesis, which posits that the brain maintains a model of the world, runs that model to make predictions about what it will observe, and updates the model when it encounters surprising observations — i.e. observations it did not predict.
It’s worth noting that many aspects of this story are simplified here to make a point, and are the subject of ongoing debate. For example, there’s some debate as to whether Eddington’s observations were actually valid, or whether the deflection only became clearly distinguishable in subsequent observations. Those debates don’t change the overall message of the story.
Here is where I’ll begin to opine. Often, when this story is retold, Eddington’s experiment is described as a “confirmation” of Einstein’s theory. As we build the negation game, the core perspective we’ll take is not that Eddington confirmed Einstein’s theory, but rather, at the risk of a mouthful, that Einstein’s theory offered surface area for invalidation, and then Eddington’s observations failed to invalidate the theory. Similarly, some accounts place emphasis on the importance of the “predictions” coming from Einstein’s theory, with some arguing that it’s not a scientific theory unless it makes predictions. Here, we instead say that Einstein’s theory was more precarious because the predictions it made were surprising, and that made the failure to invalidate all the more persuasive. These concepts of surface area for invalidation, failure to invalidate, and the principle of precarity (the more precarious a claim, the more persuasive its survival) will become key ideas as we describe the negation game.
And now, without further ado, let us begin to discuss the structure of the mechanisms in the negation game. The first stop on our journey is the mechanism of staking, shortly followed by epistemic leverage. For many readers, the concept of staking is likely to be familiar due to its role in securing the Ethereum chain using Proof of Stake. The essential idea behind Ethereum’s system is that the person who stakes gets a reward for doing so, but they can also lose some of their stake — called getting “slashed” — if they disagree with the consensus. The job of the person staking in this context is to predict which choice everyone else is going to make, so they get their reward and avoid the punishment. Since the decision they have to make is quite simple, it’s easy for everyone to stake accurately.
Similar to Proof of Stake, the objective of the negation game is to create an incentive landscape that can achieve a collective view by weighing the contributions of various players. Unfortunately, unlike the questions the Ethereum network has to answer (essentially, “in what order did I receive these messages?”), the questions that communities have to answer are plagued by ambiguity, and therefore automated slashing as a result of disagreeing with the consensus is impractical.
In its stead, the negation game reaches for voluntary slashing as the means of achieving consensus, and does so by making it rewarding for a player to slash themself. But why would someone slash their own stake, causing them to lose money? There are two kinds of incentives the system can give to make this self-slashing a desirable move:
additional influence — as a result of slashing oneself a player gains (or regains) influence
long term earnings — as a new consensus forms, players that slashed sooner than others receive financial rewards for informing the consensus
The former of these two incentives we call epistemic leverage. It permits a player to make statements about how they can find out that they’re wrong and in exchange have more influence in the network; for example, they might do this to increase the likelihood their preferred policy is enacted.
Concretely, this is how epistemic leverage works:
a player stakes on a policy outcome they prefer
the player extends that stake by also staking that a contradictory argument is not true, an argument that, if it were true, would erode their position (potentially prompting them to partially slash their previous stake)
other players can then place a secondary stake betting that the player will never slash (what we’ll later call a “doubt”).
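To make those three steps concrete, here is a minimal sketch in Python. Everything in it (the class names, the players, the amounts) is a hypothetical illustration of the flow described above, not the negation game’s actual data model.

```python
from dataclasses import dataclass

# Illustrative only: these names are assumptions made for the sketch.

@dataclass
class Point:
    statement: str

@dataclass
class Stake:
    player: str
    point: Point
    amount: float

@dataclass
class Restake:
    """Epistemic leverage: a commitment that if `negation` turns out to be
    true, the player will slash up to `amount` of their original stake."""
    player: str
    original: Stake
    negation: Point
    amount: float
    slashed: float = 0.0

@dataclass
class Doubt:
    """A secondary stake betting that the restaker will never follow through
    and slash."""
    doubter: str
    restake: Restake
    amount: float

# 1. stake on a preferred policy outcome
policy = Point("We should adopt policy X.")
alice_stake = Stake("alice", policy, amount=100.0)

# 2. extend that stake by naming a contradictory claim that, if true,
#    would erode the position
counter = Point("Policy X would not change the outcome it targets.")
alice_restake = Restake("alice", alice_stake, counter, amount=40.0)

# 3. another player bets that alice will never honour that commitment
bob_doubt = Doubt("bob", alice_restake, amount=25.0)
```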
The purpose of the secondary market is to identify surprising signals. What makes a signal surprising is that it comes from a player that’s believed to be honest and is providing information that is believed wrong by the network. It’s not possible to infer either of those properties directly, but we can know (given certain assumptions) that if a player slashes their stake it must be because they were honest and they thought they were wrong.
In other words, this market is estimating the joint probability of being wrong and honest. You’ll notice that these are exactly the conditions that preceded the empirical tests of General Relativity in the earlier story: predictions that could be falsified by observation and were expected to be. This is an incentive-based falsifiability detector.
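One rough way to read that secondary market, under the assumptions above: if the doubts price the chance that a restaker never slashes at p, then 1 - p can be read as the market’s implied estimate of the joint probability that the restaker is honest and wrong. A minimal sketch, with purely illustrative names and numbers:

```python
def surprise_signal(p_never_slash: float) -> float:
    """Implied estimate of P(honest AND wrong) for a restaked point.

    Assumes, as argued above, that a voluntary slash only happens when the
    restaker is honest and has come to believe the negation is true.
    """
    return 1.0 - p_never_slash

# Doubts pricing "never slashes" at 0.8 imply roughly a 20% chance of an
# honest, self-acknowledged error, which is the signal worth attending to.
print(round(surprise_signal(0.8), 2))  # 0.2
```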
What’s potentially compelling about this formulation is that it offers closure within the mechanism. It’s not possible for any game theoretic mechanism to directly sense the outside world (e.g. see the stars), it can only know the bets players make and the stakes they slash, and then it must use that as an input into its internal algorithm to estimate trustworthiness of information.
It’s worth explicitly stating a non-goal: this mechanism is not trying to estimate whether a proposal is true or good. I don’t know how to directly extract that information; that has to be done through the normal methods of conversation and debate. Rather, this mechanism is useful for training collective attention onto surprising information so that the conversation can continue.
There are two properties to highlight that this mechanism engenders.
It rewards being seen as an intellectually honest player — a player who can find out that they’re wrong.
It does not overpunish error.
Intellectual honesty is perhaps the more obviously good of the two, so let’s start there.
Existing social networks are a masterclass in incentive design, proving to each of us that, independent of our normal personality, our incentive environment has significant influence over our resultant behavior. The holy grail of social networking has therefore been to find ways to engender prosocial and epistemic behaviors while still managing to attract and maintain a userbase. Unique to epistemic leverage as a mechanism set is the ability for players to have influence in the domains they care about in proportion to the degree that they are believed to be trustworthy players.
In political science this problem is known to be notoriously hard. The problem of deciding who should have a voice in matters of public controversy is often framed as “judge selection”. What makes it so hard is that there are no good solutions. On the one hand, if you grant judges life tenure they are free from political oversight — this gives them latitude to vote their conscience without the concerns of reelection; however, it also means that whatever mechanism places them on the bench becomes the hotly contested political battleground. If instead of life tenure they face a vote for their seat, whether from the public or an oversight committee, it incentivises them to cater to the whims of those voting parties instead of to the legality or facts of the case. Typically it’s not possible to sample from the various “judges” in proportion to the degree to which they are viewed as honest actors even by their adversaries; this mechanism offers that tantalizing possibility.
Social following algorithms can be thought of as a kind of solution to the judge selection problem. Your feed is populated with “judges” of varying repute whom you have chosen to follow for their takes. This allows for a more bottom-up voice discovery mechanism than judge selection does, but it suffers from indiscretion. Nothing about a follow or a like implies the informativeness of an idea, nor the reliability of its author. An account is just as likely to achieve stratospheric success for flouting prosocial norms as for following them. This is not to say that those other mechanisms of influence and reach shouldn’t exist, just that this offers an alternative with more epistemic properties.
Markets are also bottom-up mechanisms that have little interest in the individual reputations of participants. This is fine; they serve a different purpose. The improvement epistemic leverage offers is to amend markets’ incentive to hoard private information. Private information is the substance of market movements. If a player has a valuable source of information, their best move is to trade on it privately, certainly not to disclose it, since disclosure would allow other participants to capitalize on the insight. This is generally fine for markets, but it’s terrible for other domains. Imagine scientists refusing to publish their insights, data, or methods. Or the marketing department refusing to tell sales about their intended strategy. Sensemaking is best done — perhaps can only be done — in the context of shared information. This mechanism doesn’t disincentivize the sharing of information, and as we’ll see, it can easily be augmented to significantly reward it.
While highlighting the fair-minded, intellectually honest player is an easily laudable goal, it’s forgivable to view the permitting of error with some doubt. After all, do we really want our attention drawn to ideas of dubious merit? Rather than argue on behalf of error, it’s more important to linger on the consequences of suppressing it. Breakthrough, paradigm-shifting insight is highly aleatoric. It cannot be predicted, literally because the prevailing paradigm cannot conceive of it. It’s therefore impossible to meander down a wide road of new breakthroughs; instead, it must be done the opposite way, by mapping the negative space. If it’s expensive to explore moonshot ideas then most effort will be spent on incrementalism, which has orders of magnitude smaller returns. For a much more thorough exploration of this topic please see A Vision of Metascience. In the design of sensemaking systems it’s desirable to reward both exploration and exploitation; overpunishing error suppresses exploratory behavior.
As generators of new social norms, these two properties offer a picture of a world where many contributors asynchronously identify manifold possibilities, their attention trained to the theories most likely to be wrong that arise from trustworthy proposers, which are quickly resolved by the proposer so as to retain their good reputation for intellectual honesty. Strange ideas can quickly bubble to the top, receive attention, and then recede back into the darkness. Occasionally, one won’t recede, and instead it will earn converts and further staking, now catapulting its circle of attention and critique even wider.
As the idea’s reach grows we can easily imagine its stake growing, too. Until now we haven’t discussed where the stake goes when a player voluntarily slashes it. Perhaps with the foreshadowing of the problems of private information you too can see the shining opportunity for what to do with those freed funds.
The Riemann Hypothesis and Google’s digital infrastructure have something in common: their ostensible impenetrability, and a large bounty for anyone who can make progress against it.
The bounties offered for progress on these problems themselves encode significant information. The willingness to offer the bounty tells you that the issue is important. The continued presence of the bounty tells you that it hasn’t yet been solved. The size of the bounty and the duration it has remained open tell you that the problem is hard, resilient in some way to attack. These signals can be incredibly valuable. Imagine software repositories that stake that their code is secure, creating a bounty for engineers to double-check. Imagine insurance companies that stake the assertion that your data isn’t sold to third parties, creating a bounty for employees to leak receipts. Imagine manufacturers that confidently stake statements of their ethical manufacturing practices, creating a bounty for journalists to discover the alternative. Imagine scientists that stake the likelihood of replication, and the integrity of their datasets, creating bounties for other labs to double-check.
Despite the significant income stream this would create for these professions, the bounty itself isn’t the whole point. In addition to spawning an industry, it also provides valuable information to decision makers. Just as 100 reviews with an average of 4.9 stars is a signal of quality, standing bounties — measured in their size and their duration — will become a signal of integrity.
So far, we’ve been seeking to paint a picture of how this mechanism of epistemic leverage might be useful. But as yet we’ve not touched on how it fits into a broader mechanism set.
In order for epistemic leverage to be useful it must be tied into a larger mechanism set which puts the information gleaned from epistemic leverage to use. Broadly what's necessary for this to work is:
Markets
A way to connect markets to resolution criteria (hint hint)
A consensus mechanism that can be resolved
Together we call this collection of mechanisms a Carroll Mechanism, named after Lewis Carroll's famous dialogue "What the Tortoise Said to Achilles". Carroll Mechanisms are a particular implementation of epistocracy.
Carroll Mechanisms consist of 3 gadgets in our mechanism set:
A modified LMSR, basically just a typical AMM which allows you to buy shares (as expected) as well as dispute the relevance between shares (what we call Carroll Mechanisms in the doc I shared). You can review an intuitive introduction to vanilla LMSRs here, and you can read about how we intend to implement them in this post on Carroll Mechanisms. (A minimal sketch of a vanilla LMSR follows this list.)
Epistemic leverage, which allows us to use the modified LMSR to connect high level market statements (e.g. "Human cooperation evolved primarily through cultural group selection: groups with more effective cooperative norms systematically outcompeted groups with less effective norms, driving the evolution of human ultrasociality.") to lower level statements (e.g. "Groups with stronger cooperative norms will consistently outperform groups with weaker norms in direct competition for resources, territory, or survival.") in exchange for some additional financial and reputational risk.
A consensus mechanism. A consensus mechanism is essentially a way to get a group of people to agree on a conclusion when presented with a highly objective (low ambiguity) question. "How much did it rain today?" is typically a low ambiguity question, whereas "How many raindrops fell in exactly this area?" is more ambiguous.
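To give a flavour of the first gadget, here is a minimal sketch of a vanilla (unmodified) LMSR in Python. This is only the textbook mechanism; the relevance-dispute modification that makes it a Carroll Mechanism is not modelled here, and the outcome names and liquidity parameter are arbitrary.

```python
import math

class LMSR:
    """Vanilla Logarithmic Market Scoring Rule over a fixed set of outcomes.

    Textbook mechanism only: the Carroll modification that lets players
    dispute the relevance between shares is not sketched here.
    """

    def __init__(self, outcomes, b=100.0):
        self.b = b                              # liquidity parameter
        self.q = {o: 0.0 for o in outcomes}     # outstanding shares per outcome

    def _cost(self):
        return self.b * math.log(sum(math.exp(v / self.b) for v in self.q.values()))

    def price(self, outcome):
        total = sum(math.exp(v / self.b) for v in self.q.values())
        return math.exp(self.q[outcome] / self.b) / total

    def buy(self, outcome, shares):
        """Return the cost of buying `shares` of `outcome`, updating the market."""
        before = self._cost()
        self.q[outcome] += shares
        return self._cost() - before

market = LMSR(["claim is true", "claim is false"])
print(market.price("claim is true"))     # 0.5 at launch
print(market.buy("claim is true", 50))   # cost of moving the market
print(market.price("claim is true"))     # price rises above 0.5
```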
Then, we tessellate these mechanisms such that a result given by the consensus mechanism propagates back up through the stack, passing through the intermediate layers of epistemic leverage and markets, back up to the high level market we care about resolving. In fact, this is the key idea to understanding why this cascade is trustworthy: consensus mechanisms are really powerful. They underlie Proof of Stake, Proof of Work, UMA (the oracle Polymarket uses to resolve its markets), and many other systems, so we would like to use them more broadly.
So, why can't we use them for all truth? Why can't we just say, "Is the Earth flat?" or "Should Trump be president?" and then play a consensus game? The problem (and the solution) is that consensus mechanisms are highly sensitive to higher-order beliefs. If you can shift what people think most other people think, then you can manipulate the market.
See for example this thread about the problems with UMA.
So, why doesn't this problem plague Ethereum's Proof of Stake system?
What differentiates PoS and PoW from something like UMA is that they deal with low ambiguity questions: "Is this hash the same as that hash?" The low ambiguity of the question (the low likelihood that someone will disagree) permits you to play a consensus mechanism game without worrying about possible manipulation (this property is sometimes also called attributability, because you're able to attribute errors or manipulation to a particular player).
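To see why that matters, here is a toy stake-weighted consensus round in Python. It is a cartoon of PoS-style slashing (not Ethereum's actual protocol, and every name and number is made up): because everyone can check a low ambiguity answer for themselves, predicting the majority and answering honestly collapse into the same strategy.

```python
from collections import defaultdict

def consensus_round(reports, stakes, slash_fraction=0.5):
    """Toy consensus: the stake-weighted majority answer wins, dissenters get slashed.

    reports: {player: answer}, stakes: {player: stake}.
    A cartoon of PoS-style slashing, not any real protocol.
    """
    weight = defaultdict(float)
    for player, answer in reports.items():
        weight[answer] += stakes[player]
    winner = max(weight, key=weight.get)

    updated = {}
    for player, stake in stakes.items():
        penalty = 0.0 if reports[player] == winner else slash_fraction * stake
        updated[player] = stake - penalty
    return winner, updated

# "Is this hash the same as that hash?": everyone can check for themselves,
# so the majority answer is easy to predict and honest reporting dominates.
reports = {"a": "yes", "b": "yes", "c": "yes", "d": "no"}
stakes  = {"a": 32.0, "b": 32.0, "c": 32.0, "d": 32.0}
print(consensus_round(reports, stakes))
```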
If you accept this argument, then you should also accept that it simplifies our problem. We no longer need to find a mechanism that allows us to converge on the truth; we already have one of those, so long as we have low ambiguity questions. Instead, we merely need to be able to convert a high ambiguity question into a low ambiguity question that we can feed into the consensus mechanism.
So, for the rest of this I'll assume you've bought that argument, which permits me to boldly claim that you should expect this mechanism to resolve to true and accurate conclusions because of the properties of consensus mechanisms. And the great news about this claim is that there are already 2+ trillion dollars waiting to be hacked if it could be shown false.
So then your question must be:
Can you detect high ambiguity in questions and then distill concrete, resolvable subquestions from them?
The claim is that that's what the first two gadgets, Carroll LMSRs and epistemic leverage, do together.
We'll make the high level argument for it and then walk through a concrete cascade.
At a high level, epistemic leverage gives out more power (for lack of a better word) to participants in exchange for them saying what would change their mind.
We can trust that the signals from epistemic leverage are sincere because:
participants have no reason to indicate falsely relevant counterinformation, because they'll lose in the Carroll markets
if participants are thought to be obstinate about relevant and true counterinformation then there's upside for counterparties who can 'doubt' them (call their bluff).
participants don't want to self-slash, because they lose their shares to the people holding the counterparty shares (and they don't want to hold the counterparty shares because they are in conflict with the preferred shares)
This therefore gives us an ambiguity pump: a way to start with a high level statement and then steadily reveal the constituent definitions and indicators of that statement by way of the topology of the market graph built by participants playing game-theoretically optimal moves (there may be multiple and overlapping definitions, but that's fine, welcome to humanity). This is why I sometimes say epistocracy is like markets meeting category theory. From the governance market's perspective we don't actually have the ability to peer into the meaning of each statement (that's for AI to do, which we don't want to trust in this context because we will want to govern AI with this), but we can look at the topology of incentives and glean the relevant information.
Once we have played that game to its terminus we'll have generated many low ambiguity statements which we can feed to consensus mechanisms. As those consensus mechanisms converge, that changes where you can find "valleys" in our incentive landscape — stable attractors in topology space — which changes the price of the shares asking the highest level questions of the market.
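Here is a minimal sketch of that propagation step, assuming a hypothetical graph representation (nothing below is the production data model): each point links to the negations restaked beneath it with a relevance weight, and a consensus result on a leaf nudges the implied price of everything above it.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Node:
    """A market statement, its current price, and the negations linked beneath it."""
    statement: str
    price: float                                   # current market probability
    children: list = field(default_factory=list)   # (relevance_weight, child Node)
    resolved: Optional[bool] = None                # set by a consensus mechanism on leaves

def propagate(node: Node, learning_rate: float = 0.5) -> float:
    """Push consensus results from resolved leaves back up toward the root.

    A cartoon of the cascade described above: a negation that resolves true
    pulls its parent's price down in proportion to the relevance of the link,
    and a negation that resolves false props the parent up.
    """
    if node.resolved is not None:
        node.price = 1.0 if node.resolved else 0.0
        return node.price
    for relevance, child in node.children:
        child_price = propagate(child, learning_rate)
        node.price += learning_rate * relevance * ((1.0 - child_price) - node.price)
    return node.price

leaf = Node("Concrete, checkable sub-claim", price=0.5, resolved=True)
middle = Node("Negation of the top-level claim", price=0.6, children=[(0.9, leaf)])
top = Node("High level market statement", price=0.4, children=[(0.8, middle)])

propagate(top)
print(round(middle.price, 2), round(top.price, 2))  # the negation falls, the claim rises
```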
Ok, now let's look at a concrete example:
participant creates a new market statement, which says, "We should build a bridge over the Riva river."
then, to gain more favor, the participant 100% restakes on the statement, "The commerce that passes over the Riva river would not change if there were a bridge." They are indicating that they would change their mind if this were true.
the shareholders in "commerce won't change" restake on the statement, "a month long experiment where ferry transfers are subsidized to be free shows an increase in commerce of 2% or more" (remember, this doesn't actually mean they believe this statement is true, just that they would change their mind if it were).
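The chain built in those three steps can be pictured as a simple path from the broad claim down to the most concrete one. A self-contained sketch (the dictionary-of-strings representation is purely for illustration, not how the game stores anything):

```python
# Hypothetical flattened view of the restake chain built above:
# statement -> the lower level statement its holders say would change their mind.
restake_chain = {
    "We should build a bridge over the Riva river.":
        "The commerce that passes over the Riva river would not change if there were a bridge.",
    "The commerce that passes over the Riva river would not change if there were a bridge.":
        "A month long free-ferry experiment shows an increase in commerce of 2% or more.",
}

def drill_down(statement: str) -> list:
    """Follow restakes from a high level claim toward its most concrete sub-claim."""
    path = [statement]
    while path[-1] in restake_chain:
        path.append(restake_chain[path[-1]])
    return path

for depth, claim in enumerate(drill_down("We should build a bridge over the Riva river.")):
    print("  " * depth + claim)
```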
Now let's do one of the cascades:
Participants place their bets on how the month long experiment will resolve; this moves the market prices around
If enough people have placed their bets and enough people have restaked then there will be lots of connectivity in the markets. Let's assume the liquidity from that disagreement is enough to fund the month long subsidy for the experiment. In general, because of the structure of this governance market, disagreement is a subsidy for experimentation and information foraging
Finally, the experiment is run, a generalized consensus starts to emerge, and the markets begin to resolve. The lowest markets resolve first (these would be markets even deeper than the market about the experiment), and as they resolve a small drama plays out at each level of the market. Let's look at just the interaction between the shares associated with "commerce won't change" and the experiment statement; we'll call them C and S.
C shareholders begin to feel the price of their shares falling, and therefore the price for "build the bridge" rises, which is what they don't want. They have two options: change their positions, or change their epistemic leverage. Let's say they choose to increase their epistemic leverage to keep the price of "build the bridge" low.
Now, people begin to see that the favor of S is quite high and that it's considered tightly coupled to C, but that the holders of C are not yet willing to self-slash. So S holders take the opportunity to 'doubt' the C holders, to take the counterposition, which pays out more yield the higher the favor and relevance of S is to C. In this case both are high, so the yield is quite sweet so long as the C holders don't self-slash.
The C holders experience that as both a drain on their position and a weakening of the price of their shares — the exact thing they were trying to avoid. They can now do one of two things: acquiesce or protest.
If they acquiesce, the game is over and they pay out to the counterparty players (hopefully the ones that funded the month long experiment were wise enough to also buy shares so they could receive the payout from the C holders).
If they choose to protest, they have two options. They can dispute the relevance of S, participating in the Carroll Mechanism to essentially say, "These two shouldn't have been tightly coupled after all." However, at this point disputing the relevance generally won't be a profitable pursuit, since they sat on their epistemic leverage for such a long time. So really their only remaining option to stem the bleeding is to dispute the veracity of S. They can do this by buying more shares in S or using epistemic leverage on counterparty shares to S. The thing to notice is this: that's great news! We've just moved one level deeper in the stack, and now we're likely dealing with even more concrete claims (they'll have to be, if they're going to prove informative to the S holders).
So, we've succeeded in ratcheting once more toward lower ambiguity questions; eventually one of them will be so low ambiguity that we can resolve it with a consensus mechanism. We have our ambiguity pump!
Therefore, we get one of two outcomes every time we have a conflict like this: either we get funding for information foraging (running experiments, collecting data, doing research, collecting opinions), or we reduce the ambiguity of our question, getting one step closer to resolving our markets and taking action at the top level (in this case, building our bridge). We're surfing the explore-exploit tradeoff frontier using markets!
Now consider what this would have been like for questions like these:
Does sugar consumption cause obesity and type 2 diabetes?
Do vaccines cause autism?
Do opioids cause suicides and deaths of despair?
Do cigarettes cause cancer?
Is climate change human-caused?
It's not that this sort of mechanism changes anything about the science that must be done in order to answer these questions. Instead, what it does is increase the credible neutrality of the process by which we reach answers. It does this by creating balanced incentives, rewards for accuracy, and transparent reasoning, while being permissionless to participate in (no pedigree required).
It's not just that this mechanism increases the credible neutrality of the conclusion (by modulating the influence of money); it also directs funding to wherever there is disagreement, because that disagreement essentially represents a bounty which can be claimed by reducing ambiguity or risk.
That's helpful because it means disagreement becomes the heat that powers an engine for generating insights. This means that on net, we should expect people that use this to have better bridging behaviors: they'll converge faster to agreement and at a lower cost.
In the near term this will become especially important, because it won't just be humans we want to incentive-align and transparently trust, but agents, too.
Language summary:
point — a statement with associated shares that can be linked to one another with the Carroll mechanism
stake — purchasing the shares of a particular point. It's a bit more accurate to call it staking than purchasing, since shares seem to imply you want some upside, when with these markets you might just be buying (staking) to increase the likelihood that some event will occur
restake — use epistemic leverage to get more influence by indicating that another particular point would be informative
slash/self-slash — give up a position that was taken by restaking: the player concedes the point, forfeiting the money and influence they put at risk when they restaked
doubt — take the counterparty position to a restake. A doubter is calling the bluff of the restaker. A doubter gets two things for doubting: they get paid from the $ that was restaked, and they get to decrease the extra influence a point got from the player restaking for it. However, the doubter loses the $ they doubted if the restaker slashes
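To tie the last two definitions together, here is an illustrative payoff function for a doubter, following the glossary above. The yield_rate argument is a stand-in for whatever function of favor and relevance actually sets the yield; the real formula isn't specified in this post.

```python
def doubt_payout(doubt_amount: float, restaked_amount: float,
                 yield_rate: float, restaker_slashed: bool) -> float:
    """Illustrative doubter payoff per the glossary: paid out of the restaked
    $ while the restaker holds firm, lost entirely if the restaker slashes."""
    if restaker_slashed:
        return -doubt_amount                  # the "bluff" wasn't a bluff
    return min(yield_rate * doubt_amount, restaked_amount)

print(doubt_payout(25.0, 40.0, yield_rate=0.3, restaker_slashed=False))  # 7.5
print(doubt_payout(25.0, 40.0, yield_rate=0.3, restaker_slashed=True))   # -25.0
```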
Epistocracy represents a fundamental reimagining of how collective intelligence might emerge from individual contributions. By borrowing falsifiability from the philosophy of science and encoding it into market mechanisms, we create a system where intellectual honesty becomes profitable, where disagreement funds discovery, and where the most precarious ideas — those most likely to be wrong — receive the attention they deserve. Rather than seeking truth directly, this mechanism trains our collective attention on the boundaries of what we think we know, creating bounties for those willing to venture into uncertainty. In a world increasingly governed by algorithms and agents, such a system offers not just better decision making, but a more credibly neutral foundation for the conversations that shape our future. The game doesn't promise to make us right more often — it promises to make us wrong more quickly, more transparently, and more productively. And in that promise lies the seed of genuine collective wisdom.
Connor McCormick