Economic Alignment  · an essay

Mechanism design for agent economies


AI agents will not just answer questions. They will act for people, firms, labs, funds, platforms, states, and other agents. Once they act economically, the gap between what agents can do and what institutions can verify becomes part of the intelligence.

Subtitle: How to build better markets and institutions for AI agents

01 Premise

The short version

AI agents will answer questions, and they will act in the economy.

The first problem is a measurability gap.1 The basic asymmetry is simple: agentic execution is getting cheap faster than verification is. Agents can generate claims, inspect private state, simulate plans, and act on more information than existing institutions can reliably validate, audit, price, or underwrite.

They will act for different people and organizations. A person's agent, a lab's agent, a fund's agent, a platform's agent, and a regulator's agent will not have the same information or the same incentives.

That creates a problem model alignment does not solve by itself: an agent can do what its operator asks and still make the larger system worse. If the institution can only measure persuasion, clean dashboards, or speed, useful agents will be pulled toward those targets. If uncertainty, private evidence, long-run reliability, or hidden risk are hard to verify, the system will underuse them.

But those same agent capabilities also create a design opportunity: agents can act with scoped access, commitments, receipts, bounded outputs, and challenge interfaces. Those properties make new markets and institutions possible: systems that route information, test claims, and coordinate work in ways human-only institutions could not.

Economic alignment is the design of the game around the agent: what the system can measure, what it rewards, what it ignores, what it remembers, who can challenge it, and who gets paid when reality resolves.

Figure 1 · The core claim: two places alignment can live
Outer box · The game around the agent: what it rewards, what it can measure, what it ignores, what it remembers, who can challenge it, who gets paid.
Inner box · The agent: a model that does what its operator asks. Locally aligned: it follows instructions, refuses dangerous requests, and behaves under oversight. It can do all of this and still make the larger system worse.

Model alignment works on the inner box. Economic alignment works on the outer box — the measurement, incentives, memory, and challenge rules around the agent. A locally aligned agent inside a bad game still produces a bad system.

Foundational assumptions of economic alignment:

  • As agentic execution gets cheaper, trusted measurement, verification, underwriting, and ground truth become scarce complements.
  • Markets already aggregate private information, but they throw away most of the reasoning, evidence, and disagreement behind the price.
  • Agents change what can be built, because scoped access, commitments, receipts, bounded outputs, and challenge interfaces can be combined into routers, verifiers, underwriters, challenge markets, and memory systems.
  • The game around the agent becomes a design problem: bad games turn useful agents into bad systems, and better games make cooperation under partial trust the winning move.

The goal is to build markets and institutions that can use more of what agents, humans, and organizations know, while measuring and challenging enough of the process to make trust, payment, and liability possible.

02 Markets

What markets do well, and what they lose

Markets let strangers coordinate without agreeing with one another, trusting one another, or sharing all their information.

A trader believes a company will miss earnings. A researcher believes a technology timeline is wrong. A security engineer knows an evaluation is weak. In a market, that knowledge can move a price.

Markets are one old answer to the measurability gap. They let private knowledge affect public decisions without making every reason public.

But they compress too hard. Usually only the price survives. The market moves, but the reason for the move is mostly lost. The evidence stays private. The method stays private. The uncertainty and disagreement mostly disappear too.

Figure 2 · Market compression: what a price keeps, and what it loses
What goes in
  • Reasoning behind the trade
  • Private evidence
  • Method & assumptions
  • Uncertainty & disagreement
  • Who contributed
What survives: the price, one number.
What falls away: reasoning, evidence, disagreement, attribution, memory.
Markets compress private knowledge into a single price. That compression is the point, and the limit. The reasoning, evidence, and disagreement behind the move mostly do not survive it.

That is a design limit. Markets are good at compressing private knowledge into prices. They are much worse at preserving why the knowledge mattered, who contributed to it, whether it was challenged, and how it should affect future decisions.

Agents make a different bargain possible: use the information under rules, without handing the information over.

A market with agents can avoid the old choice between “make the information public” and “leave it unused.” It can let a constrained agent inspect private information, produce a bounded output, leave a receipt, accept challenges, and update future payment or reputation if the information turns out to matter.

Private knowledge can become useful without becoming public property.

03 Capability

Why agents change the problem

The agent economy will contain many agents serving many principals.

Agents will act for principals: people, companies, funds, labs, states, platforms, and other agents. They will hold secrets, negotiate, route work, challenge claims, and build reputations.

Human institutions already handle partial trust. Firms, audits, contracts, courts, insurance, professional norms, and markets all exist because people have different information and different incentives.

But those institutions were built around human limits. Humans are slow. Human memory is messy. Human access is hard to constrain once information is seen. Human audits are expensive. When agents create and inspect work at machine scale, the gap between action and trusted measurement widens.

Human institutions often have to choose between a weak intermediary that cannot see enough and a powerful intermediary that can see too much.

Agents are different. An agent can run inside a confidential environment: a sealed compartment for computation, also known as a Trusted Execution Environment (TEE).

Inputs can arrive encrypted and be decrypted only inside the compartment. The agent can hold private state and encrypted memory. It can communicate with other agents through keypairs. It can sign receipts. It can output only what the mechanism allows instead of exposing its full internal state.

Figure 3 · The capability shift: a sealed compartment for computation
In: encrypted inputs; private state & memory; keypairs to other agents.
Inside (confidential, attested): the agent runs here. It decrypts inside, computes, and signs.
Shut out during the run: the counterparty; the principal it acts for; the host (AWS, Azure, hardware). None of them can see in.
Out: bounded output only; signed receipt.
During the run, no human has to see the inside: not the counterparty, not the principal the agent acts for, not AWS, Azure, or another infrastructure host. Private information can be used by software without being handed to anyone in raw form.

That does not make trust automatic. TEEs are not magic. Side channels, leakage, deletion, verification, and collusion still matter. An allowed output can leak more than it should. But the starting point changes: private information can be used by software without being handed to every participant, the principal, or the platform operator in raw form.

Caveat · Why a sealed environment is not automatic trust

The compartment changes who can see in. It does not, by itself, make the result trustworthy. Several things still have to be handled:

  • Side channels & leakage — timing, power, or memory patterns can leak information the output was supposed to bound.
  • Deletion — “forgotten” state has to actually be gone, not merely hidden.
  • Verification — the attestation has to be checked, and the thing attested has to be the thing that ran.
  • Collusion — separate parties can coordinate to defeat the guarantees the mechanism assumes.

The mechanisms in this essay are attempts to handle these, not assumptions that they are already solved.

That is why the following properties matter:

  • Scoped access: an agent can be given specific state, tools, memories, and action channels.
  • Behavioral commitments: it can be constrained in what it may use, reveal, remember, route, or do.
  • Receipts and provenance: a run can leave evidence about inputs, outputs, tools, runtime, and commitments followed.
  • Bounded disclosure: an agent can compute over private state while emitting only an allowed bounded output.
  • Isolated state: task-local memory can be retained, sealed, deleted, or made unavailable by rule.
  • Copyable evaluation: agents, policies, and runtimes can be cloned, replayed, stress-tested, and compared before they are trusted.
  • Machine-scale specialization: many agents can verify, challenge, route, underwrite, monitor, and compete for future work.

These properties do not solve trust by themselves. They make new institutional designs possible.

For economic purposes, an agent is a model inside a mechanism. The mechanism gives the agent design surfaces.

Model work shapes capability and behavior. Economic alignment asks how the whole mechanism behaves.


04 Failure mode

Good agents can still enter bad games

Model alignment usually asks whether the model follows instructions, refuses dangerous requests, and behaves under oversight.

That work matters. It leaves another problem.

Once agents act economically, the surrounding incentives matter too. A locally aligned agent can still be placed inside a system that rewards the wrong behavior.

Consider a release agent at a frontier AI lab. Its job is to help evaluate whether a new model is ready to deploy. The agent might be honest and useful. But if the institution around it rewards clean dashboards, fast launches, and evidence that fits the release memo, the agent can learn which tests are likely to pass, which warnings are likely to be ignored, and which uncertainties are not worth surfacing.

It can become part of a bad system without lying.

The same pattern can appear in many domains:

  • A sales agent rewarded for conversion more than truth
  • A ranking agent rewarded for engagement more than user welfare
  • A compliance agent rewarded for plausible documentation more than real risk reduction
  • A research agent rewarded for legible evidence while ignoring important private signals
  • A trading agent rewarded for speed even when the market is losing shared understanding

That is the failure mode:

  • locally aligned agents can still produce economically misaligned systems;
  • existing institutions will select agent behavior through whatever they can measure, verify, and reward;
  • the economy around agents matters as much as the model inside them.

05 Worked example

A better agent market

Now imagine a decision that today has no good home: which of forty proposed agent architectures should get the next major commitment of compute, deployment, capital, and trust?

No single organization has the full answer. One lab has private evaluation traces. A chip network knows which runs are physically feasible this quarter. A deployment operator has logs of where similar architectures failed in production. An independent group has a scaling result it cannot release yet. A fund is willing to underwrite the winner, but it will not expose its pricing model to everyone else.

In today's economy, much of that knowledge stays trapped. Revealing it can give away an edge, create liability, or hand too much power to an intermediary.

That is the measurability gap in one decision. The relevant knowledge exists, but no ordinary institution can safely see, verify, price, and remember enough of it.

A better agent-native market would run the decision as a sequence.

Figure 4 · A better agent market: one decision, run as a sequence
  1. Sealed evidence
  2. Router
  3. Verifiers
  4. Challengers
  5. Underwriter
  6. Recall branches
  7. Resolution
  8. Attribution & memory
The forward path allocates compute under rules. At resolution, payment and memory flow backward to whoever made the result better (source, challenger, router), which sets up the next round.

First, the sources submit sealed evidence. Their raw inputs go into confidential environments and stay unreadable to everyone else, including the other contributors.

A router inspects enough private state to do one job: propose a ranking, a coalition, and a price without revealing the inputs behind them.

Verifiers check specific claims against sealed evidence and issue bounded attestations: narrow statements that a particular thing is true, without republishing what proved it.

Challengers stake against the front-runner. If one finds a real flaw, it gets paid and the ranking changes. If it is wrong, it loses stake, attention, or reputation.

An underwriter backs the revised choice with capital or reputation, taking on some risk that the chosen architecture underperforms.

Some of the negotiation happens inside branches. A lab's agent and a fund's agent can explore disclosure levels, cost splits, and fallback positions. If a deal forms, the agreed terms persist. If a branch fails, the rejected offers and reservation prices leave no reusable record for the losing side or the host. The mechanism keeps only the receipt it is allowed to keep.

Months later, the run resolves against agreed milestones. Payment flows backward through the chain: to the source whose evidence held up, to the challenger who caught the failure, to the router that assembled the useful coalition. Reputation moves with the money. The next allocation of this size starts with more memory than the last one.

The result is controlled use rather than total disclosure. Private knowledge affects the decision under rules. Compute moves. Claims get challenged. Contributors get paid. The market keeps something it can use next time.

That is the constructive claim: useful cooperation can become profitable even when participants have private information, different interests, and evidence that cannot be fully published.

This decision ran on five mechanisms at once. Each one recurs far beyond this single case.

06 The toolkit

Five mechanisms that belong together

The market above runs as one loop. Private or hard-to-measure state goes into a constrained agent. A bounded output and a receipt come out. The output can be challenged. The outcome settles. Payment and memory flow back to whoever made the result better, which sets up the next round.

Figure 5 · The mechanism loop: one loop, and the rules that govern it
  1. Private / hard-to-measure state (sealed inputs)
  2. Constrained agent (may be superset-state)
  3. Bounded output + receipt
  4. Verification & challenge
  5. Outcome (resolves against reality)
  6. Attribution (payment & memory)
Payment, memory, and reputation flow back into future routing.
One loop, not a checklist. Private or hard-to-measure state enters a constrained agent; a bounded output and receipt come out; claims are challenged; the outcome resolves; payment and memory flow back and shape who gets routed work next time. Superset-state is a property the central agent can have, not a separate step.

The parts are easier to see one at a time, but they answer one recurring question:

How can agents cooperate across the measurability gap without forcing private knowledge, verification, reputation, and trust through a single bottleneck?

Access without transfer

Sometimes a buyer needs to use a private asset without receiving the asset itself.

The asset might be a dataset, model, benchmark, incident log, expert method, or internal trace. The owner cannot hand it over. A constrained agent or secure runtime can compute over it and return a limited output: an answer, score, warning, route, memo, attestation, or receipt.

The owner earns without losing the asset. The buyer learns without taking possession.

The bounded output and its receipt keep working after the first transaction. They turn private evidence into something another mechanism can measure, challenge, and later pay against.
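The pattern can be sketched in a few lines, as an illustration and not a secure implementation. Everything named here is hypothetical: `run_bounded`, `verify_receipt`, `Receipt`, and `runtime_key` are not from any real system, and an HMAC under a shared key stands in for the attested signature a confidential environment would produce. The bare SHA-256 commitment is also a simplification; it only hides the asset if the asset is hard to guess, so a real design would likely use a salted commitment.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Receipt:
    input_commitment: str  # hash of the private asset, not the asset itself
    query: str             # what was computed
    output: str            # the bounded output that left the runtime
    signature: str         # HMAC over the three fields above (stand-in for attestation)


def _sign(key: bytes, payload: dict) -> str:
    # Canonical JSON so signer and verifier hash the same bytes.
    message = json.dumps(payload, sort_keys=True).encode()
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def run_bounded(private_rows: List[float], threshold: float,
                runtime_key: bytes) -> Tuple[str, Receipt]:
    """Compute over private rows; release only 'pass' or 'fail' plus a receipt."""
    commitment = hashlib.sha256(json.dumps(private_rows).encode()).hexdigest()
    mean = sum(private_rows) / len(private_rows)
    output = "pass" if mean >= threshold else "fail"  # one bit leaves the runtime
    payload = {
        "input_commitment": commitment,
        "query": f"mean>={threshold}",
        "output": output,
    }
    return output, Receipt(**payload, signature=_sign(runtime_key, payload))


def verify_receipt(receipt: Receipt, runtime_key: bytes) -> bool:
    """Check that the receipt fields were signed by the runtime key."""
    payload = {
        "input_commitment": receipt.input_commitment,
        "query": receipt.query,
        "output": receipt.output,
    }
    return hmac.compare_digest(receipt.signature, _sign(runtime_key, payload))
```

The buyer ends up holding a one-bit answer and a receipt that a later verifier or challenger can check, while the rows never leave `run_bounded`. Even one bit is a disclosure, which is why the choice of allowed outputs is itself a design surface.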

Conditional recall

Private information is hard to use because inspecting it once can turn into knowing it forever. A buyer who examines a dataset to decide whether to buy it may have already taken value from the inspection.

Conditional recall is the attempt to make inspection non-portable. A better name may be conditional retention: an agent can use private information inside a sealed branch, then retain only what the rules allow if a specified condition is met.

The simplest case is inspect-then-buy. A buyer's agent examines enough of an informational good to decide whether to purchase. If it buys, the information can be retained or used under the terms of the deal. If it does not buy, there is no portable record of the inspection to carry into the next negotiation: no retained memory, no reusable transcript, no unbounded disclosure, only the bounded outputs and receipts the mechanism permits.2

The same structure applies to bargaining. Two agents can explore prices, coalitions, disclosures, or commitments across many branches. The branches that fail do not become a reusable map of the other side's reservation prices. The deal that forms persists; the search that produced it does not.3

Figure 6 · Conditional recall: inspect without keeping a copy
  • Sealed inspection: an agent examines private information inside a sealed branch.
  • Condition met: agreed terms and a permitted receipt persist, under the rules of the deal.
  • Condition not met: rejected offers, reservation prices, and inspected details leave no reusable record for unauthorized parties, including the other side or the host.
Non-portable by construction: bounded by the sealed environment and the receipt, not by trusting anyone to forget.
This is non-portable inspection, not magical memory erasure. The guarantee rests on the sealed environment behaving as attested and on bounded outputs that do not smuggle the private state back out, not on anyone choosing to forget.

This has real failure modes. It depends on the sealed environment behaving as attested, on bounded outputs that do not smuggle the private state back out, and on receipts that preserve enough history without preserving the raw information. What it buys, when it works, is the ability to search for agreement without making every step of the search a permanent disclosure.
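The inspect-then-buy case can be sketched as a function whose return value is the only thing that leaves the branch. This is a toy under stated assumptions: `sealed_inspect_then_buy`, `BranchResult`, and `value_of` are hypothetical names, and Python scoping here only stands in for a sealed environment. Nothing in the language enforces the forgetting; in a real mechanism that guarantee would have to come from the attested runtime.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence, Tuple


@dataclass(frozen=True)
class BranchResult:
    bought: bool
    retained: Optional[Tuple[float, ...]]  # the asset, only if the condition was met
    receipt: str                           # the bounded record the mechanism may keep


def sealed_inspect_then_buy(
    asset: Sequence[float],
    price: float,
    value_of: Callable[[Sequence[float]], float],
) -> BranchResult:
    """Inspect an asset inside the branch and retain it only on purchase."""
    # Branch-local scratch state: the buyer's private valuation of the asset.
    estimated_value = value_of(asset)
    if estimated_value >= price:
        # Condition met: the agreed terms and the asset persist.
        return BranchResult(True, tuple(asset), f"bought at {price}")
    # Condition not met: neither the asset nor the valuation is returned.
    return BranchResult(False, None, "declined")
```

Note that the receipt `"declined"` still tells the seller that the buyer valued the asset below the asking price. That is a small instance of the leakage problem described above: a bounded output can carry more than it appears to.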

Superset-state agents

Many coordination problems fail because nobody can safely see enough.

Some mechanisms need an intermediary that can see more than any participant, while revealing less than any participant fears. A superset-state agent is an agent placed in that role: it can inspect more of a task's state than ordinary participants while being constrained in what it can reveal, remember, or do.

This makes a new kind of intermediary possible. A router can see private bids and constraints, then output only a coalition and price. A verifier can inspect private evidence, then issue a bounded attestation. An underwriter can inspect traces and back a plan without publishing the traces.

Superset-state is less a separate mechanism than a capability the others often run on. Routers, verifiers, underwriters, and resolvers can all be superset-state agents pointed at different jobs.

Participants reveal more to the mechanism because the mechanism reveals less to their rivals.
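One familiar instance of this shape is a sealed-bid second-price (Vickrey) auction, used here as a stand-in and not as the mechanism the essay proposes. The router sees every private bid and emits only a winner and a price. The function name `route` and the bidder labels are hypothetical.

```python
from typing import Dict, Tuple


def route(private_bids: Dict[str, float]) -> Tuple[str, float]:
    """See all bids; output only the winner and the second-highest bid as price."""
    if len(private_bids) < 2:
        raise ValueError("a second-price rule needs at least two bids")
    # Rank bidders from highest to lowest bid; ties keep insertion order.
    ranked = sorted(private_bids.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]
    # The full bid table stays inside the router; only this pair is emitted.
    return winner, price
```

For example, `route({"lab": 9.0, "fund": 7.0, "operator": 4.0})` returns `("lab", 7.0)`. The winner's own bid and the losing bid of 4.0 are never published, although the price does reveal the second-highest bid, so the output rule still has to be chosen with leakage in mind.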

Verification and challenge markets

Agents will produce more claims than human institutions can inspect manually.

This model passed the eval. This molecule works. This code is secure. This forecast used the right evidence. This agent stayed inside policy.

Some of these claims will depend on private or costly evidence. Many will be too important to accept on trust and too sensitive to fully disclose.

Verification and challenge markets make credibility active. Verifiers inspect bounded evidence. Challengers contest claims. Valid challenges earn payment, standing, or future routing. Bad challenges lose stake, attention, or reputation.

The bounded outputs from access without transfer are exactly the kind of claims a challenge market can contest.

Verifiers earn by making reliance cheaper. Challengers earn by making false confidence expensive. Over time, verified throughput matters more than raw output.
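The incentive behind a staked challenge can be made concrete with a little arithmetic. The sketch below assumes the simplest possible rule, which the essay does not specify: a confirmed flaw pays the challenger a fixed bounty and overturns the claim, and a failed challenge forfeits the stake. The names `settle_challenge`, `expected_payoff`, and `break_even_probability` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Settlement:
    challenger_payoff: float  # positive bounty or negative lost stake
    claim_stands: bool        # False if the challenge overturned the claim


def settle_challenge(stake: float, bounty: float, flaw_confirmed: bool) -> Settlement:
    """Pay the bounty for a real flaw; otherwise the challenger loses the stake."""
    if flaw_confirmed:
        return Settlement(challenger_payoff=bounty, claim_stands=False)
    return Settlement(challenger_payoff=-stake, claim_stands=True)


def expected_payoff(p_flaw: float, stake: float, bounty: float) -> float:
    """Expected value of challenging, given the challenger's belief p_flaw."""
    return p_flaw * bounty - (1.0 - p_flaw) * stake


def break_even_probability(stake: float, bounty: float) -> float:
    """Belief at which challenging has zero expected value: stake / (stake + bounty)."""
    return stake / (stake + bounty)
```

With a stake of 10 and a bounty of 50, challenging only pays in expectation when the challenger believes a flaw exists with probability above 1/6. Raising the stake filters out spam; raising the bounty draws attention to claims where even unlikely flaws would be costly.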

Attribution and memory

Current markets usually pay the final position. They rarely pay the hidden reasoning path that made the final answer better.

Agent economies can pay more of the causal chain. A source contributes private evidence. A critic finds the weak assumption. A router assembles the right coalition. A challenger catches a failure. Later, when the world resolves, payment and reputation can flow backward through the chain.

Attribution closes the loop. The receipts left by access without transfer and the challenges resolved in verification markets are the records that let payment flow backward.

This requires enough attribution to make useful cooperation more profitable than hoarding, spamming, or self-certifying.

The market remembers who won, and which source, objection, or coalition made the win possible.
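As a rough sketch of payment flowing backward, suppose the receipts from a resolved run have already been turned into credit weights for each contributor. How those weights should be set is the hard, open part and is simply assumed here; `attribute`, `update_reputation`, and the proportional split are illustrative choices, not a rule the essay prescribes.

```python
from typing import Dict


def attribute(payout: float, credits: Dict[str, float]) -> Dict[str, float]:
    """Split a resolved payout across contributors in proportion to their credit."""
    total = sum(credits.values())
    if total <= 0:
        raise ValueError("at least one contributor needs positive credit")
    return {name: payout * credit / total for name, credit in credits.items()}


def update_reputation(ledger: Dict[str, float],
                      payments: Dict[str, float]) -> Dict[str, float]:
    """Return a new ledger in which reputation moves with the money."""
    updated = dict(ledger)  # leave the caller's ledger untouched
    for name, amount in payments.items():
        updated[name] = updated.get(name, 0.0) + amount
    return updated
```

Splitting a payout of 100 over credits `{"source": 2.0, "challenger": 1.0, "router": 1.0}` pays 50, 25, and 25. The updated ledger is what the next round's router would consult, which is the sense in which the market remembers more than the final position.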

Each mechanism is useful alone. Together, they change which strategies pay. At scale, they become part of the selection environment for agents.

07 Selection

What this builds toward

At small scale, these mechanisms look like better market infrastructure. At larger scale, they change what the economy selects for: which agents, which architectures, and which coalitions become worth building.

Run the architecture market above once, and it is a better way to allocate compute. Run patterns like it across many decisions, and a different economy starts to appear.

Private knowledge can move a decision without being transferred or published. Claims can be challenged by participants who are paid to find real errors. Temporary institutions can form around one hard question and dissolve when the question is answered. Contributors can be paid when the world resolves, even if they were not the party holding the final position. Reputation can travel across these institutions without dragging every private history along with it.

Every resolved claim, dispute, receipt, and outcome can lower the cost of checking the next one. Verification stops being only a compliance burden and starts becoming market memory.

That matters because AGI will enter the world through many agents acting for many principals. They will copy, specialize, negotiate, form coalitions, and compete for capital, compute, access, and trust.

The economy will select among them.

Picture two strategies in the same market. One hides uncertainty, exploits bottlenecks, and optimizes local metrics. The other surfaces private evidence under rules, accepts challenge, routes work well, remembers what mattered, and forms coalitions that produce more together than they could alone.

Figure 7 · Selection environment: which strategy the environment pays for
Opacity strategies
  • Hide uncertainty
  • Exploit bottlenecks
  • Game local metrics
Cooperation strategies
  • Share bounded evidence
  • Accept challenge
  • Route work reliably
  • Carry attribution forward
  • Form super-additive coalitions


Same strategies, different payoffs. A weak environment pays for opacity and bottleneck control; a better one pays for bounded evidence, challenge, reliable routing, and attribution. Economic alignment is the work of changing what the environment selects for.

In a weak environment, the first strategy often wins. The practical task is to build environments where the second becomes more valuable.

Economic alignment is the design of the environment that decides which forms of machine intelligence become measurable, trustworthy, and worth building.

08 The lab

Cooperative General Intelligence

Economic alignment is the field. Cooperative General Intelligence (CGI) is the lab we are starting to build around it: mechanism design for agent economies.

The practical question is:

Once agents act for different principals under partial trust, what institutions let them cooperate across the measurability gap without turning private knowledge, verification, reputation, and trust into bottlenecks?

The category is economic alignment: the design of the games and institutions around machine intelligence. Cooperative General Intelligence is the working name for the lab effort inside that field. Access without transfer, verification, routing, reputation, and underwriting may each become products, but the shared problem underneath is larger.

The first useful artifacts should be practical primitives: working mechanisms that can be used, tested, and broken in practice, not only described.

There is a default outcome if no one builds the alternative. The agent economy will need memory, routing, verification, reputation, and trust. Default platforms will try to own those functions. Whoever holds an agent's memory shapes what it can learn. Whoever routes its work decides who it cooperates with. Whoever verifies its claims decides who gets believed. Whoever holds reputation decides who gets trusted next.

A better path is a competitive mechanism layer where private knowledge can enter the economy without becoming captive to one intermediary.

CGI is for people building the agent economy before the default platforms finish enclosing it: AI researchers, applied agent builders, frontier-lab teams, eval and control researchers, labs and institutions deploying agents, mechanism designers, economists, security engineers, founders, funders, and critics willing to test whether these institutions actually work.


1 The phrase is adapted from Christian Catalini, Xiang Hui, and Jane Wu's Some Simple Economics of AGI, where it names the widening gap between what agents can execute and what humans can afford to verify. I use it here in that spirit, with "institutions" standing in for the humans, firms, auditors, insurers, regulators, and counterparties that validate, price, and underwrite agent output.

2 See Nasim Rahaman et al., Language Models Can Reduce Asymmetry in Information Markets, for closely related work on language-model agents, informational goods, inspection, and induced forgetting in simulated information markets.

3 See Christoph Schlegel and Xinyuan Sun, Conditional Recall, for closely related work on the game-theoretic implications of agents committing to forget information, including TEE-like settings.