AI Capital Allocation Benchmark Manifesto

Opening thesis

AI is beginning to enter one of the most important decision loops in the world: where capital goes.

LLMs are already reading filings, summarizing earnings, ranking opportunities, generating investment theses, and helping people decide what to buy, avoid, or hold.

Today they are research assistants. Soon they will become portfolio copilots. Eventually, many will become agentic allocators inside funds, apps, APIs, and advisory products.

But capital allocation is not a demo.

A model that writes well is not necessarily a model that invests well. A model that explains risk is not necessarily managing risk. A model that sounds confident may simply be wrong with better grammar.

The gap

The market has a measurement problem

Most AI investing claims are still built on screenshots, cherry-picked calls, private prompts, and backtests that may never survive contact with live markets.

That is not evidence.

Capital allocation needs a public record. It needs repeatable tests, frozen decisions, real prices, and scoring that cannot be rewritten after the outcome is known.

The financial world is moving toward AI-assisted allocation before it has built the measurement layer required to trust it.

Adoption vs proof

AI adoption is rising. Proof is still missing.

Asset managers have adopted AI. They do not yet have proof that it improves returns or reduces risk.

Adoption: Plan to increase AI usage, 91%
Adoption: Integrated in at least one investment process, 55%
Adoption: Use AI as an insight or analysis co-pilot, 69%
Decision use: Use AI for decision-making
Proof: Report improved returns
Proof: Report reduced risk or volatility

AI is already entering investment workflows, but proof of return and risk impact remains scarce. That gap is the opening for CapitalBench.

Source: Mercer 2026 AI in Asset Management Survey.

The proof layer

CapitalBench exists to build the proof layer

CapitalBench benchmarks leading AI models on real capital allocation decisions.

Every model receives the same market brief, the same asset universe, and the same decision window. Every portfolio is frozen before results are known. Outcomes are scored against real market prices.

No private retries.
No retroactive edits.
No moving the goalposts.
No vibes.

The goal is not to prove that AI can beat markets every week.

The goal is to make AI investment behavior observable.

Infrastructure

How the benchmark works

Same inputs

Each model receives the same market brief, constraints, and asset universe.

Frozen decisions

Portfolio files are locked before outcomes are known, creating an auditable record.
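
One way to make a locked portfolio auditable is a hash commitment: publish a digest of the portfolio file when it is locked, then let anyone recompute the digest later. The sketch below is a minimal illustration under stated assumptions, not a description of CapitalBench's actual mechanism. The function names, the JSON layout, the tickers, the timestamp, and the choice of SHA-256 are all hypothetical.

```python
import hashlib
import json


def freeze_portfolio(model_name: str, weights: dict[str, float], locked_at: str) -> dict:
    """Serialize a portfolio deterministically and attach its SHA-256 digest."""
    record = {"model": model_name, "weights": weights, "locked_at": locked_at}
    # sort_keys and fixed separators yield identical bytes for identical content.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    return {"payload": payload, "sha256": digest}


def verify_portfolio(frozen: dict) -> bool:
    """Recompute the digest from the payload and compare it with the published one."""
    recomputed = hashlib.sha256(frozen["payload"].encode("utf-8")).hexdigest()
    return recomputed == frozen["sha256"]


# Hypothetical model name, tickers, and lock time, for illustration only.
frozen = freeze_portfolio("model-a", {"AAA": 0.6, "BBB": 0.4}, "2026-01-05T14:30:00Z")
print(verify_portfolio(frozen))  # True

# Any edit made after the lock changes the digest and is detectable.
tampered = dict(frozen, payload=frozen["payload"].replace("0.6", "0.9"))
print(verify_portfolio(tampered))  # False
```

A digest only proves that the file has not changed since the digest was published, so in this sketch the digest would need to be posted somewhere public before outcomes are known.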

Real market scoring

Results are scored using actual market prices, not synthetic grades or subjective reviews.
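
Scoring against market prices can be as simple as a weighted sum of each asset's price return over the decision window. The sketch below assumes fully invested, long-only weights that sum to 1 and ignores dividends, fees, and slippage. The function name, tickers, and prices are hypothetical, and the manifesto does not specify the scoring formula actually used.

```python
def portfolio_return(
    weights: dict[str, float],
    start_prices: dict[str, float],
    end_prices: dict[str, float],
) -> float:
    """Weighted simple return of a portfolio between two price snapshots."""
    total_weight = sum(weights.values())
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError(f"weights must sum to 1, got {total_weight}")
    # Each asset contributes its weight times its simple price return.
    return sum(
        weight * (end_prices[ticker] / start_prices[ticker] - 1.0)
        for ticker, weight in weights.items()
    )


# Hypothetical tickers and prices, for illustration only.
weights = {"AAA": 0.6, "BBB": 0.4}
start_prices = {"AAA": 100.0, "BBB": 50.0}
end_prices = {"AAA": 110.0, "BBB": 45.0}

result = portfolio_return(weights, start_prices, end_prices)
print(f"{result:.2%}")  # 2.00%
```

Here AAA gains 10% and BBB loses 10%, so the portfolio return is 0.6 × 10% − 0.4 × 10% = 2%.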

Public proof

Results, methodology, and proof files are designed to be inspectable and hard to revise after the fact.

Useful questions

The questions that matter

Which models take too much risk?
Which models crowd into the same trades?
Which models diversify intelligently?
Which models only sound convincing?
Which models improve as markets change?
Which models deserve to be trusted near capital?
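
Some of these questions can be turned into numbers. As a minimal sketch, concentration can be measured with a Herfindahl index (the sum of squared weights) and crowding with the weight two portfolios hold in common. These are standard measures chosen here for illustration. The manifesto does not say which metrics CapitalBench reports, and the function names and tickers are hypothetical.

```python
def herfindahl(weights: dict[str, float]) -> float:
    """Sum of squared weights: 1.0 for a single holding, 1/n for n equal holdings."""
    return sum(w * w for w in weights.values())


def overlap(a: dict[str, float], b: dict[str, float]) -> float:
    """Weight two portfolios share: 0.0 if disjoint, 1.0 if identical."""
    tickers = set(a) | set(b)
    return sum(min(a.get(t, 0.0), b.get(t, 0.0)) for t in tickers)


# Hypothetical portfolios, for illustration only.
model_a = {"AAA": 0.5, "BBB": 0.5}
model_b = {"AAA": 0.5, "CCC": 0.5}

print(herfindahl(model_a))        # 0.5
print(overlap(model_a, model_b))  # 0.5
```

A higher Herfindahl value means a more concentrated portfolio, and a higher overlap between two models means they are crowding into the same positions.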

Why now

Why this matters now

The first wave of AI benchmarks measured knowledge, reasoning, coding, math, and instruction following.

The next wave will need to measure judgment under uncertainty.

Capital allocation is one of the hardest forms of judgment. It requires tradeoffs, risk control, timing, humility, and the ability to make decisions with incomplete information.

As AI moves from research assistant to portfolio copilot to agentic allocator, investors will need an independent record of how these systems actually behave.

CapitalBench is starting with public-market portfolios because they are measurable, time-stamped, and brutally objective.

Prices move. Outcomes arrive. Narratives get tested.

The Scoreboard for AI Capital Allocation