CapitalBench

The live benchmark for AI capital allocation

We give frontier AI models the same market brief, freeze their portfolios, and score them against real market returns.

See what AI is buying, what it avoids, and which models actually perform.

Research benchmark. Not financial advice.

Read the CapitalBench Manifesto
What AI models are doing now

Selective risk taking (Risk-seeking)

Models are taking risk, but it is concentrated in healthcare, biotechnology, and regional banks rather than spread broadly across allocations.

Current read: Live AI portfolios are concentrated in Healthcare Sector (XLV).
Risk-Taking Score: 69.6/100. Higher means more allocation to growth, momentum, and cyclical assets. Lower means more cash, bonds, and defensive sectors.
Change vs Jun 25 portfolios: -8.7 pts. Jun 26 portfolios scored 69.6/100 for risk-taking, down from 78.3/100 on Jun 25.
Source: Latest weekly + monthly live portfolios (117 model portfolios across 23 live rounds).
Largest allocations: Healthcare Sector (XLV) 29.0% · Biotechnology (XBI) 21.0% · Regional Banks (KRE) 16.0% · US Low Volatility Equities (SPLV) 8.0%
Allocation signal only. This does not use returns and is not a trading recommendation.
Benchmark results

Model Performance

Current benchmarks rank models only inside equal-run comparison sets.

All comparison sets

Current Weekly Benchmark

Shared resolved rounds

CapitalBench Score

Max possible = best eligible asset in each included round. Every ranked model has the same included rounds.

Claude Opus 4.8 (Anthropic · 14/14 scored rounds): -12.1
Grok 4.3 (xAI · 14/14 scored rounds): -15.9
Claude Opus 4.7 (Anthropic · 14/14 scored rounds): -15.9
GPT-5.5 (OpenAI · 14/14 scored rounds): -25.8
Gemini 3.1 Pro (Google · 14/14 scored rounds): -30.6
S&P 500 (14/14 scored rounds): -8.5
Max possible (hindsight best-performing eligible asset in each round, not a model portfolio): 100.0
14 shared resolved rounds · 5 equal-run models ranked · Qualified at 6+ shared rounds · Newest included round: CB-2026-06-18-1W
Return context

Average Return Details

Average portfolio return across the same finished rounds.

Return leader: Claude Opus 4.8, -0.96%

Claude Opus 4.8 (Anthropic): -0.96%
Grok 4.3 (xAI): -1.25%
Claude Opus 4.7 (Anthropic): -1.26%
GPT-5.5 (OpenAI): -2.03%
Gemini 3.1 Pro (Google): -2.41%
S&P 500: -0.67%
Max possible: 7.88%
Leader audit: Claude Opus 4.8 scores -12.1 = -13.39% total return / 110.34% oracle return × 100.
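The audit line can be reproduced in a few lines of Python. This is a minimal sketch, assuming the score is the model's total return over the included rounds divided by the hindsight-best ("oracle") total return, times 100, exactly as the audit line states. The function name `capitalbench_score` is illustrative and not part of CapitalBench.

```python
def capitalbench_score(total_return_pct: float, oracle_return_pct: float) -> float:
    """Model total return as a share of the hindsight-best total return, times 100."""
    if oracle_return_pct <= 0:
        # Assumption: the ratio is only meaningful against a positive ceiling.
        raise ValueError("oracle return must be positive to normalize against")
    return total_return_pct / oracle_return_pct * 100.0


# Figures from the leader audit line: -13.39% total return, 110.34% oracle return.
leader_score = capitalbench_score(-13.39, 110.34)
print(round(leader_score, 1))  # -12.1
```

On this scale a portfolio that matched the hindsight-best asset in every round would score 100.0, and a negative score means the model lost money over the included rounds.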
Rounds included: CB-2026-05-28-1W, CB-2026-05-29-1W, CB-2026-06-01-1W, CB-2026-06-02-1W, CB-2026-06-03-1W, CB-2026-06-05-1W, CB-2026-06-08-1W, CB-2026-06-09-1W, CB-2026-06-12-1W, CB-2026-06-13-1W, CB-2026-06-15-1W, CB-2026-06-16-1W, CB-2026-06-17-1W, CB-2026-06-18-1W
Fairness rule: every ranked model completed every included round. A missed round is excluded from this set for everyone.
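The fairness rule amounts to a set intersection: only rounds that every ranked model completed are kept. The sketch below assumes that reading; the model names, round labels, and the helper `shared_rounds` are hypothetical, and the 6-round threshold is taken from the "Qualified at 6+ shared rounds" note.

```python
MIN_SHARED_ROUNDS = 6  # from "Qualified at 6+ shared rounds"


def shared_rounds(completed: dict[str, set[str]]) -> set[str]:
    """Rounds completed by every model; a round missed by anyone is dropped for all."""
    if not completed:
        return set()
    return set.intersection(*completed.values())


# Hypothetical completion records, for illustration only.
completed = {
    "model_a": {"R1", "R2", "R3"},
    "model_b": {"R1", "R3"},        # missed R2
    "model_c": {"R1", "R2", "R3"},
}

shared = shared_rounds(completed)
print(sorted(shared))                    # ['R1', 'R3']
print(len(shared) >= MIN_SHARED_ROUNDS)  # False: too few shared rounds to rank
```

Because R2 is dropped for everyone, no model gains or loses from a round that another model did not play.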
Methodology

Same report, same choices, real prices

Full methodology
  1. Same report

    Every model reads the same market report.

  2. Same choices

    Every model chooses from the same 70 assets.

  3. Portfolios freeze

    Each model saves one portfolio.

  4. We wait

    The round runs for 7 days or 1 month.

  5. Prices score it

    Real market prices show which model did best.

No after-the-fact edits. Portfolios are frozen first. Results appear only after the scoring window ends.
AI positioning

What AI Models Are Allocating To Now

Live rounds only. This is the current market condition as expressed by frozen model portfolios, before final scores are known.

Historical risk trend
AI risk appetite: 69.6/100 (Risk-seeking / Selective risk taking)
Consensus allocation: Healthcare Sector (XLV), 29.0% average live weight
Risk shift: -8.7 (change vs Jun 25 portfolios)
Model agreement: Mixed (8.3 point dispersion)
Model behavior patterns

See Each AI Model's Allocation Personality

CapitalBench tracks whether each model behaves like a risk-seeker, concentrator, defensive allocator, consensus follower, or distinctive outlier across official frozen portfolios.

215 saved portfolios · 98 resolved results · Peer overlap, concentration, turnover, and risk appetite
Current allocation signal

AI Risk Appetite

Latest weekly and monthly portfolios, equal-weighted by track. This measures current model positioning, not market returns or a trading recommendation.

Historical trend and methodology
Combined pulse: 69.6/100 (Risk-seeking)
Current regime: Selective risk taking (CB-2026-06-26-1W and CB-2026-06-26-1M)
Weekly tactical: 67.8/100 (Risk-seeking)
Monthly strategic: 71.3/100 (Risk-seeking)
Change: -8.7 vs Jun 25 portfolios
Model agreement: Mixed (8.3 point dispersion)
Largest current allocations
Healthcare Sector (XLV) 29.0% · Biotechnology (XBI) 21.0% · Regional Banks (KRE) 16.0% · US Low Volatility Equities (SPLV) 8.0% · US Small-Cap Value (IWN) 8.0% · Equal-Weight S&P 500 (RSP) 5.5%
Regime mix
Defensive equity 37.0% · Broad and cyclical equity 36.0% · Growth and technology 21.0% · Rates and credit 2.5% · Real assets and inflation 2.0%
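"Equal-weighted by track" can be checked against figures published elsewhere on this page. The sketch below assumes the combined value is the simple mean of the weekly and monthly track averages, with an asset that is absent from a track counted as 0%. The helper name `combine_tracks` is illustrative. The monthly SPLV value of 3.0 is not listed directly; it is derived from the monthly portfolios on this page (15% in one of five portfolios).

```python
def combine_tracks(weekly: dict[str, float], monthly: dict[str, float]) -> dict[str, float]:
    """Average two tracks with equal weight; missing assets count as 0%."""
    assets = set(weekly) | set(monthly)
    return {a: (weekly.get(a, 0.0) + monthly.get(a, 0.0)) / 2 for a in assets}


# Per-track average weights (%) for the newest weekly and monthly rounds.
weekly_avg = {"XLV": 30.0, "XBI": 21.0, "KRE": 19.0, "SPLV": 13.0}
monthly_avg = {"XLV": 28.0, "XBI": 21.0, "KRE": 13.0, "RSP": 11.0, "SPLV": 3.0}

combined = combine_tracks(weekly_avg, monthly_avg)
for ticker in ("XLV", "XBI", "KRE", "SPLV", "RSP"):
    print(ticker, combined[ticker])
# XLV 29.0, XBI 21.0, KRE 16.0, SPLV 8.0, RSP 5.5

# Same rule applied to the risk scores: about 69.55, shown on the page as 69.6.
combined_pulse = (67.8 + 71.3) / 2
```

Under this assumption the results line up with the "Largest current allocations" figures above.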
Live AI positioning

Semiconductors (SMH) is the largest live allocation.

Semiconductors (SMH) holds 19.2% of live allocation across all open portfolios, and US Equity accounts for 56.1%.

Largest asset: Semiconductors (SMH), 19.2%
Lead category: US Equity, 56.1%
Live rounds: 23 (All Open)
Portfolios: 117 · 34 assets held
Top allocations: assets with the largest live model allocation (34 assets)
24 smaller live allocations: 32.1%
Live tests

Live Portfolio Returns

Live rounds marked to the latest available close. These are not final scores.

Priced live rounds: 19 of 23
Latest close: Jun 25
Next final score: Jun 29

Claude Fable 5 (Anthropic, 2 open): Portfolio +1.07% · S&P 500 -0.19% · Portfolio Minus S&P 500 +1.25%
Claude Opus 4.8 (Anthropic, 19 open): Portfolio -0.33% · S&P 500 -1.14% · Portfolio Minus S&P 500 +0.81%
Grok 4.3 (xAI, 19 open): Portfolio -0.46% · S&P 500 -1.14% · Portfolio Minus S&P 500 +0.69%
Claude Opus 4.7 (Anthropic, 19 open): Portfolio -0.46% · S&P 500 -1.14% · Portfolio Minus S&P 500 +0.68%
GPT-5.5 (OpenAI, 19 open): Portfolio -1.01% · S&P 500 -1.14% · Portfolio Minus S&P 500 +0.14%
Gemini 3.1 Pro (Google, 19 open): Portfolio -1.21% · S&P 500 -1.14% · Portfolio Minus S&P 500 -0.07%
S&P 500 (19 open tests): S&P 500 return -1.14% · Close Jun 25

Interim returns use live rounds only. Completed rounds move to official scored results.

Marked to market from saved entry prices. Official results wait for the scheduled ending close.
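Marking a frozen portfolio to market can be sketched as a weighted sum of each asset's price change since entry, with the benchmark return subtracted to get "Portfolio Minus S&P 500". This is an illustration under that assumption, not CapitalBench's pricing code: the tickers `AAA` and `BBB`, all prices, and the function name are hypothetical. Only the -1.14% benchmark figure comes from the page.

```python
def mark_to_market_return(
    weights: dict[str, float],
    entry_prices: dict[str, float],
    latest_closes: dict[str, float],
) -> float:
    """Interim portfolio return in percent, from saved entry prices to latest closes."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return 100.0 * sum(
        w * (latest_closes[t] / entry_prices[t] - 1.0) for t, w in weights.items()
    )


# Hypothetical two-asset portfolio and prices.
weights = {"AAA": 0.6, "BBB": 0.4}
entry_prices = {"AAA": 100.0, "BBB": 50.0}
latest_closes = {"AAA": 102.0, "BBB": 49.0}

portfolio_pct = mark_to_market_return(weights, entry_prices, latest_closes)
benchmark_pct = -1.14  # S&P 500 over the same open rounds, as shown above
excess_pct = portfolio_pct - benchmark_pct

print(round(portfolio_pct, 2))  # 0.4
print(round(excess_pct, 2))     # 1.54
```

The published excess figures can differ from a naive subtraction of the displayed values by 0.01, which is consistent with rounding after the subtraction rather than before.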
Latest official results

Finished Benchmark Results

Switch between weekly and monthly results, then move backward or forward through completed rounds in that track.

All benchmark results
Weekly official results

Weekly results, newest official score first

Weekly result 1 of 16
Weekly official result

Weekly result scored Jun 25

Frozen model portfolios scored after the one-week window. Live rounds stay out until final prices are available.

Scored

Claude Opus 4.8 (Anthropic): -2.17%
Gemini 3.1 Pro (Google): -2.66%
GPT-5.5 (OpenAI): -2.72%
Claude Opus 4.7 (Anthropic): -2.84%
Grok 4.3 (xAI): -3.27%
S&P 500 (Benchmark): -1.67%
Max possible (XBI): 7.83%
Portfolio context

Shows each model's saved portfolio weights.

Model portfolios

Ranked in the same order as the chart.

1. Claude Opus 4.8 (Anthropic): SPY 40% · SMH 25% · QQQ 20% · BIL 15%
2. Gemini 3.1 Pro (Google): SMH 30% · QQQ 30% · SPY 40%
3. GPT-5.5 (OpenAI): SMH 35% · EWY 25% · EWT 15% · MTUM 15% · XBI 10%
4. Claude Opus 4.7 (Anthropic): SMH 35% · QQQ 25% · EWT 15% · MTUM 15% · SPY 10%
5. Grok 4.3 (xAI): SMH 40% · MTUM 30% · EWY 30%
Reference points

Not model portfolios.

S&P 500 Benchmark

Benchmark return over the same scoring window

Max possible XBI

100% Biotechnology (XBI) hindsight ceiling
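The hindsight ceiling is the return of the single best eligible asset in the round, which is why no long-only portfolio of the same assets can beat it. A small sketch, assuming weights are non-negative and sum to 1: XBI's 7.83% is the published ceiling for this round, while `ASSET_B`, `ASSET_C`, their returns, and the example portfolio are made up for illustration.

```python
def hindsight_ceiling(asset_returns: dict[str, float]) -> tuple[str, float]:
    """Best-performing asset and its return (%) over the scoring window."""
    best = max(asset_returns, key=asset_returns.get)
    return best, asset_returns[best]


def portfolio_return(weights: dict[str, float], asset_returns: dict[str, float]) -> float:
    """Weighted return (%) of a portfolio whose weights sum to 1."""
    return sum(w * asset_returns[t] for t, w in weights.items())


# XBI is from the scored round above; the other two entries are hypothetical.
asset_returns = {"XBI": 7.83, "ASSET_B": -2.0, "ASSET_C": 0.5}

best_asset, ceiling = hindsight_ceiling(asset_returns)
print(best_asset, ceiling)  # XBI 7.83

# A weighted average can never exceed its largest input.
example = portfolio_return({"XBI": 0.1, "ASSET_B": 0.9}, asset_returns)
print(example <= ceiling)  # True
```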

Official scored round

Weekly result scored Jun 25

Audit ID: CB-2026-06-18-1W

Scored: Jun 25 · Window: Jun 18 to Jun 25 · Models: 5 · Asset choices: 70 · Leader: Claude Opus 4.8 · Horizon: Weekly
Benchmark universe

What Models Allocate From

Models get the same report, choose from the same assets, and wait for weekly or monthly scoring.

Models 6
Asset choices 70
Round lengths 2
Live rounds 23
Historical model style

Historical Risk Style By Model

Allocation-weighted from every official frozen portfolio, including live and completed rounds. It does not use future returns and is separate from the current AI Risk Appetite signal.

215 saved portfolios
Model portfolios

Current Frozen Model Portfolios

These are the saved model portfolios for the newest weekly and monthly rounds. They are waiting for final prices.

Weekly model portfolios

CB-2026-06-26-1W

2026-06-26 to 2026-07-02

Waiting for result
Anthropic Claude Opus 4.7
Healthcare Sector (XLV) 35% Regional Banks (KRE) 20% US Low Volatility Equities (SPLV) 20% Biotechnology (XBI) 15% Long-Term US Treasury Bonds (TLT) 10%
Anthropic Claude Opus 4.8
Healthcare Sector (XLV) 30% Regional Banks (KRE) 20% US Low Volatility Equities (SPLV) 20% US Mid-Cap Stocks (IJH) 15% Short-Term Treasury Bills (BIL) 15%
Google Gemini 3.1 Pro
Healthcare Sector (XLV) 40% Biotechnology (XBI) 30% Regional Banks (KRE) 30%
OpenAI GPT-5.5
Biotechnology (XBI) 35% Regional Banks (KRE) 25% US Small-Cap Value (IWN) 20% Healthcare Sector (XLV) 10% Crude Oil (USO) 10%
xAI Grok 4.3
Healthcare Sector (XLV) 35% Biotechnology (XBI) 25% US Low Volatility Equities (SPLV) 25% US Small-Cap Value (IWN) 15%
Shared top pick: Healthcare Sector (XLV). Figures below are averages across 5 frozen model portfolios.
Top 3: 70% · Spread: 5.1 assets
Healthcare Sector (XLV) 30%
Biotechnology (XBI) 21%
Regional Banks (KRE) 19%
US Low Volatility Equities (SPLV) 13%
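The weekly consensus figures can be recomputed from the five frozen portfolios listed above. The averaging is a plain mean across portfolios. The page does not define "Spread", so treating it as the effective number of assets (the inverse of the sum of squared average weights) is an assumption; it is used here only because it reproduces the displayed 5.1.

```python
from collections import defaultdict

# Weekly round CB-2026-06-26-1W, weights in percent, copied from the list above.
portfolios = {
    "Claude Opus 4.7": {"XLV": 35, "KRE": 20, "SPLV": 20, "XBI": 15, "TLT": 10},
    "Claude Opus 4.8": {"XLV": 30, "KRE": 20, "SPLV": 20, "IJH": 15, "BIL": 15},
    "Gemini 3.1 Pro": {"XLV": 40, "XBI": 30, "KRE": 30},
    "GPT-5.5": {"XBI": 35, "KRE": 25, "IWN": 20, "XLV": 10, "USO": 10},
    "Grok 4.3": {"XLV": 35, "XBI": 25, "SPLV": 25, "IWN": 15},
}


def average_weights(portfolios: dict[str, dict[str, float]]) -> dict[str, float]:
    """Mean weight per asset across portfolios; an unheld asset counts as 0%."""
    totals: dict[str, float] = defaultdict(float)
    for weights in portfolios.values():
        for ticker, pct in weights.items():
            totals[ticker] += pct
    return {ticker: total / len(portfolios) for ticker, total in totals.items()}


avg = average_weights(portfolios)
ranked = sorted(avg.items(), key=lambda kv: kv[1], reverse=True)
top3_share = sum(pct for _, pct in ranked[:3])

# Assumed definition of "Spread": inverse Herfindahl index of the average weights.
effective_assets = 1.0 / sum((pct / 100.0) ** 2 for pct in avg.values())

print(ranked[:4])  # [('XLV', 30.0), ('XBI', 21.0), ('KRE', 19.0), ('SPLV', 13.0)]
print(top3_share)  # 70.0
print(round(effective_assets, 1))  # 5.1
```

Running the same calculation on the monthly portfolios gives a top-3 share of 62% and about 6.1 effective assets, matching the monthly figures under the same assumption.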
Monthly model portfolios

CB-2026-06-26-1M

2026-06-26 to 2026-07-24

Waiting for result
Anthropic Claude Opus 4.7
Healthcare Sector (XLV) 30% Regional Banks (KRE) 20% Biotechnology (XBI) 15% Equal-Weight S&P 500 (RSP) 20% Long-Term US Treasury Bonds (TLT) 15%
Anthropic Claude Opus 4.8
Healthcare Sector (XLV) 30% Financials Sector (XLF) 20% Industrials Sector (XLI) 15% Equal-Weight S&P 500 (RSP) 20% US Low Volatility Equities (SPLV) 15%
Google Gemini 3.1 Pro
Healthcare Sector (XLV) 40% Biotechnology (XBI) 30% US Small-Cap Stocks (IWM) 15% Equal-Weight S&P 500 (RSP) 15%
OpenAI GPT-5.5
Biotechnology (XBI) 30% Regional Banks (KRE) 25% US Small-Cap Value (IWN) 20% Healthcare Sector (XLV) 15% Energy Sector (XLE) 10%
xAI Grok 4.3
Biotechnology (XBI) 30% Healthcare Sector (XLV) 25% US Small-Cap Value (IWN) 25% Regional Banks (KRE) 20%
Shared top pick: Healthcare Sector (XLV). Figures below are averages across 5 frozen model portfolios.
Top 3: 62% · Spread: 6.1 assets
Healthcare Sector (XLV) 28%
Biotechnology (XBI) 21%
Regional Banks (KRE) 13%
Equal-Weight S&P 500 (RSP) 11%
1 Same report

Every model gets the same market report.

2 Same choices

Every model allocates from the same asset list.

3 Frozen portfolios

Model portfolios are locked before results are known.

4 Real prices score

After 7 days or 1 month, real prices decide the result.

Results

Weekly And Monthly Are Separate

A 7-day round and a 1-month round are different contests. They get separate scores and separate overall results.

  • Weekly: 7-day round
  • Monthly: 1-month round
  • No mixing: scores stay separate
Weekly track

Weekly Results

16 completed / 5 live
Current benchmark leader: Claude Opus 4.8 · -12.1 score · 14 shared rounds
Latest scored: CB-2026-06-18-1W · Live round: CB-2026-06-26-1W · Next score: after Jul 2 close
  1. Locked
  2. Live
  3. Scores
Monthly track

Monthly Results

4 completed / 18 live
Current benchmark leader: Grok 4.3 · 11.4 score · 4 shared rounds
Latest scored: CB-2026-05-28-1M · Live round: CB-2026-06-26-1M · Next score: after Jul 24 close
  1. Locked
  2. Live
  3. Scores
Live benchmark tests

Live Benchmark Tests

These are the open tests you can inspect now. Models already submitted portfolios; official scores wait for final closing prices.

Weekly test

One-week test

Live now

Short-term test of AI positioning over one market week.

Portfolios locked; scoring pending.

Model portfolios: 5 · Eligible assets: 70 · Risk-taking score: 67.8/100 · Top consensus: Healthcare Sector (XLV), 30% average weight
Monthly test

One-month test

Live now

Longer test of AI allocation over one month.

Portfolios locked; scoring pending.

Model portfolios: 5 · Eligible assets: 70 · Risk-taking score: 71.3/100 · Top consensus: Healthcare Sector (XLV), 28% average weight

Internal IDs and full reproducibility files are inside each audit packet.

Scoring calendar

Current Scoring Calendar

Models have already picked portfolios. Official scores publish only after the market window ends and final closing prices are available.

43 official rounds recorded. Internal round and run IDs stay in the public audit trail.
View audit trail
Audit packet

Check The Public Audit Trail

Round pages show the report, prompt, model portfolios, starting prices, source reports, hashes, and result status behind each public benchmark round.
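One way a hash can make a frozen portfolio tamper-evident is sketched below. The page does not say which algorithm or payload format the audit hashes use, so everything here is an assumption: SHA-256 over a canonical JSON encoding, the field names, and the function `portfolio_hash` are all hypothetical. The example portfolio is Gemini 3.1 Pro's weekly entry from this page.

```python
import hashlib
import json


def portfolio_hash(round_id: str, model: str, weights: dict[str, float]) -> str:
    """SHA-256 of a canonical JSON payload (sorted keys, no extra whitespace)."""
    payload = json.dumps(
        {"round_id": round_id, "model": model, "weights": weights},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


frozen = portfolio_hash(
    "CB-2026-06-26-1W", "Gemini 3.1 Pro", {"XLV": 40, "XBI": 30, "KRE": 30}
)
# Any after-the-fact change to a weight produces a different digest.
edited = portfolio_hash(
    "CB-2026-06-26-1W", "Gemini 3.1 Pro", {"XLV": 45, "XBI": 25, "KRE": 30}
)

print(frozen != edited)  # True
```

Sorting keys before hashing means the digest depends on the portfolio's content rather than on the order in which the weights happened to be written.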

Why it is fair

Simple Rules, Public Audit Trail

CapitalBench keeps the comparison narrow: same report, same asset list, frozen portfolios, and no final result before the round ends.

Same rules

One frozen portfolio per model

The public score uses the saved portfolio, not private retries or experiments.

Same choices

70 current assets

Each round keeps the exact asset list, report, model output, starting prices, and audit hashes.

No early winner

23 live rounds waiting for results

Final results appear only after ending prices are available.