CapitalBench
The live benchmark for AI capital allocation
We give frontier AI models the same market brief, freeze their portfolios, and score them against real market returns.
See what AI is buying, what it avoids, and which models actually perform.
Research benchmark. Not financial advice.
Models are taking risk, but it is concentrated in healthcare, biotechnology, and regional banks rather than spread broadly across allocations.
Current read: Live AI portfolios are concentrated in Healthcare Sector (XLV).
Risk-Taking Score: 69.6/100. Higher means more allocation to growth, momentum, and cyclical assets. Lower means more cash, bonds, and defensive sectors.
Change vs Jun 25 portfolios: -8.7 pts. Jun 26 portfolios scored 69.6/100 for risk-taking, down from 78.3/100 on Jun 25.
Source: Latest weekly + monthly live portfolios, 117 model portfolios across 23 live rounds.
Healthcare Sector (XLV) 29.0% · Biotechnology (XBI) 21.0% · Regional Banks (KRE) 16.0% · US Low Volatility Equities (SPLV) 8.0%
Allocation signal only. This does not use returns and is not a trading recommendation.
Benchmark results: Model Performance
Current benchmarks rank models only inside equal-run comparison sets.
All comparison sets Weekly Current · 14 shared rounds Monthly Current · 4 shared rounds
Current Weekly Benchmark Shared resolved rounds
CapitalBench Score. Max possible = best eligible asset in each included round. Every ranked model has the same included rounds.
Claude Opus 4.8 Anthropic · 14/14 scored rounds -12.1
Grok 4.3 xAI · 14/14 scored rounds -15.9
Claude Opus 4.7 Anthropic · 14/14 scored rounds -15.9
GPT-5.5 OpenAI · 14/14 scored rounds -25.8
Gemini 3.1 Pro Google · 14/14 scored rounds -30.6
S&P 500 · 14/14 scored rounds -8.5
Max possible · Hindsight best-performing eligible asset in each round, not a model portfolio 100.0
14 shared resolved rounds · 5 equal-run models ranked · Qualified at 6+ shared rounds · Newest included round: CB-2026-06-18-1W
Return context: Average Return. Average portfolio return across the same finished rounds.
Return leader: Claude Opus 4.8 -0.96%
Claude Opus 4.8 -0.96%
Grok 4.3 -1.25%
Claude Opus 4.7 -1.26%
GPT-5.5 -2.03%
Gemini 3.1 Pro -2.41%
Leader audit: Claude Opus 4.8 -12.1 = -13.39% total return / 110.34% oracle return × 100.
Rounds included: CB-2026-05-28-1W, CB-2026-05-29-1W, CB-2026-06-01-1W, CB-2026-06-02-1W, CB-2026-06-03-1W, CB-2026-06-05-1W, CB-2026-06-08-1W, CB-2026-06-09-1W, CB-2026-06-12-1W, CB-2026-06-13-1W, CB-2026-06-15-1W, CB-2026-06-16-1W, CB-2026-06-17-1W, CB-2026-06-18-1W
Fairness rule: every ranked model completed every included round. A missed round is excluded from this set for everyone.
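The leader audit suggests that a CapitalBench Score is the sum of a portfolio's per-round returns divided by the sum of the hindsight-best ("oracle") returns over the same rounds, times 100. The Python sketch below reproduces that arithmetic using the Claude Opus 4.8, S&P 500, and max-possible returns listed for the 14 included weekly rounds in the official results on this page. The function name `capitalbench_score` is hypothetical, and summing simple per-round returns (rather than compounding them) is an assumption inferred from the audit line, not a documented implementation.

```python
def capitalbench_score(model_returns, oracle_returns):
    """Oracle-relative score: total model return / total oracle return x 100.

    Both arguments are per-round percentage returns over the same rounds.
    Summing (not compounding) is an assumption inferred from the audit line.
    """
    if len(model_returns) != len(oracle_returns):
        raise ValueError("model and oracle must cover the same rounds")
    total_oracle = sum(oracle_returns)
    if total_oracle == 0:
        raise ValueError("oracle return is zero; score is undefined")
    return sum(model_returns) / total_oracle * 100


# Per-round returns (%) for the 14 shared weekly rounds, oldest first
# (CB-2026-05-28-1W through CB-2026-06-18-1W), copied from this page.
OPUS_4_8 = [3.58, -4.57, -4.38, -3.27, -7.18, 0.85, 0.16,
            0.33, -1.24, 4.54, 2.12, -1.20, -0.96, -2.17]
SP500 = [0.33, -2.50, -2.50, -2.55, -2.96, 0.57, 2.11,
         1.80, 1.22, 0.67, -1.13, -1.98, -0.79, -1.67]
ORACLE = [4.62, 3.04, 3.04, 3.25, 5.58, 12.71, 13.90,
          11.88, 10.18, 11.02, 7.04, 8.74, 7.51, 7.83]

opus_score = capitalbench_score(OPUS_4_8, ORACLE)
sp500_score = capitalbench_score(SP500, ORACLE)
print(f"Claude Opus 4.8: {opus_score:.1f}")  # -12.1, as in the leader audit
print(f"S&P 500: {sp500_score:.1f}")         # -8.5, as in the benchmark list
```

Because the denominator is a hindsight ceiling, a score of 100 would require holding the single best asset in every round, and a negative score means the summed return was below zero.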
Current Monthly Benchmark Shared resolved rounds
CapitalBench Score. Max possible = best eligible asset in each included round. Every ranked model has the same included rounds.
Grok 4.3 xAI · 4/4 scored rounds 11.4
Claude Opus 4.7 Anthropic · 4/4 scored rounds 3.3
Gemini 3.1 Pro Google · 4/4 scored rounds 0.4
GPT-5.5 OpenAI · 4/4 scored rounds -4.9
S&P 500 · 4/4 scored rounds -12.0
Max possible · Hindsight best-performing eligible asset in each round, not a model portfolio 100.0
4 shared resolved rounds · 4 equal-run models ranked · Qualified at 3+ shared rounds · Newest included round: CB-2026-05-28-1M
Return context: Average Return. Average portfolio return across the same finished rounds.
Return leader: Grok 4.3 1.41%
Grok 4.3 1.41%
Claude Opus 4.7 0.41%
Gemini 3.1 Pro 0.05%
GPT-5.5 -0.61%
Leader audit: Grok 4.3 11.4 = 5.66% total return / 49.86% oracle return × 100.
Rounds included: CB-2026-05-10-1M, CB-2026-05-17-1M, CB-2026-05-24-1M, CB-2026-05-28-1M
Fairness rule: every ranked model completed every included round. A missed round is excluded from this set for everyone.
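The fairness rule can be read as a subset test: a round counts only if every model in the ranked roster has a scored result for it. The Python sketch below illustrates that reading with three weekly rounds from the official results on this page. The function name `shared_rounds` and the dictionary layout are hypothetical, and how CapitalBench chooses the roster itself is not covered here.

```python
def shared_rounds(results_by_round, roster):
    """Return the round IDs in which every roster model has a scored result."""
    required = set(roster)
    return sorted(
        round_id
        for round_id, models in results_by_round.items()
        if required <= set(models)
    )


# Models scored in three weekly rounds shown on this page.
results_by_round = {
    "CB-2026-05-27-1W": ["GPT-5.5", "Gemini 3.1 Pro", "Grok 4.3",
                         "Claude Opus 4.7"],
    "CB-2026-05-28-1W": ["Gemini 3.1 Pro", "Claude Opus 4.8",
                         "Claude Opus 4.7", "Grok 4.3", "GPT-5.5"],
    "CB-2026-06-13-1W": ["GPT-5.5", "Claude Opus 4.8", "Claude Fable 5",
                         "Claude Opus 4.7", "Grok 4.3", "Gemini 3.1 Pro"],
}
roster = ["Claude Opus 4.8", "Grok 4.3", "Claude Opus 4.7",
          "GPT-5.5", "Gemini 3.1 Pro"]

included = shared_rounds(results_by_round, roster)
print(included)  # ['CB-2026-05-28-1W', 'CB-2026-06-13-1W']
```

Round CB-2026-05-27-1W drops out because Claude Opus 4.8 has no result in it, which matches its absence from the weekly "Rounds included" list.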
Evidence context: How Much Evidence Is Behind These Scores?
Generated from completed rounds, benchmark-set rules, protocols, and available baselines so every score carries its own context.
Weekly benchmark: More established
Evidence level: More established. Weekly evidence has enough completed rounds for stronger pattern reads, while still needing ongoing live validation.
Weekly evidence: 16 resolved rounds / 81 model results. Current threshold met at 6+ rounds.
Equal-run comparison: 5 models on the same 14 rounds. Ranked models are compared only on rounds every model in the roster completed.
Protocol: Portfolio-only. Completed rounds use constrained multi-asset portfolios.
Score scale: Oracle-relative. 100 means matching the hindsight best asset in the same scored window.
Baselines shown: S&P 500, Cash, Oracle, AI consensus portfolio. Practical references are shown beside the impossible hindsight ceiling when available.
Use this as benchmark evidence, not an investable strategy result. More resolved rounds are needed before making strong performance claims.
Monthly benchmark: Qualified but still forming
Evidence level: Qualified but still forming. Monthly evidence has crossed the current benchmark threshold, but the sample is still early for strong performance claims.
Monthly evidence: 4 resolved rounds / 17 model results. Current threshold met at 3+ rounds.
Equal-run comparison: 4 models on the same 4 rounds. Ranked models are compared only on rounds every model in the roster completed.
Protocol: Mixed protocol. Completed history includes 3 portfolio, 1 single-pick, and 0 unlabelled rounds.
Score scale: Oracle-relative. 100 means matching the hindsight best asset in the same scored window.
Baselines shown: S&P 500, Cash, Oracle, AI consensus portfolio. Practical references are shown beside the impossible hindsight ceiling when available.
Use this as benchmark evidence, not an investable strategy result. More resolved rounds are needed before making strong performance claims.
1. Same report: Every model reads the same market report.
2. Same choices: Every model chooses from the same 70 assets.
3. Portfolios freeze: Each model saves one portfolio.
4. We wait: The round runs for 7 days or 1 month.
5. Prices score it: Real market prices show which model did best.
No after-the-fact edits. Portfolios are frozen first. Results appear only after the scoring window ends.
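The steps above do not spell out how a frozen portfolio becomes a single return. One common convention, assumed here rather than confirmed by the source, is the weight-averaged return of the held assets over the scoring window. The Python sketch below applies that convention to the weights Gemini 3.1 Pro saved in round CB-2026-06-18-1W. The per-asset returns are invented placeholders, not market data, and `portfolio_return` is a hypothetical name.

```python
def portfolio_return(weights, asset_returns):
    """Weighted average return (%) of a frozen portfolio.

    weights: ticker -> weight in percent (must total 100).
    asset_returns: ticker -> return in percent over the scoring window.
    Ignores rebalancing, fees, and dividends; illustration only.
    """
    total_weight = sum(weights.values())
    if abs(total_weight - 100.0) > 1e-9:
        raise ValueError(f"weights sum to {total_weight}, expected 100")
    return sum(w * asset_returns[t] for t, w in weights.items()) / 100.0


# Frozen weights as listed on this page (Gemini 3.1 Pro, CB-2026-06-18-1W).
weights = {"SMH": 30, "QQQ": 30, "SPY": 40}
# Placeholder returns, invented for this example only.
asset_returns = {"SMH": -4.0, "QQQ": -2.0, "SPY": -1.0}

result = portfolio_return(weights, asset_returns)
print(f"{result:.2f}%")  # -2.20% with these placeholder inputs
```

Since the weights are fixed before the window opens, the only inputs that change after the freeze are the asset returns.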
AI positioning What AI Models Are Allocating To Now
Live rounds only. This is the current market condition as expressed by frozen model portfolios, before final scores are known.
AI risk appetite: 69.6/100, Risk-seeking / Selective risk taking
Consensus allocation: Healthcare Sector (XLV), 29.0% average live weight
Risk shift: -8.7, change vs Jun 25 portfolios
Model agreement: Mixed, 8.3 point dispersion
Benchmark insights: What The Latest AI Decisions Suggest
A compact readout from the insight engine, focused on current positioning, live marks, model agreement, and latest official results.
Current Positioning · As of Jun 26 · Latest live portfolios · 2 live rounds · 10 models
Across the newest live weekly and monthly portfolios, Healthcare Sector (XLV) is the largest aggregate allocation at +29.00%.
Aggregate allocation averages the newest live model portfolios before final scores are known.
High confidence Math: deterministic Data through Jun 26, 2026
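The aggregate allocation is described as an average over the newest live model portfolios. A rough Python sketch of that averaging follows. The three portfolios are invented for illustration (the page does not list the live portfolios), `aggregate_allocation` is a hypothetical name, and a simple equal-weight mean across portfolios is an assumption.

```python
from collections import defaultdict


def aggregate_allocation(portfolios):
    """Average each asset's weight (%) across portfolios.

    An asset missing from a portfolio counts as a 0% weight there.
    """
    if not portfolios:
        raise ValueError("need at least one portfolio")
    totals = defaultdict(float)
    for weights in portfolios:
        for ticker, weight in weights.items():
            totals[ticker] += weight
    averaged = {t: total / len(portfolios) for t, total in totals.items()}
    # Largest aggregate allocation first.
    return dict(sorted(averaged.items(), key=lambda item: item[1], reverse=True))


# Invented portfolios, for illustration only.
portfolios = [
    {"XLV": 40, "XBI": 30, "KRE": 30},
    {"XLV": 30, "XBI": 20, "SPLV": 50},
    {"XLV": 50, "KRE": 30, "XBI": 20},
]
allocation = aggregate_allocation(portfolios)
top_ticker, top_weight = next(iter(allocation.items()))
print(top_ticker, f"{top_weight:.1f}%")  # XLV 40.0%
```

If each input portfolio sums to 100%, the averaged weights also sum to 100%, so the result can be read as a single consensus portfolio.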
Aggregate Live Allocation: +29.0%
Latest live portfolios · 2 live rounds · 10 models
The newest live portfolios have a deterministic risk-taking score of 69.6 out of 100.
Risk-taking score is allocation-based, not performance-based: higher means more weight in growth, momentum, cyclical, and higher-risk assets.
High confidence Math: deterministic Data through Jun 26, 2026
Live Risk Taking Score: 69.6/100
Horizon Agreement · As of Jun 26 · Latest live portfolios · 2 live rounds · 10 models
The newest weekly portfolios lean toward defensive equity, while the newest monthly portfolios lean toward broad and cyclical equity.
Horizon agreement compares the newest weekly and monthly live portfolios to see whether short- and longer-window model stances line up.
High confidence Math: deterministic Data through Jun 26, 2026
Weekly Top Regime Allocation: +43.0%
Monthly Top Regime Allocation: +43.0%
Current allocation signal: AI Risk Appetite
Latest weekly and monthly portfolios, equal-weighted by track. This measures current model positioning, not market returns or a trading recommendation.
Combined pulse: 69.6/100, Risk-seeking
Current regime: Selective risk taking (CB-2026-06-26-1W and CB-2026-06-26-1M)
Scale: 0 Defensive, 50 Balanced, 100 Aggressive
Weekly tactical: 67.8/100, Risk-seeking
Monthly strategic: 71.3/100, Risk-seeking
Change: -8.7 vs Jun 25 portfolios
Model agreement: Mixed, 8.3 point dispersion
Largest current allocations: Healthcare Sector (XLV) 29.0%, Biotechnology (XBI) 21.0%, Regional Banks (KRE) 16.0%, US Low Volatility Equities (SPLV) 8.0%, US Small-Cap Value (IWN) 8.0%, Equal-Weight S&P 500 (RSP) 5.5%
Regime mix Defensive equity 37.0% Broad and cyclical equity 36.0% Growth and technology 21.0% Rates and credit 2.5% Real assets and inflation 2.0%
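The combined pulse is described as equal-weighted by track, which is consistent with a plain average of the weekly and monthly risk-taking scores. The sketch below reproduces that arithmetic from the figures shown on this page. `combined_pulse` is a hypothetical name, and because the inputs are the rounded track scores as displayed, the result can differ from the published 69.6 in the last digit.

```python
def combined_pulse(weekly_score, monthly_score):
    """Equal-weight average of the weekly and monthly risk-taking scores."""
    return (weekly_score + monthly_score) / 2


weekly, monthly = 67.8, 71.3   # track scores as displayed on the page
published_combined = 69.6      # Jun 26 combined score as displayed
previous_combined = 78.3       # Jun 25 combined score as displayed

pulse = combined_pulse(weekly, monthly)
change = published_combined - previous_combined
print(f"combined pulse from rounded inputs: {pulse:.2f}")  # 69.55
print(f"change vs Jun 25: {change:+.1f} pts")              # -8.7 pts
```

The 0.05 gap between 69.55 and the published 69.6 is within what rounding of the two track scores would explain.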
Outstanding live book: 77.5/100, Risk-seeking
All 117 unresolved portfolios across 23 rounds. This slower exposure measure can differ from the newest weekly and monthly decisions.
Live AI positioning: Semiconductors (SMH) is the largest live allocation at 19.2% of open portfolio weight, while US Equity is the lead category at 56.1% of open portfolios.
Largest asset Semiconductors (SMH) 19.2%
Lead category US Equity 56.1%
Live rounds 23 All Open
Portfolios 117 34 assets held
Category mix
AI & Technology 30.6%
US Equity 56.1%
International Equity 6.8%
Commodities 3.0%
Bonds, Cash & FX 3.5%
Top allocations: assets with the largest live model allocation (34 assets)
Semiconductors (SMH) · AI & Technology · 19.2%
Healthcare Sector (XLV) · US Equity · 9.1%
Biotechnology (XBI) · US Equity · 6.8%
US Momentum Equities (MTUM) · US Equity · 6.7%
Technology Sector (XLK) · AI & Technology · 5.0%
S&P 500 (SPY) · US Equity · 5.0%
Regional Banks (KRE) · US Equity · 4.4%
US Small-Cap Stocks (IWM) · US Equity · 4.2%
Industrials Sector (XLI) · US Equity · 3.8%
South Korea Equities (EWY) · International Equity · 3.7%
24 smaller live allocations · 32.1%
AI & Technology Semiconductors (SMH)
19.2% of live model portfolios
Held by 73 of 117
Weekly share 14.0%
Monthly share 20.7%
Latest official results Finished Benchmark Results
Weekly 16 completed rounds · Monthly 4 completed rounds
Weekly official results
Weekly results, newest official score first
Weekly official result Weekly result scored Jun 25 Frozen model portfolios scored after the one-week window. Live rounds stay out until final prices are available.
Claude Opus 4.8 Anthropic -2.17%
Gemini 3.1 Pro Google -2.66%
GPT-5.5 OpenAI -2.72%
Claude Opus 4.7 Anthropic -2.84%
Grok 4.3 xAI -3.27%
S&P 500 Benchmark -1.67%
Max possible XBI 7.83%
Portfolio context Shows each model's saved portfolio weights.
Model portfolios Ranked in the same order as the chart.
1 Claude Opus 4.8 Anthropic
SPY 40% SMH 25% QQQ 20% BIL 15%
2 Gemini 3.1 Pro Google
SMH 30% QQQ 30% SPY 40%
3 GPT-5.5 OpenAI
SMH 35% EWY 25% EWT 15% MTUM 15% XBI 10%
4 Claude Opus 4.7 Anthropic
SMH 35% QQQ 25% EWT 15% MTUM 15% SPY 10%
5 Grok 4.3 xAI
SMH 40% MTUM 30% EWY 30%
Reference points Not model portfolios.
Benchmark return over the same scoring window
100% Biotechnology (XBI) hindsight ceiling
Official scored round Weekly result scored Jun 25 Audit ID: CB-2026-06-18-1W
Scored Jun 25 · Window Jun 18 to Jun 25 · Models 5 · Asset choices 70 · Leader Claude Opus 4.8 · Horizon Weekly
Weekly official result Weekly result scored Jun 24 Frozen model portfolios scored after the one-week window. Live rounds stay out until final prices are available.
Grok 4.3 xAI 0.59%
Claude Opus 4.8 Anthropic -0.96%
GPT-5.5 OpenAI -1.15%
Claude Opus 4.7 Anthropic -1.61%
Gemini 3.1 Pro Google -1.76%
S&P 500 Benchmark -0.79%
Max possible XBI 7.51%
Portfolio context Shows each model's saved portfolio weights.
Model portfolios Ranked in the same order as the chart.
1 Grok 4.3 xAI
SMH 30% MTUM 25% ITA 25% XBI 20%
2 Claude Opus 4.8 Anthropic
SMH 30% EWY 20% MTUM 20% EWT 15% XLI 15%
3 GPT-5.5 OpenAI
EWY 35% SMH 30% EWT 15% ITA 10% XBI 10%
4 Claude Opus 4.7 Anthropic
SMH 35% ITA 20% XLI 15% IAU 15% MTUM 15%
5 Gemini 3.1 Pro Google
EWY 40% SMH 30% MTUM 30%
Reference points Not model portfolios.
Benchmark return over the same scoring window
100% Biotechnology (XBI) hindsight ceiling
Official scored round Weekly result scored Jun 24 Audit ID: CB-2026-06-17-1W
Scored Jun 24 · Window Jun 17 to Jun 24 · Models 5 · Asset choices 70 · Leader Grok 4.3 · Horizon Weekly
Weekly official result Weekly result scored Jun 23 Frozen model portfolios scored after the one-week window. Live rounds stay out until final prices are available.
Grok 4.3 xAI -0.22%
Claude Opus 4.7 Anthropic -1.11%
Claude Opus 4.8 Anthropic -1.20%
Gemini 3.1 Pro Google -2.24%
GPT-5.5 OpenAI -3.49%
S&P 500 Benchmark -1.98%
Max possible XBI 8.74%
Portfolio context Shows each model's saved portfolio weights.
Model portfolios Ranked in the same order as the chart.
1 Grok 4.3 xAI
QQQ 40% SMH 35% MTUM 25%
2 Claude Opus 4.7 Anthropic
SMH 35% MTUM 25% ITA 15% EWY 15% IAU 10%
3 Claude Opus 4.8 Anthropic
SMH 30% EWY 20% MTUM 20% QQQ 15% XLF 15%
4 Gemini 3.1 Pro Google
SMH 40% EWY 30% QQQ 30%
5 GPT-5.5 OpenAI
EWY 30% SMH 25% MTUM 20% SLV 15% ITA 10%
Reference points Not model portfolios.
Benchmark return over the same scoring window
100% Biotechnology (XBI) hindsight ceiling
Official scored round Weekly result scored Jun 23 Audit ID: CB-2026-06-16-1W
Scored Jun 23 · Window Jun 16 to Jun 23 · Models 5 · Asset choices 70 · Leader Grok 4.3 · Horizon Weekly
Weekly official result Weekly result scored Jun 22 Frozen model portfolios scored after the one-week window. Live rounds stay out until final prices are available.
GPT-5.5 OpenAI 4.09%
Claude Opus 4.7 Anthropic 2.77%
Grok 4.3 xAI 2.28%
Gemini 3.1 Pro Google 2.20%
Claude Opus 4.8 Anthropic 2.12%
S&P 500 Benchmark -1.13%
Max possible XBI 7.04%
Portfolio context Shows each model's saved portfolio weights.
Model portfolios Ranked in the same order as the chart.
1 GPT-5.5 OpenAI
SMH 35% EWT 20% EWY 20% MTUM 15% XBI 10%
2 Claude Opus 4.7 Anthropic
SMH 35% QQQ 25% EWY 15% MTUM 15% XBI 10%
3 Grok 4.3 xAI
SMH 30% QQQ 25% MTUM 20% XLK 15% XBI 10%
4 Gemini 3.1 Pro Google
SMH 40% QQQ 30% EWY 20% MTUM 10%
5 Claude Opus 4.8 Anthropic
SMH 35% QQQ 25% MTUM 20% EWY 10% XLF 10%
Reference points Not model portfolios.
Benchmark return over the same scoring window
100% Biotechnology (XBI) hindsight ceiling
Official scored round Weekly result scored Jun 22 Audit ID: CB-2026-06-15-1W
Scored Jun 22 · Window Jun 15 to Jun 22 · Models 5 · Asset choices 70 · Leader GPT-5.5 · Horizon Weekly
Weekly official result Weekly result scored Jun 18 Frozen model portfolios scored after the one-week window. Live rounds stay out until final prices are available.
GPT-5.5 OpenAI 6.73%
Claude Opus 4.8 Anthropic 4.54%
Claude Fable 5 Anthropic 4.03%
Claude Opus 4.7 Anthropic 3.68%
Grok 4.3 xAI 3.67%
Gemini 3.1 Pro Google 1.09%
S&P 500 Benchmark 0.67%
Max possible EWY 11.02%
Portfolio context Shows each model's saved portfolio weights.
Model portfolios Ranked in the same order as the chart.
1 GPT-5.5 OpenAI
SMH 40% EWY 25% EWT 15% MTUM 10% IBIT 10%
2 Claude Opus 4.8 Anthropic
SMH 30% MTUM 25% IWM 20% EWT 15% QQQ 10%
3 Claude Fable 5 Anthropic
SMH 25% EWY 15% MTUM 20% IWN 20% IAU 20%
4 Claude Opus 4.7 Anthropic
SMH 35% IWM 20% MTUM 20% KRE 15% EWT 10%
5 Grok 4.3 xAI
MTUM 40% IWM 35% SMH 25%
6 Gemini 3.1 Pro Google
SPY 40% QQQ 30% BIL 30%
Reference points Not model portfolios.
Benchmark return over the same scoring window
100% South Korea Equities (EWY) hindsight ceiling
Official scored round Weekly result scored Jun 18 Audit ID: CB-2026-06-13-1W
Scored Jun 18 · Window Jun 12 to Jun 18 · Models 6 · Asset choices 70 · Leader GPT-5.5 · Horizon Weekly
Weekly official result Weekly result scored Jun 18 Frozen model portfolios scored after the one-week window. Live rounds stay out until final prices are available.
Claude Fable 5 Anthropic 2.60%
GPT-5.5 OpenAI 1.82%
Gemini 3.1 Pro Google 1.45%
Grok 4.3 xAI -1.16%
Claude Opus 4.8 Anthropic -1.24%
Claude Opus 4.7 Anthropic -1.32%
S&P 500 Benchmark 1.22%
Max possible EWY 10.18%
Portfolio context Shows each model's saved portfolio weights.
Model portfolios Ranked in the same order as the chart.
1 Claude Fable 5 Anthropic
QQQ 25% SMH 20% IWN 15% MTUM 20% XLE 20%
2 GPT-5.5 OpenAI
KRE 30% IWN 25% SMH 25% ITA 10% XLP 10%
3 Gemini 3.1 Pro Google
XLK 40% XLC 30% SPY 30%
4 Grok 4.3 xAI
XLP 40% SPLV 35% IWN 25%
5 Claude Opus 4.8 Anthropic
XLP 25% SPLV 25% XLV 20% ITA 15% BIL 15%
6 Claude Opus 4.7 Anthropic
XLP 30% XLV 25% ITA 20% KRE 15% BIL 10%
Reference points Not model portfolios.
Benchmark return over the same scoring window
100% South Korea Equities (EWY) hindsight ceiling
Official scored round Weekly result scored Jun 18 Audit ID: CB-2026-06-12-1W
Scored Jun 18 · Window Jun 11 to Jun 18 · Models 6 · Asset choices 70 · Leader Claude Fable 5 · Horizon Weekly
Weekly official result Weekly result scored Jun 16 Frozen model portfolios scored after the one-week window. Live rounds stay out until final prices are available.
Grok 4.3 xAI 0.79%
Gemini 3.1 Pro Google 0.46%
Claude Opus 4.8 Anthropic 0.33%
Claude Opus 4.7 Anthropic 0.23%
GPT-5.5 OpenAI 0.13%
Claude Fable 5 Anthropic -0.25%
S&P 500 Benchmark 1.80%
Max possible EWY 11.88%
Portfolio context Shows each model's saved portfolio weights.
Model portfolios Ranked in the same order as the chart.
1 Grok 4.3 xAI
XLV 30% XLP 25% SPLV 20% SCHD 15% XLF 10%
2 Gemini 3.1 Pro Google
XLV 40% SPLV 30% XLP 30%
3 Claude Opus 4.8 Anthropic
XLV 35% SPLV 25% XLP 20% SCHD 10% BIL 10%
4 Claude Opus 4.7 Anthropic
XLV 35% SPLV 25% XLP 15% BIL 15% XLRE 10%
5 GPT-5.5 OpenAI
XLV 50% SPLV 20% XLP 15% KRE 10% UUP 5%
6 Claude Fable 5 Anthropic
XLV 30% SPLV 20% XLP 20% XLE 15% BIL 15%
Reference points Not model portfolios.
Benchmark return over the same scoring window
100% South Korea Equities (EWY) hindsight ceiling
Official scored round Weekly result scored Jun 16 Audit ID: CB-2026-06-09-1W
Scored Jun 16 · Window Jun 9 to Jun 16 · Models 6 · Asset choices 70 · Leader Grok 4.3 · Horizon Weekly
Weekly official result Weekly result scored Jun 15 Frozen model portfolios scored after the one-week window. Live rounds stay out until final prices are available.
Claude Opus 4.7 Anthropic 1.77%
Claude Opus 4.8 Anthropic 0.16%
Grok 4.3 xAI -0.55%
GPT-5.5 OpenAI -0.60%
Gemini 3.1 Pro Google -1.36%
S&P 500 Benchmark 2.11%
Max possible EWY 13.90%
Portfolio context Shows each model's saved portfolio weights.
Model portfolios Ranked in the same order as the chart.
1 Claude Opus 4.7 Anthropic
XLV 35% SPLV 25% XLP 15% ITA 15% BIL 10%
2 Claude Opus 4.8 Anthropic
XLV 35% XLE 20% IWD 20% XLP 15% BIL 10%
3 Grok 4.3 xAI
XLE 35% XLV 30% XLF 35%
4 GPT-5.5 OpenAI
XLE 30% XLV 25% KRE 20% UUP 15% XLI 10%
5 Gemini 3.1 Pro Google
BIL 40% XLV 30% XLE 30%
Reference points Not model portfolios.
Benchmark return over the same scoring window
100% South Korea Equities (EWY) hindsight ceiling
Official scored round Weekly result scored Jun 15 Audit ID: CB-2026-06-08-1W
Scored Jun 15 · Window Jun 8 to Jun 15 · Models 5 · Asset choices 70 · Leader Claude Opus 4.7 · Horizon Weekly
Weekly official result Weekly result scored Jun 12 Frozen model portfolios scored after the one-week window. Live rounds stay out until final prices are available.
Claude Opus 4.7 Anthropic 0.90%
Claude Opus 4.8 Anthropic 0.85%
Grok 4.3 xAI 0.67%
Gemini 3.1 Pro Google 0.12%
GPT-5.5 OpenAI -1.87%
S&P 500 Benchmark 0.57%
Max possible EWY 12.71%
Portfolio context Shows each model's saved portfolio weights.
Model portfolios Ranked in the same order as the chart.
1 Claude Opus 4.7 Anthropic
XLV 30% SPLV 25% XLP 15% XLE 15% BIL 15%
2 Claude Opus 4.8 Anthropic
XLV 30% XLE 20% XLP 15% XLF 15% BIL 20%
3 Grok 4.3 xAI
XLE 40% XLF 30% XLV 30%
4 Gemini 3.1 Pro Google
BIL 40% XLV 30% XLE 30%
5 GPT-5.5 OpenAI
USO 35% XLE 25% UUP 15% XLV 15% SPLV 10%
Reference points Not model portfolios.
Benchmark return over the same scoring window
100% South Korea Equities (EWY) hindsight ceiling
Official scored round Weekly result scored Jun 12 Audit ID: CB-2026-06-05-1W
Scored Jun 12 · Window Jun 5 to Jun 12 · Models 5 · Asset choices 70 · Leader Claude Opus 4.7 · Horizon Weekly
Weekly official result Weekly result scored Jun 9 Frozen model portfolios scored after the one-week window. Live rounds stay out until final prices are available.
Claude Opus 4.8 Anthropic -7.18%
Claude Opus 4.7 Anthropic -7.61%
Grok 4.3 xAI -8.18%
Gemini 3.1 Pro Google -10.17%
GPT-5.5 OpenAI -10.31%
S&P 500 Benchmark -2.96%
Max possible XLV 5.58%
Portfolio context Shows each model's saved portfolio weights.
Model portfolios Ranked in the same order as the chart.
1 Claude Opus 4.8 Anthropic
SMH 30% XLK 25% AIQ 20% SPY 15% EWT 10%
2 Claude Opus 4.7 Anthropic
SMH 35% CIBR 25% XLK 20% MTUM 15% IAU 5%
3 Grok 4.3 xAI
XLK 30% SMH 25% IGV 20% MTUM 15% AIQ 10%
4 Gemini 3.1 Pro Google
CIBR 25% IGV 25% AIQ 20% SMH 20% EWY 10%
5 GPT-5.5 OpenAI
CIBR 25% IGV 25% AIQ 20% SMH 15% XME 15%
Reference points Not model portfolios.
Benchmark return over the same scoring window
100% Healthcare Sector (XLV) hindsight ceiling
Official scored round Weekly result scored Jun 9 Audit ID: CB-2026-06-03-1W
Scored Jun 9 · Window Jun 2 to Jun 9 · Models 5 · Asset choices 70 · Leader Claude Opus 4.8 · Horizon Weekly
Weekly official result Weekly result scored Jun 8 Frozen model portfolios scored after the one-week window. Live rounds stay out until final prices are available.
Claude Opus 4.8 Anthropic -3.27%
Grok 4.3 xAI -5.28%
Claude Opus 4.7 Anthropic -5.38%
GPT-5.5 OpenAI -8.60%
Gemini 3.1 Pro Google -11.65%
S&P 500 Benchmark -2.55%
Max possible XLV 3.25%
Portfolio context Shows each model's saved portfolio weights.
Model portfolios Ranked in the same order as the chart.
1 Claude Opus 4.8 Anthropic
SMH 30% XLK 25% EWT 15% MTUM 15% SPY 15%
2 Grok 4.3 xAI
XLK 25% SMH 25% IGV 20% AIQ 15% MTUM 15%
3 Claude Opus 4.7 Anthropic
SMH 30% XLK 25% IGV 20% EWT 15% IAU 10%
4 GPT-5.5 OpenAI
IGV 30% CIBR 25% AIQ 20% SMH 15% EWY 10%
5 Gemini 3.1 Pro Google
EWY 40% IGV 30% CIBR 30%
Reference points Not model portfolios.
Benchmark return over the same scoring window
100% Healthcare Sector (XLV) hindsight ceiling
Official scored round Weekly result scored Jun 8 Audit ID: CB-2026-06-02-1W
Scored Jun 8 · Window Jun 1 to Jun 8 · Models 5 · Asset choices 70 · Leader Claude Opus 4.8 · Horizon Weekly
Weekly official result Weekly result scored Jun 5 Frozen model portfolios scored after the one-week window. Live rounds stay out until final prices are available.
Claude Opus 4.8 Anthropic -4.38%
Grok 4.3 xAI -4.50%
Claude Opus 4.7 Anthropic -5.15%
Gemini 3.1 Pro Google -7.58%
GPT-5.5 OpenAI -8.11%
S&P 500 Benchmark -2.50%
Max possible USO 3.04%
Portfolio context Shows each model's saved portfolio weights.
Model portfolios Ranked in the same order as the chart.
1 Claude Opus 4.8 Anthropic
SMH 30% XLK 25% EWT 15% MTUM 15% ITA 15%
2 Grok 4.3 xAI
XLK 40% MTUM 35% SMH 25%
3 Claude Opus 4.7 Anthropic
SMH 30% AIQ 25% QQQ 20% IAU 15% ITA 10%
4 Gemini 3.1 Pro Google
SMH 30% IGV 30% EWY 20% AIQ 20%
5 GPT-5.5 OpenAI
AIQ 30% IGV 25% SMH 20% EWY 15% TAN 10%
Reference points Not model portfolios.
Benchmark return over the same scoring window
100% Crude Oil (USO) hindsight ceiling
Official scored round Weekly result scored Jun 5 Audit ID: CB-2026-06-01-1W
Scored Jun 5 · Window May 29 to Jun 5 · Models 5 · Asset choices 70 · Leader Claude Opus 4.8 · Horizon Weekly
Weekly official result Weekly result scored Jun 5 Frozen model portfolios scored after the one-week window. Live rounds stay out until final prices are available.
Claude Opus 4.8 Anthropic -4.57%
Claude Opus 4.7 Anthropic -4.82%
Grok 4.3 xAI -5.00%
Gemini 3.1 Pro Google -5.63%
GPT-5.5 OpenAI -6.80%
S&P 500 Benchmark -2.50%
Max possible USO 3.04%
Portfolio context Shows each model's saved portfolio weights.
Model portfolios Ranked in the same order as the chart.
1 Claude Opus 4.8 Anthropic
SMH 30% XLK 25% MTUM 20% IGV 15% SPY 10%
2 Claude Opus 4.7 Anthropic
MTUM 25% SMH 25% AIQ 20% IAU 15% EWT 15%
3 Grok 4.3 xAI
MTUM 25% XLK 25% SMH 20% AIQ 20% QQQ 10%
4 Gemini 3.1 Pro Google
SMH 30% AIQ 30% XLK 20% QQQ 20%
5 GPT-5.5 OpenAI
IGV 30% AIQ 25% SMH 20% XLK 15% EWY 10%
Reference points Not model portfolios.
Benchmark return over the same scoring window
100% Crude Oil (USO) hindsight ceiling
Official scored round Weekly result scored Jun 5 Audit ID: CB-2026-05-29-1W
Scored Jun 5 · Window May 29 to Jun 5 · Models 5 · Asset choices 70 · Leader Claude Opus 4.8 · Horizon Weekly
Weekly official result Weekly result scored Jun 4 Frozen model portfolios scored after the one-week window. Live rounds stay out until final prices are available.
Gemini 3.1 Pro Google 3.93%
Claude Opus 4.8 Anthropic 3.58%
Claude Opus 4.7 Anthropic 2.91%
Grok 4.3 xAI 2.61%
GPT-5.5 OpenAI 2.47%
S&P 500 Benchmark 0.33%
Max possible SMH 4.62%
Portfolio context Shows each model's saved portfolio weights.
Model portfolios Ranked in the same order as the chart.
1 Gemini 3.1 Pro Google
SMH 40% XLK 30% MTUM 30%
2 Claude Opus 4.8 Anthropic
SMH 35% XLK 25% MTUM 20% QQQ 10% EWT 10%
3 Claude Opus 4.7 Anthropic
SMH 35% MTUM 25% XLK 20% ITA 15% IAU 5%
4 Grok 4.3 xAI
SMH 40% XLK 35% ITA 25%
5 GPT-5.5 OpenAI
SMH 40% EWY 25% EWT 20% XLK 10% ITA 5%
Reference points Not model portfolios.
Benchmark return over the same scoring window
100% Semiconductors (SMH) hindsight ceiling
Official scored round Weekly result scored Jun 4 Audit ID: CB-2026-05-28-1W
Scored Jun 4 · Window May 28 to Jun 4 · Models 5 · Asset choices 65 · Leader Gemini 3.1 Pro · Horizon Weekly
Weekly official result Weekly result scored Jun 2 Frozen model portfolios scored after the one-week window. Live rounds stay out until final prices are available.
GPT-5.5 OpenAI 5.49%
Gemini 3.1 Pro Google 5.26%
Grok 4.3 xAI 5.05%
Claude Opus 4.7 Anthropic 4.91%
S&P 500 Benchmark 1.20%
Max possible IGV 11.36%
Portfolio context Shows each model's saved portfolio weights.
Model portfolios Ranked in the same order as the chart.
1 GPT-5.5 OpenAI
SMH 40% EWT 25% EWY 20% XLK 10% MTUM 5%
2 Gemini 3.1 Pro Google
SMH 40% XLK 30% MTUM 30%
3 Grok 4.3 xAI
SMH 50% MTUM 30% XLK 20%
4 Claude Opus 4.7 Anthropic
SMH 35% EWY 20% MTUM 20% XLK 15% IAU 10%
Reference points Not model portfolios.
Benchmark return over the same scoring window
100% Software (IGV) hindsight ceiling
Official scored round Weekly result scored Jun 2 Audit ID: CB-2026-05-27-1W
Scored Jun 2 · Window May 26 to Jun 2 · Models 4 · Asset choices 65 · Leader GPT-5.5 · Horizon Weekly
Weekly official result Weekly result scored May 29 Frozen model portfolios scored after the one-week window. Live rounds stay out until final prices are available.
GPT-5.5 OpenAI 5.09%
Grok 4.3 xAI 3.69%
Claude Opus 4.7 Anthropic 3.11%
Gemini 3.1 Pro Google 2.79%
S&P 500 Benchmark 1.45%
Max possible EWY 13.07%
Portfolio context Shows each model's saved portfolio weights.
Model portfolios Ranked in the same order as the chart.
1 GPT-5.5 OpenAI
EWT 40% SMH 35% ITA 15% XLK 10%
2 Grok 4.3 xAI
EWT 30% ITA 25% SMH 20% KRE 15% SCHD 10%
3 Claude Opus 4.7 Anthropic
SMH 30% EWT 20% ITA 20% XLU 15% IAU 15%
4 Gemini 3.1 Pro Google
SMH 40% EWT 30% XLU 30%
Reference points Not model portfolios.
Benchmark return over the same scoring window
100% South Korea Equities (EWY) hindsight ceiling
Official scored round Weekly result scored May 29 Audit ID: CB-2026-05-24-1W
Scored May 29 · Window May 22 to May 29 · Models 4 · Asset choices 65 · Leader GPT-5.5 · Horizon Weekly
Monthly official results
Monthly results, newest official score first
Monthly official result Monthly result scored Jun 26 Frozen model portfolios scored after the one-month window. Live rounds stay out until final prices are available.
Grok 4.3 xAI 0.82%
Claude Opus 4.8 Anthropic 0.10%
GPT-5.5 OpenAI -0.18%
Gemini 3.1 Pro Google -0.70%
Claude Opus 4.7 Anthropic -1.02%
S&P 500 Benchmark -3.15%
Max possible XBI 14.37%
Portfolio context Shows each model's saved portfolio weights.
Model portfolios Ranked in the same order as the chart.
1 Grok 4.3 xAI
SMH 50% XLK 30% MTUM 20%
2 Claude Opus 4.8 Anthropic
SMH 35% MTUM 25% XLK 20% EWT 10% IAU 10%
3 GPT-5.5 OpenAI
SMH 40% EWY 25% EWT 15% XLK 10% MTUM 10%
4 Gemini 3.1 Pro Google
SMH 40% XLK 30% EWY 15% EWT 15%
5 Claude Opus 4.7 Anthropic
SMH 35% XLK 25% ITA 15% IAU 15% MTUM 10%
Reference points Not model portfolios.
Benchmark return over the same scoring window
100% Biotechnology (XBI) hindsight ceiling
Official scored round Monthly result scored Jun 26 Audit ID: CB-2026-05-28-1M
Scored Jun 26 · Window May 28 to Jun 26 · Models 5 · Asset choices 65 · Leader Grok 4.3 · Horizon Monthly
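The two reference points for this round can be read as gaps in percentage points. The sketch below uses the returns listed above for the round scored Jun 26 and computes each model's gap to the S&P 500 benchmark and to the hindsight ceiling. The function name and the idea of reporting these gaps are illustrative assumptions, not an official CapitalBench metric.

```python
# Returns (in percent) copied from the monthly round scored Jun 26 above.
model_returns = {
    "Grok 4.3": 0.82,
    "Claude Opus 4.8": 0.10,
    "GPT-5.5": -0.18,
    "Gemini 3.1 Pro": -0.70,
    "Claude Opus 4.7": -1.02,
}
SP500_RETURN = -3.15  # S&P 500 benchmark over the same window
MAX_POSSIBLE = 14.37  # 100% Biotechnology (XBI) hindsight ceiling


def excess_over_benchmark(returns, benchmark):
    """Percentage-point gap between each model and the benchmark."""
    return {name: round(value - benchmark, 2) for name, value in returns.items()}


excess = excess_over_benchmark(model_returns, SP500_RETURN)
for name, gap in excess.items():
    below_ceiling = MAX_POSSIBLE - model_returns[name]
    print(f"{name}: {gap:+.2f} pts vs S&P 500, {below_ceiling:.2f} pts below ceiling")
```

In this round every listed model finished ahead of the S&P 500, even though three of the five had negative returns.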
Monthly official result Monthly result scored Jun 24 Frozen model portfolios scored after the one-month window. Live rounds stay out until final prices are available.
Chart: returns for scored model portfolios, the S&P 500 benchmark, and the maximum possible return (axis -3% to 15%).
Grok 4.3 xAI 5.80%
GPT-5.5 OpenAI 5.60%
Gemini 3.1 Pro Google 5.12%
Claude Opus 4.7 Anthropic 4.29%
S&P 500 Benchmark -1.41%
Max possible XBI 13.82%
Portfolio context Shows each model's saved portfolio weights.
Model portfolios Ranked in the same order as the chart.
1 Grok 4.3 xAI
SMH 50% XLK 30% EWT 20%
2 GPT-5.5 OpenAI
SMH 40% EWT 25% EWY 20% XLK 10% USO 5%
3 Gemini 3.1 Pro Google
SMH 40% XLK 30% MTUM 20% BIL 10%
4 Claude Opus 4.7 Anthropic
SMH 35% ITA 20% EWT 15% IAU 15% MTUM 15%
Reference points Not model portfolios.
Benchmark return over the same scoring window
100% Biotechnology (XBI) hindsight ceiling
Official scored round Monthly result scored Jun 24 Audit ID: CB-2026-05-24-1M
Scored Jun 24 · Window May 22 to Jun 24 · Models 4 · Asset choices 65 · Leader Grok 4.3 · Horizon Monthly
Monthly official result Monthly result scored Jun 17 Frozen model portfolios scored after the one-month window. Live rounds stay out until final prices are available.
Chart: returns for scored model portfolios, the S&P 500 benchmark, and the maximum possible return (axis -11% to 20%).
Grok 4.3 xAI -1.74%
Claude Opus 4.7 Anthropic -2.41%
Gemini 3.1 Pro Google -4.99%
GPT-5.5 OpenAI -8.61%
S&P 500 Benchmark 0.24%
Max possible EWT 15.15%
Portfolio context Shows each model's saved portfolio weights.
Model portfolios Ranked in the same order as the chart.
1 Grok 4.3 xAI
XLE 40% SMH 35% PDBC 25%
2 Claude Opus 4.7 Anthropic
XLE 30% IAU 25% BIL 20% SMH 15% XLP 10%
3 Gemini 3.1 Pro Google
XLE 30% SMH 25% USO 20% IAU 15% BIL 10%
4 GPT-5.5 OpenAI
USO 30% XLE 25% PDBC 20% SMH 20% UUP 5%
Reference points Not model portfolios.
Benchmark return over the same scoring window
100% Taiwan Equities (EWT) hindsight ceiling
Official scored round Monthly result scored Jun 17 Audit ID: CB-2026-05-17-1M
Scored Jun 17 · Window May 15 to Jun 17 · Models 4 · Asset choices 65 · Leader Grok 4.3 · Horizon Monthly
Monthly official result Monthly result scored Jun 10 Frozen model portfolios scored after the one-month window. Live rounds stay out until final prices are available.
Chart: returns for scored model portfolios, the S&P 500 benchmark, and the maximum possible return (axis -2% to 8%).
Gemini 3.1 Pro Google 0.77%
Claude Opus 4.7 Anthropic 0.77%
Grok 4.3 xAI 0.77%
GPT-5.5 OpenAI 0.77%
S&P 500 Benchmark -1.65%
Max possible XLV 6.52%
Portfolio context Shows each model's saved portfolio weights.
Model portfolios Ranked in the same order as the chart.
1 Gemini 3.1 Pro Google
SMH 100%
2 Claude Opus 4.7 Anthropic
SMH 100%
3 Grok 4.3 xAI
SMH 100%
4 GPT-5.5 OpenAI
SMH 100%
Reference points Not model portfolios.
Benchmark return over the same scoring window
100% Healthcare Sector (XLV) hindsight ceiling
Official scored round Monthly result scored Jun 10 Audit ID: CB-2026-05-10-1M
Scored Jun 10 · Window May 8 to Jun 10 · Models 4 · Asset choices 40 · Leader Gemini 3.1 Pro · Horizon Monthly
Benchmark universe What Models Allocate From
Models get the same report, choose from the same assets, and wait for weekly or monthly scoring.
Models 6
Asset choices 70
Round lengths 2
Live rounds 23
Live rounds waiting for results Latest weekly and monthly rounds
23 live total · 70 asset choices
Weekly 7 days Monthly 1 month
Historical model style Historical Risk Style By Model
Allocation-weighted from every official frozen portfolio, including live and completed rounds. It does not use
future returns and is separate from the current AI Risk Appetite signal.
215 saved portfolios Current pattern Growth / Aggressive
Current model scores cluster near the high-risk end, so the main chart zooms into the active range instead
of spending most of the canvas on unused defensive space.
Range 3.84-4.60
Highest GPT-5.5
Lowest Claude Fable 5
Focused current range 3.70-4.74
Full scale reference 1-5
1 Defensive 3 Balanced 5 Aggressive
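One way to read "allocation-weighted" on the 1-5 scale is sketched below: each asset carries a risk score, a portfolio's style is the weight-averaged score of its holdings, and a model's style is the average over its saved portfolios. Both the per-asset scores in `ASSET_RISK` and the averaging step are assumptions made for illustration; the page does not publish the actual mapping or aggregation here.

```python
# Hypothetical 1-5 risk scores per asset (1 = defensive, 5 = aggressive).
# These values are illustrative assumptions, not CapitalBench's actual mapping.
ASSET_RISK = {"SMH": 5.0, "XLK": 4.5, "MTUM": 4.5, "IAU": 2.5, "BIL": 1.0}


def portfolio_risk_style(weights, asset_risk=ASSET_RISK):
    """Allocation-weighted risk score of one portfolio (weights in percent)."""
    total = sum(weights.values())
    return sum(pct * asset_risk[ticker] for ticker, pct in weights.items()) / total


def model_risk_style(portfolios, asset_risk=ASSET_RISK):
    """Average the per-portfolio scores over all of a model's saved portfolios."""
    scores = [portfolio_risk_style(p, asset_risk) for p in portfolios]
    return sum(scores) / len(scores)


# Two example portfolios with illustrative weights.
saved = [
    {"SMH": 50, "XLK": 30, "MTUM": 20},
    {"SMH": 40, "XLK": 30, "MTUM": 20, "BIL": 10},
]
print(round(model_risk_style(saved), 2))
```

Note that nothing in this calculation touches returns, which matches the statement that the style score does not use future returns.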
Model portfolios Current Frozen Model Portfolios
These are the saved model portfolios for the newest weekly and monthly rounds. They are waiting for final prices.
Shared top pick Healthcare Sector (XLV) Average across 5 frozen model portfolios.
Top 3 70% Spread 5.1 assets
Healthcare Sector (XLV) 30%
US Low Volatility Equities (SPLV) 13%
Shared top pick Healthcare Sector (XLV) Average across 5 frozen model portfolios.
Top 3 62% Spread 6.1 assets
Healthcare Sector (XLV) 28%
Equal-Weight S&P 500 (RSP) 11%
1 Same report Every model gets the same market report.
2 Same choices Every model allocates from the same asset list.
3 Frozen portfolios Model portfolios are locked before results are known.
4 Real prices score After 7 days or 1 month, real prices decide the result.
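The "same choices" and "frozen portfolios" steps above can be sketched as a validation step followed by a content hash. This is a minimal sketch under stated assumptions: `freeze_portfolio` and `is_unchanged` are hypothetical helper names, integer percent weights are assumed, and SHA-256 over canonical JSON is one plausible way to produce an audit hash, not necessarily the scheme CapitalBench uses.

```python
import hashlib
import json


def _digest(model, weights):
    # Canonical JSON (sorted keys, fixed separators) so equal portfolios hash equally.
    payload = json.dumps({"model": model, "weights": weights},
                         sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def freeze_portfolio(model, weights, eligible_assets):
    """Validate integer percent weights against the shared list and hash them."""
    unknown = set(weights) - set(eligible_assets)
    if unknown:
        raise ValueError(f"assets not in the shared list: {sorted(unknown)}")
    if any(pct <= 0 for pct in weights.values()):
        raise ValueError("weights must be positive")
    if sum(weights.values()) != 100:
        raise ValueError("weights must sum to 100")
    return {"model": model, "weights": dict(weights), "sha256": _digest(model, weights)}


def is_unchanged(record):
    """Recompute the hash to check the saved portfolio was not edited later."""
    return _digest(record["model"], record["weights"]) == record["sha256"]


# Hypothetical model name and asset list, for illustration only.
record = freeze_portfolio("Example model", {"XLV": 60, "SPLV": 40},
                          ["XLV", "SPLV", "XBI", "KRE"])
print(record["sha256"][:12], is_unchanged(record))
```

Storing the hash at lock time lets anyone confirm later that the scored weights match what was saved before prices were known.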
Results Weekly And Monthly Are Separate
A 7-day round and a 1-month round are different contests. They get separate scores and separate overall results.
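Keeping the two contests apart can be sketched by splitting rounds on their audit IDs. The sketch assumes the ID layout seen on this page (for example CB-2026-05-24-1W and CB-2026-05-28-1M), with the date in the middle and a trailing W or M marking the horizon. That reading is inferred from the IDs, not documented here, and the helper names are hypothetical.

```python
import re
from collections import defaultdict
from datetime import date

# Assumed ID layout, inferred from IDs such as CB-2026-05-24-1W on this page.
ROUND_ID = re.compile(r"^CB-(\d{4})-(\d{2})-(\d{2})-1([WM])$")
HORIZONS = {"W": "weekly", "M": "monthly"}


def parse_round_id(round_id):
    """Split an audit ID into its date and its horizon ('weekly' or 'monthly')."""
    match = ROUND_ID.match(round_id)
    if match is None:
        raise ValueError(f"unrecognised round ID: {round_id}")
    year, month, day, code = match.groups()
    return date(int(year), int(month), int(day)), HORIZONS[code]


def split_by_horizon(round_ids):
    """Group round IDs into separate weekly and monthly tracks."""
    tracks = defaultdict(list)
    for round_id in round_ids:
        _, horizon = parse_round_id(round_id)
        tracks[horizon].append(round_id)
    return dict(tracks)


ids = ["CB-2026-05-24-1W", "CB-2026-05-28-1M", "CB-2026-05-24-1M", "CB-2026-06-18-1W"]
print(split_by_horizon(ids))
```

Any overall score would then be computed per track, so a weekly result never feeds a monthly ranking.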
Weekly 7-day round · Monthly 1-month round · No mixing Scores stay separate
Weekly track
Weekly Results 16 completed / 5 live
Current benchmark leader Claude Opus 4.8 -12.1 score · 14 shared rounds
Latest scored CB-2026-06-18-1W Live round CB-2026-06-26-1W Next score After Jul 2 close
Locked Live Scores
Monthly track
Monthly Results 4 completed / 18 live
Current benchmark leader Grok 4.3 11.4 score · 4 shared rounds
Latest scored CB-2026-05-28-1M Live round CB-2026-06-26-1M Next score After Jul 24 close
Locked Live Scores
Live benchmark tests Live Benchmark Tests
These are the open tests you can inspect now. Models already submitted portfolios; official scores wait for final
closing prices.
Weekly test
One-week test Live now Short-term test of AI positioning over one market week.
Portfolios locked; scoring pending.
Model portfolios 5 Eligible assets 70 Risk-taking score 67.8/100
Top consensus
Healthcare Sector (XLV) 30% average weight
Monthly test
One-month test Live now Longer test of AI allocation over one month.
Portfolios locked; scoring pending.
Model portfolios 5 Eligible assets 70 Risk-taking score 71.3/100
Top consensus
Healthcare Sector (XLV) 28% average weight
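A "top consensus" figure like the average weights above can be sketched by averaging each asset's weight across the locked portfolios, counting an asset a model did not hold as 0%. The portfolios below are hypothetical, and `avg_holdings` is only one possible reading of a "Spread" statistic, which this page does not define.

```python
from collections import defaultdict


def average_weights(portfolios):
    """Mean weight per asset across portfolios; a missing asset counts as 0%."""
    totals = defaultdict(float)
    for weights in portfolios:
        for ticker, pct in weights.items():
            totals[ticker] += pct
    count = len(portfolios)
    averaged = {ticker: total / count for ticker, total in totals.items()}
    return dict(sorted(averaged.items(), key=lambda item: item[1], reverse=True))


def consensus_summary(portfolios):
    """Top pick, its average weight, top-3 share, and average holdings count."""
    ranked = list(average_weights(portfolios).items())
    return {
        "top_pick": ranked[0][0],
        "top_pick_weight": ranked[0][1],
        "top3_share": sum(weight for _, weight in ranked[:3]),
        "avg_holdings": sum(len(p) for p in portfolios) / len(portfolios),
    }


# Hypothetical locked portfolios (weights in percent), for illustration only.
locked = [
    {"XLV": 40, "XBI": 35, "KRE": 25},
    {"XLV": 30, "SPLV": 50, "XBI": 20},
]
print(consensus_summary(locked))
```

Like the allocation signal elsewhere on the page, this summary uses weights only and says nothing about returns.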
Scoring calendar Current Scoring Calendar
Models have already picked portfolios. Official scores publish only after the market window ends and final closing
prices are available.
Weekly test One-week test
Live now 5 model portfolios locked and waiting for official scoring.
Locked Jun 27
Market window Jun 26 close to Jul 2 close
Official score After Jul 2 close
Monthly test One-month test
Live now 5 model portfolios locked and waiting for official scoring.
Locked Jun 27
Market window Jun 26 close to Jul 24 close
Official score After Jul 24 close
43 official rounds recorded Internal round and run IDs stay in the public audit trail.
View audit trail
Latest benchmark results See which model performed best in completed rounds. Results
All rounds Open reports, model portfolios, prices, and audit packets. 43 rounds
Methodology Same report, same choices, frozen portfolios, real prices. Plain rules
Audit packet Check The Public Audit Trail
Round pages show the report, prompt, model portfolios, starting prices, source reports, hashes, and result status
behind each public benchmark round.
Why it is fair Simple Rules, Public Audit Trail
CapitalBench keeps the comparison narrow: same report, same asset list, frozen portfolios, and no final result before
the round ends.
Same rules One frozen portfolio per model The public score uses the saved portfolio, not private retries or experiments.
Same choices 70 current assets Each round keeps the exact asset list, report, model output, starting prices, and audit hashes.
No early winner 23 live rounds waiting for results Final results appear only after ending prices are available.