Model behavior patterns

How The AI Allocators Differ

A dynamic comparison of each model's allocation style across official frozen portfolios. The report separates risk appetite, concentration, defensive ballast, peer overlap, turnover, and resolved performance profile in one view.

Most risk-seeking: GPT-5.5 (highest average risk-taking score)
Most concentrated: Grok 4.3 (highest concentration across saved portfolios)
Most defensive: Claude Opus 4.7 (largest defensive allocation)
Lowest turnover: Claude Opus 4.8 (smallest average portfolio turnover)
Closest to peers: Claude Opus 4.7 (highest average peer overlap)
Most distinctive: Claude Fable 5 (lowest average peer overlap)

Comparison matrix

Distinct Behavior By Model

Summaries are generated from deterministic metrics. NVIDIA-assisted wording can polish future summaries, but the metric rows below remain the source of truth.
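As a rough illustration of how headline labels can follow deterministically from metric rows, the sketch below re-derives five of the labels shown at the top of the page from values in the Behavior Metrics table. The dictionary layout, the short metric names, and the `pick` helper are assumptions made for this example and are not the report's actual code. The concentration label is left out because the page does not say how concentration is scored.

```python
# Metric values copied from the "Behavior Metrics In One Table" section.
# The short key names are hypothetical, chosen for this sketch only.
ROWS = {
    "GPT-5.5":         {"risk": 84.8, "defensive": 1.9,  "peer_overlap": 57.5, "turnover": 44.8},
    "Grok 4.3":        {"risk": 81.4, "defensive": 5.0,  "peer_overlap": 62.3, "turnover": 48.3},
    "Gemini 3.1 Pro":  {"risk": 76.3, "defensive": 13.3, "peer_overlap": 52.9, "turnover": 52.6},
    "Claude Opus 4.8": {"risk": 74.8, "defensive": 15.0, "peer_overlap": 62.6, "turnover": 41.8},
    "Claude Opus 4.7": {"risk": 73.8, "defensive": 21.9, "peer_overlap": 63.1, "turnover": 49.6},
    "Claude Fable 5":  {"risk": 69.7, "defensive": 21.0, "peer_overlap": 49.0, "turnover": 68.3},
}


def pick(metric: str, highest: bool = True) -> str:
    """Return the model with the highest (or lowest) value for one metric."""
    chooser = max if highest else min
    return chooser(ROWS, key=lambda model: ROWS[model][metric])


labels = {
    "Most risk-seeking": pick("risk"),
    "Most defensive": pick("defensive"),
    "Lowest turnover": pick("turnover", highest=False),
    "Closest to peers": pick("peer_overlap"),
    "Most distinctive": pick("peer_overlap", highest=False),
}

for label, model in labels.items():
    print(f"{label}: {model}")
```

Because every label is a plain maximum or minimum over a stored metric, rerunning the selection on the same rows always gives the same result.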

Model | Distinct behavior | Evidence | Common exposures
GPT-5.5 OpenAI Established sample
Aggressive upside hunter

Most aggressive allocation profile. It averages 84.8 / 100 risk-taking, 91.4% high-risk exposure, and only 1.9% defensive ballast. Results have been fragile, with 6 last-place finishes.

Highest risk-taking, Technology tilt, International tilt, Binary results
Semiconductors (SMH) 27.6% avg; South Korea Equities (EWY) 11.6% avg; Taiwan Equities (EWT) 7.2% avg

Grok 4.3 xAI Established sample
High-conviction concentrator

Most concentrated structure. It averages 3.62 holdings and a 38.6% largest position, so its portfolios express fewer, higher-conviction views.

Most concentrated, Technology tilt, Middle-stable results, Often different from peers
Semiconductors (SMH) 25.5% avg; Technology Sector (XLK) 13.4% avg; US Momentum Equities (MTUM) 11.7% avg

Gemini 3.1 Pro Google Established sample
High-conviction concentrator

Concentrated high-conviction style. It averages 3.66 holdings and a 38.8% largest position, so it holds fewer names than diversified peers and puts more weight on its top position.

Technology tilt, Binary results, Often different from peers
Semiconductors (SMH) 22.8% avg; Technology Sector (XLK) 12.1% avg; S&P 500 (SPY) 10.3% avg

Claude Opus 4.8 Anthropic Established sample
Risk-managed allocator

More institutional allocation style. It keeps broader portfolios, carries 15.0% defensive exposure on average, and changes positions less than more tactical peers.

Lowest turnover
Semiconductors (SMH) 21.5% avg; Technology Sector (XLK) 10.4% avg; Healthcare Sector (XLV) 9.8% avg

Claude Opus 4.7 Anthropic Established sample
Risk-managed allocator

More institutional allocation style. It keeps broader portfolios, carries 21.9% defensive exposure on average, and changes positions less than more tactical peers.

Most defensive ballast, Most consensus-aligned, Often different from peers
Semiconductors (SMH) 26.6% avg; Aerospace and Defense (ITA) 9.7% avg; US Momentum Equities (MTUM) 8.6% avg

Claude Fable 5 Anthropic Early sample
Early sample

Early allocations lean toward Energy Sector (XLE), Semiconductors (SMH), and US Momentum Equities (MTUM). The behavioral read should stay provisional until more official portfolios resolve.

Only 5 official saved portfolios so far. Treat the behavior label as provisional.
Early sample, Most distinctive, Real-asset tilt
Energy Sector (XLE) 17.0% avg; Semiconductors (SMH) 15.0% avg; US Momentum Equities (MTUM) 12.0% avg

Key numbers

Behavior Metrics In One Table

These are cumulative model-level measures across official saved portfolios and resolved results. A lower average rank is better. Beat S&P shows resolved rounds where model return exceeded the S&P 500.

Model | Risk | Holdings | Top holding | High risk | Defensive | Peer overlap | Turnover | Avg rank | 1st / last | Beat S&P
GPT-5.5 | 84.8 / 100 | 4.83 | 36.5% | 91.4% | 1.9% | 57.5% | 44.8% | 3.90 | 2 / 6 | 4/10
Grok 4.3 | 81.4 / 100 | 3.62 | 38.6% | 85.7% | 5.0% | 62.3% | 48.3% | 2.80 | 0 / 0 | 5/10
Gemini 3.1 Pro | 76.3 / 100 | 3.66 | 38.8% | 71.0% | 13.3% | 52.9% | 52.6% | 3.40 | 2 / 3 | 4/10
Claude Opus 4.8 | 74.8 / 100 | 5.00 | 30.6% | 69.2% | 15.0% | 62.6% | 41.8% | 1.43 | 4 / 0 | 2/7
Claude Opus 4.7 | 73.8 / 100 | 4.86 | 33.6% | 69.0% | 21.9% | 63.1% | 49.6% | 2.40 | 2 / 1 | 5/10
Claude Fable 5 | 69.7 / 100 | 5.00 | 28.0% | 64.0% | 21.0% | 49.0% | 68.3% | n/a | 0 / 0 | 0/0
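
The Beat S&P column can be read as a simple count over resolved rounds. The following is a minimal sketch, assuming each round is stored as a pair of model return and S&P 500 return, with `None` marking an unresolved round. The function name and the sample returns are hypothetical and do not come from the report.

```python
from typing import Optional

# One entry per round: (model_return_pct, sp500_return_pct).
# None marks a round that has not resolved yet (assumed convention).
Round = tuple[Optional[float], Optional[float]]


def beat_sp500(rounds: list[Round]) -> str:
    """Format 'wins/resolved', counting only rounds where the model strictly exceeded the S&P 500."""
    resolved = [(m, b) for m, b in rounds if m is not None and b is not None]
    wins = sum(1 for m, b in resolved if m > b)
    return f"{wins}/{len(resolved)}"


# Hypothetical returns, for illustration only.
example_rounds: list[Round] = [
    (2.1, 1.0),    # model ahead
    (-0.5, 0.3),   # model behind
    (None, None),  # unresolved, excluded from the denominator
    (1.2, 1.2),    # tie, not counted as a win
]
print(beat_sp500(example_rounds))
```

A model with no resolved rounds yields `0/0` under this sketch, which matches how an early-sample row appears in the table.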
Comparative findings

What Stands Out

Each finding is tied to model IDs and metric keys in the generated report.

GPT-5.5, Grok 4.3

GPT-5.5 and Grok 4.3 stand out in different ways

GPT-5.5 stands out by risk appetite at 84.8 / 100, while Grok 4.3 stands out by portfolio structure with a 38.6% average largest holding.

risk taking score, average top allocation pct

Claude Opus 4.7, Claude Opus 4.8

Claude Opus 4.7 and Claude Opus 4.8 look more risk-managed than the aggressive cohort

Claude Opus 4.7 has the highest defensive allocation at 21.9%. Claude Opus 4.8 has the lowest measured turnover at 41.8%.

defensive pct, average turnover pct

GPT-5.5, Gemini 3.1 Pro

Some aggressive or concentrated models have more binary outcomes

GPT-5.5 has 2 first-place and 6 last-place finishes. Gemini 3.1 Pro has 2 first-place and 3 last-place finishes.

first place count, last place count

Grok 4.3

Grok 4.3 has been steadier than its risk score suggests

Grok 4.3 averages 81.4 / 100 risk-taking but has no first-place or last-place finishes across 10 resolved rounds.

risk taking score, first place count, last place count, resolved round count

Claude Opus 4.7

Claude Opus 4.7 is closest to the model crowd

Claude Opus 4.7 has the highest average peer overlap at 63.1%. Its allocation weights have looked more like the rest of the roster than those of the most distinctive models.

peer similarity

Methodology

How The Pattern Report Is Calculated

The report is generated from official frozen portfolios and resolved result rows during the website build. No page copy is manually assigned to a model. New models appear automatically after they have official saved portfolios.

Models with fewer than 8 saved portfolios are marked as early sample. Performance language is caveated until at least 3 resolved results are available.
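
The two thresholds above can be expressed as a small classification step. This is a minimal sketch using the cutoffs stated in the text (8 saved portfolios, 3 resolved results). The `SampleStatus` type, the function name, and the constant names are assumptions for illustration.

```python
from dataclasses import dataclass

EARLY_SAMPLE_MIN_PORTFOLIOS = 8    # fewer than this -> "Early sample"
MIN_RESOLVED_FOR_PERFORMANCE = 3   # fewer than this -> caveat performance language


@dataclass
class SampleStatus:
    label: str
    caveat_performance: bool


def classify_sample(saved_portfolios: int, resolved_results: int) -> SampleStatus:
    """Apply the early-sample and performance-caveat thresholds to one model."""
    if saved_portfolios < EARLY_SAMPLE_MIN_PORTFOLIOS:
        label = "Early sample"
    else:
        label = "Established sample"
    return SampleStatus(label, resolved_results < MIN_RESOLVED_FOR_PERFORMANCE)


# A model with 5 saved portfolios and no resolved rounds.
print(classify_sample(saved_portfolios=5, resolved_results=0))
```

Keeping the two checks separate means a model can be an established sample by portfolio count while its performance wording is still caveated.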

NVIDIA-assisted text is allowed only as a rewrite layer. The prompt receives the structured model rows, traits, metric keys, top assets, and comparative candidates. It is not allowed to add unsupported numbers, assets, causes, stale dates, or investment advice.

Prompt contract: capitalbench_model_patterns_prompt_v1
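
One way to enforce the "no unsupported numbers" rule is to compare the numeric tokens in a rewritten summary against those in its deterministic source text. The sketch below is a hypothetical guard and not the site's actual validation. The regular expression, the function name, and the sample rewrite are all assumptions for this example.

```python
import re

# Matches integers and decimals such as "100" or "84.8".
NUMBER_RE = re.compile(r"\d+(?:\.\d+)?")


def unsupported_numbers(source_text: str, rewritten_text: str) -> list[str]:
    """Return numeric tokens that appear in the rewrite but not in the source."""
    allowed = set(NUMBER_RE.findall(source_text))
    found = set(NUMBER_RE.findall(rewritten_text))
    return sorted(found - allowed)


source = "GPT-5.5 averages 84.8 / 100 risk-taking and only 1.9% defensive ballast."
# Hypothetical rewrite that changes one figure.
rewrite = "GPT-5.5 runs hot: 84.8 / 100 risk-taking, with 2.5% defensive ballast."

print(unsupported_numbers(source, rewrite))
```

A non-empty result would be a reason to reject the rewrite and fall back to the deterministic summary. A token check like this catches added figures only, so unsupported assets, causes, or advice would need separate checks.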

Risk-taking score (0-100)

Average allocation-weighted risk appetite across all official saved portfolios. Higher means more growth, momentum, cyclical, and high-risk exposure.
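
A minimal sketch of an allocation-weighted score averaged across portfolios follows. It assumes each asset carries a risk rating on a 0-100 scale. The ratings in `ASSET_RISK` are placeholder values invented for this example, since the page does not publish the per-asset ratings or the exact weighting formula.

```python
from statistics import mean

# Hypothetical per-asset risk ratings (0-100). Not the real CapitalBench values.
ASSET_RISK = {"SMH": 95.0, "SPY": 60.0, "BIL": 5.0}


def portfolio_risk(weights: dict[str, float]) -> float:
    """Allocation-weighted average of asset risk ratings for one portfolio."""
    total = sum(weights.values())
    return sum(w * ASSET_RISK[ticker] for ticker, w in weights.items()) / total


def risk_taking_score(portfolios: list[dict[str, float]]) -> float:
    """Average the per-portfolio risk across all saved portfolios."""
    return mean(portfolio_risk(p) for p in portfolios)


example_portfolios = [
    {"SMH": 50.0, "BIL": 50.0},  # weighted risk: 50.0
    {"SPY": 100.0},              # weighted risk: 60.0
]
print(risk_taking_score(example_portfolios))
```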

Avg holdings (count)

Average number of non-zero assets in the model's official saved portfolios.

Avg top holding (percentage points)

Average size of the largest single holding in each official saved portfolio.
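
The two concentration measures, average holdings and average top holding, can be sketched together. The example assumes each portfolio is a mapping from ticker to allocation in percent. The function names and sample portfolios are hypothetical.

```python
from statistics import mean

Portfolio = dict[str, float]  # ticker -> allocation in percent


def avg_holdings(portfolios: list[Portfolio]) -> float:
    """Average count of assets with a non-zero allocation."""
    return mean(sum(1 for w in p.values() if w > 0) for p in portfolios)


def avg_top_holding(portfolios: list[Portfolio]) -> float:
    """Average size of the single largest position in each portfolio."""
    return mean(max(p.values()) for p in portfolios)


# Hypothetical portfolios, for illustration only.
example_portfolios: list[Portfolio] = [
    {"SMH": 40.0, "XLK": 30.0, "SPY": 30.0, "BIL": 0.0},    # 3 holdings, top 40%
    {"SMH": 50.0, "XLE": 25.0, "MTUM": 15.0, "XLV": 10.0},  # 4 holdings, top 50%
]
print(avg_holdings(example_portfolios), avg_top_holding(example_portfolios))
```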

High-risk allocation (percentage points)

Average allocation to assets rated as higher risk by the CapitalBench asset risk model.

Defensive allocation (percentage points)

Average allocation to cash, bonds, defensive sectors, and other lower-risk ballast.

Technology allocation (percentage points)

Average allocation to technology, semiconductors, Nasdaq-style growth, and AI-linked technology exposure.

Cash/duration allocation (percentage points)

Average allocation to cash-like assets and duration-sensitive bond exposure.

International allocation (percentage points)

Average allocation to non-U.S. country, regional, or international equity exposure.

Real assets allocation (percentage points)

Average allocation to commodities, crypto, energy, gold, and other inflation-linked or real-asset groups.
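
The category allocation measures (high-risk, defensive, technology, cash/duration, international, real assets) share one shape: sum the weights of assets in a category, then average across portfolios. The sketch below assumes an asset can belong to more than one category. The category tags in `ASSET_CATEGORIES` are illustrative guesses and not the CapitalBench classification.

```python
from statistics import mean

# Hypothetical category tags per asset. An asset may carry several tags.
ASSET_CATEGORIES = {
    "SMH": {"high_risk", "technology"},
    "XLE": {"high_risk", "real_assets"},
    "EWY": {"high_risk", "international"},
    "XLV": {"defensive"},
    "BIL": {"defensive", "cash_duration"},
}


def category_allocation(weights: dict[str, float], category: str) -> float:
    """Total allocation (percent) to assets tagged with the category."""
    return sum(
        w for ticker, w in weights.items()
        if category in ASSET_CATEGORIES.get(ticker, set())
    )


def avg_category_allocation(portfolios: list[dict[str, float]], category: str) -> float:
    """Average the category allocation across saved portfolios."""
    return mean(category_allocation(p, category) for p in portfolios)


example_portfolios = [
    {"SMH": 60.0, "XLV": 20.0, "BIL": 20.0},
    {"SMH": 40.0, "XLE": 30.0, "EWY": 30.0},
]
for cat in ("high_risk", "defensive", "technology"):
    print(cat, avg_category_allocation(example_portfolios, cat))
```

Because tags can overlap in this sketch, the category averages for one model do not need to sum to 100.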

Peer overlap (0-1)

Average cosine similarity between this model's allocation weights and peer model portfolios in the same rounds.
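
Cosine similarity between two allocation vectors can be computed directly from ticker-to-weight mappings. The sketch below pools every model-versus-peer pair across rounds before averaging, which is an assumption, since the page does not say whether rounds are averaged first. The function names and the three-model round are hypothetical.

```python
import math

Weights = dict[str, float]  # ticker -> allocation


def cosine_similarity(a: Weights, b: Weights) -> float:
    """Cosine of the angle between two allocation vectors; missing tickers count as 0."""
    dot = sum(a.get(t, 0.0) * b.get(t, 0.0) for t in set(a) | set(b))
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def peer_overlap(model: str, rounds: list[dict[str, Weights]]) -> float:
    """Average similarity between one model and every peer in the same round."""
    sims = [
        cosine_similarity(round_[model], weights)
        for round_ in rounds if model in round_
        for peer, weights in round_.items() if peer != model
    ]
    return sum(sims) / len(sims)


# Hypothetical single round with three models.
example_rounds = [{
    "A": {"SMH": 100.0},
    "B": {"SMH": 100.0},  # identical to A -> similarity 1.0
    "C": {"XLE": 100.0},  # no shared assets -> similarity 0.0
}]
print(peer_overlap("A", example_rounds))
```

The result is on a 0-1 scale. The metrics table appears to show the same measure multiplied by 100 as a percentage.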

Avg turnover (percentage points)

Average one-half summed absolute allocation change between consecutive same-track portfolios.
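
The turnover definition above translates into a short calculation: for each pair of consecutive portfolios, sum the absolute weight changes and halve the total. This is a minimal sketch that assumes the list is already filtered to one track and ordered by round. The function names and sample portfolios are hypothetical.

```python
from typing import Optional

Weights = dict[str, float]  # ticker -> allocation in percent


def turnover(prev: Weights, curr: Weights) -> float:
    """One-half of the summed absolute allocation change between two portfolios."""
    tickers = set(prev) | set(curr)
    return 0.5 * sum(abs(curr.get(t, 0.0) - prev.get(t, 0.0)) for t in tickers)


def avg_turnover(portfolios: list[Weights]) -> Optional[float]:
    """Average turnover over consecutive pairs; None when fewer than two portfolios exist."""
    pairs = list(zip(portfolios, portfolios[1:]))
    if not pairs:
        return None
    return sum(turnover(a, b) for a, b in pairs) / len(pairs)


# Hypothetical same-track sequence, for illustration only.
example_sequence = [
    {"SMH": 60.0, "SPY": 40.0},
    {"SMH": 40.0, "SPY": 40.0, "XLE": 20.0},  # 20 points moved from SMH to XLE
    {"SMH": 40.0, "SPY": 40.0, "XLE": 20.0},  # unchanged
]
print(avg_turnover(example_sequence))
```

Halving the sum avoids double counting, since moving 20 points from one asset to another changes two weights by 20 each but represents 20 points of turnover.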

Avg rank (rank)

Average finishing rank across resolved rounds. Lower is better.
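
Average rank and the first-place and last-place counts can be derived from the same list of per-round finishing ranks. The sketch assumes a fixed field size so that "last" means a rank equal to the number of ranked models. That assumption, the function name, and the sample ranks are all hypothetical.

```python
from statistics import mean
from typing import Optional, TypedDict


class RankSummary(TypedDict):
    avg_rank: Optional[float]
    first: int
    last: int


def rank_summary(ranks: list[int], field_size: int) -> RankSummary:
    """Summarize finishing ranks across resolved rounds (1 is best)."""
    if not ranks:
        # No resolved rounds yet: average rank is undefined (shown as n/a).
        return {"avg_rank": None, "first": 0, "last": 0}
    return {
        "avg_rank": mean(ranks),
        "first": sum(1 for r in ranks if r == 1),
        "last": sum(1 for r in ranks if r == field_size),
    }


# Hypothetical ranks over four resolved rounds in a five-model field.
print(rank_summary([1, 3, 5, 2], field_size=5))
```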

Avg CapitalBench Score (points)

Average model score versus the hindsight-best eligible asset in each resolved round.