Sample Private Eval Report

CapitalBench Private Eval Sample

CB-2026-05-28-1W report preview

Outcome window: 2026-05-28 to 2026-06-04
Systems compared: 5
Methodology: portfolio-v1.0

Decision summary

Gemini 3.1 Pro led this public sample round.

The leading portfolio returned 3.93% against an S&P 500 return of 0.33%. The best eligible asset was Semiconductors (SMH), which returned 4.62%.

A real private report replaces these public comparator rows with results for the buyer's system and adds repeated-run evidence, access notes, cost and latency records, and client-approved confidentiality terms.

Scorecard extract

Top model	Gemini 3.1 Pro
Portfolio return	3.93%
CapitalBench Score	85.1
S&P 500 return	0.33%
Best eligible asset	Semiconductors (SMH)
Oracle return	4.62%
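
The gaps implied by the scorecard extract can be checked with simple arithmetic. The sketch below is illustrative only: this preview does not state how the benchmark difference field or the CapitalBench Score is calculated, so plain subtraction in percentage points is an assumption, and the function and variable names are hypothetical.

```python
from decimal import Decimal


def pct_point_gap(a: str, b: str) -> Decimal:
    """Return a - b in percentage points for returns given as percent strings."""
    return Decimal(a) - Decimal(b)


# Figures copied from the scorecard extract (all in percent).
portfolio_return = "3.93"  # leading portfolio
benchmark_return = "0.33"  # S&P 500
oracle_return = "4.62"     # best eligible asset, Semiconductors (SMH)

# Assumption: both gaps are plain differences in percentage points.
excess_over_benchmark = pct_point_gap(portfolio_return, benchmark_return)
shortfall_to_oracle = pct_point_gap(oracle_return, portfolio_return)

print(f"Excess over S&P 500: {excess_over_benchmark} pp")  # 3.60 pp
print(f"Shortfall to oracle: {shortfall_to_oracle} pp")    # 0.69 pp
```

Decimal is used instead of float so the two-decimal figures subtract exactly.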

Executive summary

The sample report explains which system led the public comparator set, where returns came from, and what the result can and cannot prove.

Comparative scorecard

Performance, score, benchmark difference, validity, consistency, concentration, cost, and latency are kept as separate fields.

Failure-mode register

Findings are written as decision records with severity, evidence, likely business impact, and a retest condition.

Audit packet index

The final packet links frozen inputs, raw outputs, parsed submissions, prices, hashes, calculations, and the methodology version.
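
One way a packet index could tie artifacts to hashes is a manifest of SHA-256 digests that is checked again at review time. This is a minimal sketch, not the actual CapitalBench packet format: the artifact names, the manifest layout, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(artifacts: dict[str, bytes], methodology: str) -> dict:
    """Record one digest per artifact alongside the methodology version."""
    return {
        "methodology": methodology,
        "artifacts": {name: sha256_hex(blob) for name, blob in sorted(artifacts.items())},
    }


def changed_artifacts(manifest: dict, artifacts: dict[str, bytes]) -> list[str]:
    """Names of recorded artifacts that are missing or no longer match their digest."""
    recorded = manifest["artifacts"]
    changed = []
    for name in sorted(recorded):
        blob = artifacts.get(name)
        if blob is None or sha256_hex(blob) != recorded[name]:
            changed.append(name)
    return changed


# Hypothetical artifact names and placeholder contents.
packet = {
    "frozen_inputs.json": b'{"window": "2026-05-28 to 2026-06-04"}',
    "raw_output.txt": b"placeholder model output",
}
manifest = build_manifest(packet, methodology="portfolio-v1.0")
print(json.dumps(manifest, indent=2))

tampered = {**packet, "raw_output.txt": b"edited output"}
print(changed_artifacts(manifest, packet))    # []
print(changed_artifacts(manifest, tampered))  # ['raw_output.txt']
```

Freezing digests at submission time lets a reviewer confirm later that the inputs and outputs behind a score were not altered.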

Example finding format

Finding	Claude Opus 4.8 trailed the leading sample result but still selected a valid diversified portfolio.
Evidence	Portfolio return, CapitalBench Score, holdings, and price records are linked to the audit packet.
Limitation	One weekly window is real outcome evidence, not proof of durable investment skill.
Retest condition	Run additional weekly or monthly windows after the tested configuration changes.
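
The finding format above could be held as a structured record so that incomplete entries are easy to spot. The sketch below is one possible shape, not the report's actual schema: the class name and field names are hypothetical, and severity and business impact are optional here only because the example finding does not supply them.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class Finding:
    """One decision record; field names are illustrative."""
    finding: str
    evidence: str
    limitation: str
    retest_condition: str
    severity: Optional[str] = None         # listed in the failure-mode register
    business_impact: Optional[str] = None  # listed in the failure-mode register

    def missing_fields(self) -> list[str]:
        """Names of fields that are unset or blank, in declaration order."""
        return [k for k, v in asdict(self).items() if v is None or not v.strip()]


example = Finding(
    finding=("Claude Opus 4.8 trailed the leading sample result but still "
             "selected a valid diversified portfolio."),
    evidence=("Portfolio return, CapitalBench Score, holdings, and price "
              "records are linked to the audit packet."),
    limitation=("One weekly window is real outcome evidence, not proof of "
                "durable investment skill."),
    retest_condition=("Run additional weekly or monthly windows after the "
                      "tested configuration changes."),
)

print(example.missing_fields())  # ['severity', 'business_impact']
```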