CB-2026-05-28-1W report preview
- Outcome window: 2026-05-28 to 2026-06-04
- Systems compared: 5
- Methodology: portfolio-v1.0
Gemini 3.1 Pro led this public sample round.
The leading portfolio returned 3.93% against an S&P 500 return of 0.33%, an excess of 3.60 percentage points. The best eligible asset was Semiconductors (SMH), with a 4.62% return.
A real private report replaces these public comparator rows with the buyer's system, repeated-run evidence, access notes, cost and latency records, and client-approved confidentiality terms.
Scorecard extract
| Metric | Value |
|---|---|
| Top model | Gemini 3.1 Pro |
| Portfolio return | 3.93% |
| CapitalBench Score | 85.1 |
| S&P 500 return | 0.33% |
| Best eligible asset | Semiconductors (SMH) |
| Oracle return | 4.62% |
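As a quick check on the scorecard figures, the arithmetic below derives two comparisons from the published numbers: the excess return over the S&P 500 and the share of the oracle return that the leading portfolio captured. The function names are illustrative and not part of portfolio-v1.0, and the capture ratio is an assumed comparison, not a documented scoring formula.

```python
# Figures copied from the scorecard extract (returns in percent).
portfolio_return = 3.93
benchmark_return = 0.33   # S&P 500
oracle_return = 4.62      # best eligible asset, Semiconductors (SMH)


def excess_return(portfolio: float, benchmark: float) -> float:
    """Difference between portfolio and benchmark, in percentage points."""
    return round(portfolio - benchmark, 2)


def oracle_capture(portfolio: float, oracle: float) -> float:
    """Share of the oracle return captured, in percent (illustrative metric)."""
    if oracle == 0:
        raise ValueError("oracle return must be non-zero")
    return round(100 * portfolio / oracle, 1)


print(excess_return(portfolio_return, benchmark_return))
print(oracle_capture(portfolio_return, oracle_return))
```

With the published figures this gives 3.6 percentage points of excess return and a capture ratio of about 85.1%. That ratio coincides with the CapitalBench Score shown in the extract, but this preview does not state how the score is calculated, so the match should not be read as the scoring formula.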
Executive summary
The sample report explains which system led the public comparator set, where returns came from, and what the result can and cannot prove.
Comparative scorecard
Performance, score, benchmark difference, validity, consistency, concentration, cost, and latency are kept as separate fields.
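One way to keep these dimensions from collapsing into a single number is a record with one field per dimension. The sketch below is hypothetical: the class name, field names, types, and units are assumptions for illustration, not the schema used by portfolio-v1.0. Only the return and score values in the example come from this preview, and dimensions the preview does not report are left unset.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass(frozen=True)
class ScorecardRow:
    """One system's results, with each dimension in its own field (assumed layout)."""
    system: str
    portfolio_return_pct: float
    score: float
    benchmark_diff_pp: float              # portfolio return minus benchmark return
    valid: Optional[bool] = None          # did the submission pass validity checks?
    consistency: Optional[float] = None   # agreement across repeated runs
    concentration: Optional[float] = None
    cost_usd: Optional[float] = None
    latency_s: Optional[float] = None


def unreported_fields(row: ScorecardRow) -> List[str]:
    """Names of dimensions that have no recorded value yet."""
    return [f.name for f in fields(row) if getattr(row, f.name) is None]


row = ScorecardRow(
    system="Gemini 3.1 Pro",
    portfolio_return_pct=3.93,
    score=85.1,
    benchmark_diff_pp=3.60,  # 3.93 - 0.33
)
print(unreported_fields(row))
```

Keeping unset dimensions as `None` instead of zero makes it visible which parts of the scorecard still lack evidence.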
Failure-mode register
Findings are written as decision records with severity, evidence, likely business impact, and a retest condition.
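A decision record of this kind could be represented as below. This is a minimal sketch under stated assumptions: the `Finding` class, the three-level `Severity` scale, and the rule that every text field must be non-empty are all hypothetical and are not taken from the methodology. The example values are placeholders.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    # Assumed three-level scale; the real register may use a different one.
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass(frozen=True)
class Finding:
    summary: str
    severity: Severity
    evidence: str
    business_impact: str
    retest_condition: str

    def __post_init__(self) -> None:
        # Reject records that leave any text field blank.
        for name in ("summary", "evidence", "business_impact", "retest_condition"):
            if not getattr(self, name).strip():
                raise ValueError(f"{name} must not be empty")


example = Finding(
    summary="Placeholder finding",
    severity=Severity.LOW,
    evidence="Placeholder link to audit packet entry",
    business_impact="Placeholder impact statement",
    retest_condition="Placeholder retest trigger",
)
```

Requiring a retest condition at construction time means a finding cannot enter the register without a stated way to revisit it.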
Audit packet index
The final packet links frozen inputs, raw outputs, parsed submissions, prices, hashes, calculations, and methodology version.
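The linking of artifacts to hashes can be illustrated with a small manifest builder. This is a sketch only: SHA-256, the manifest layout, the function names, and the artifact file names are assumptions, since the preview does not say which hash function or packet format is used. The artifact contents are placeholders.

```python
import hashlib
import json
from typing import Dict, List


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(artifacts: Dict[str, bytes], methodology_version: str) -> Dict:
    """Map each artifact name to its digest, alongside the methodology version."""
    entries = {name: sha256_hex(blob) for name, blob in sorted(artifacts.items())}
    return {"methodology_version": methodology_version, "artifacts": entries}


def changed_artifacts(manifest: Dict, artifacts: Dict[str, bytes]) -> List[str]:
    """Names of artifacts that are missing or no longer match the manifest."""
    changed = []
    for name, expected in manifest["artifacts"].items():
        blob = artifacts.get(name)
        if blob is None or sha256_hex(blob) != expected:
            changed.append(name)
    return changed


# Placeholder artifacts; real packets would read these from frozen files.
artifacts = {
    "frozen_inputs.json": b'{"window": "2026-05-28/2026-06-04"}',
    "raw_output.txt": b"placeholder model output",
}
manifest = build_manifest(artifacts, "portfolio-v1.0")
print(json.dumps(manifest, indent=2))
```

Re-running `changed_artifacts` against the stored manifest gives a reviewer a simple way to confirm that inputs and outputs were not altered after the window closed.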
Example finding format
| Field | Example |
|---|---|
| Finding | Claude Opus 4.8 trailed the leading sample result but still selected a valid diversified portfolio. |
| Evidence | Portfolio return, CapitalBench Score, holdings, and price records are linked to the audit packet. |
| Limitation | One weekly window is real outcome evidence, not proof of durable investment skill. |
| Retest condition | Rerun over additional weekly or monthly windows whenever the tested configuration changes. |