// METHODOLOGY · PILLAR 01

Calibrated rubric design.

The difference between a rubric and a calibrated rubric is the same as the difference between a ruler with the right markings and a ruler that has actually been compared to a standard meter.


A calibrated rubric is a scoring instrument whose scale, not just its wording, has been calibrated, so that each score maps to a defined, defensible performance state. It is versioned, owned, and anchored to published task standards: the instrument behind every defensible assessment.

// 01 — WHAT MAKES A RUBRIC CALIBRATED

Three properties.

Anchored. Every score on the scale corresponds to a defined performance state, tied to the operator's published task standard. Not 'good / better / best' — 'meets task standard with no margin / meets task standard with margin / exceeds task standard in measurable ways.'

Versioned. Every edit to the rubric produces a new version. Historical scores remain attached to the version that produced them. A rubric that loses version history loses calibration the moment it changes.

Owned. A rubric without a named owner drifts. Ownership means a specific person is accountable for the calibration of the instrument. Without that, the rubric is just a Google Doc.

Anchored. Versioned. Owned. Three properties. Without all three, calibration is a claim, not a state.
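The three properties can be sketched as a small data model. This is a minimal illustration, not OCTAAR's schema: the class names, the field names, the `revise` and `has_all_three_properties` helpers, the owner name, and the `TS-12` reference are all hypothetical. The level anchors are taken from the wording above.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class Level:
    score: int
    anchor: str  # defined performance state, tied to the task standard


@dataclass(frozen=True)
class Rubric:
    name: str
    version: int
    owner: str          # a named person, not a team alias
    task_standard: str  # reference to the operator's published standard
    levels: Tuple[Level, ...]

    def revise(self, levels: Tuple[Level, ...]) -> "Rubric":
        """Return a new version; the existing version is never mutated."""
        return replace(self, version=self.version + 1, levels=levels)


def has_all_three_properties(rubric: Rubric) -> bool:
    """True only when the rubric is anchored, versioned, and owned."""
    anchored = bool(rubric.levels) and all(lvl.anchor.strip() for lvl in rubric.levels)
    versioned = rubric.version >= 1
    owned = bool(rubric.owner.strip())
    return anchored and versioned and owned


# Hypothetical example data.
v1 = Rubric(
    name="Hypothetical task",
    version=1,
    owner="J. Rivera",
    task_standard="TS-12 (hypothetical reference)",
    levels=(
        Level(1, "Meets task standard with no margin"),
        Level(2, "Meets task standard with margin"),
        Level(3, "Exceeds task standard in measurable ways"),
    ),
)

# An edit produces version 2; version 1 still exists for historical scores.
v2 = v1.revise(v1.levels[:2])
```

Because both dataclasses are frozen, an edit cannot silently overwrite the version that historical scores point to. It can only produce a new one.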

// 02 — RUBRIC DESIGN UNDER OCTAAR

The substrate around the rubric.

OCTAAR does not impose a rubric. The operator's published task standards become the calibrated scale. What OCTAAR provides is the substrate: rubric authorship, version control, ownership assignment, calibration sessions, and the linkage from each score back to the rubric definition that produced it.

The substrate is what makes the rubric durable. A rubric without the substrate is a Word file that will drift within three quarters.

// 03 — WHAT NOT TO DO

Three failure modes in rubric design.

Failure mode one: the rubric scale describes effort instead of outcome. 'Demonstrated significant effort to meet the standard' is not a calibrated score. The standard either was met or was not.

Failure mode two: the rubric anchors to ambiguous language. 'Adequate,' 'satisfactory,' 'effective' — each invites observer interpretation. Calibrated language anchors to observable behavior or measurable result.

Failure mode three: the rubric is too long. A 40-item rubric administered under field conditions degrades into observer summary judgment. The calibrated short rubric beats the comprehensive long one.
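The three failure modes lend themselves to a simple automated check on a draft rubric. The sketch below is illustrative only: the ambiguous terms come from the text above, while the effort terms, the `max_items` default, and the `lint_rubric` name are assumptions. A real term list and length limit would be set by the rubric owner.

```python
import re
from typing import Dict, List

AMBIGUOUS_TERMS = ("adequate", "satisfactory", "effective")  # from the text above
EFFORT_TERMS = ("effort", "tried", "attempted")  # assumption: illustrative list


def lint_rubric(items: Dict[str, List[str]], max_items: int = 12) -> List[str]:
    """Return findings for a rubric given as {item_id: [anchor text, ...]}.

    max_items is a hypothetical limit, not a published threshold.
    """
    findings = []
    if len(items) > max_items:
        findings.append(f"rubric has {len(items)} items; limit is {max_items}")
    for item, anchors in items.items():
        for anchor in anchors:
            words = set(re.findall(r"[a-z]+", anchor.lower()))
            for term in AMBIGUOUS_TERMS:
                if term in words:
                    findings.append(f"{item}: ambiguous term '{term}'")
            for term in EFFORT_TERMS:
                if term in words:
                    findings.append(f"{item}: effort language '{term}'")
    return findings


draft = {
    "item-1": ["Demonstrated significant effort to meet the standard"],
    "item-2": ["Communication was adequate"],
    "item-3": ["Completed the task within the published time limit"],
}
findings = lint_rubric(draft)
# findings == ["item-1: effort language 'effort'",
#              "item-2: ambiguous term 'adequate'"]
```

A word-list check catches only the obvious cases. It supports the owner's review of the language and does not replace it.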

// 04 — RUBRIC CALIBRATION SESSIONS

How the instrument gets calibrated.

Pre-cycle calibration session: evaluators review recorded or live performances, score them, compare scores, discuss divergence, and agree on an anchored interpretation. The output is a shared understanding of what the rubric means for this cycle.

Mid-cycle calibration check: a small subset of in-cycle observations is second-scored, meaning each is scored again by a second evaluator. Variance is surfaced. Out-of-tolerance scoring is flagged.
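A mid-cycle check of this kind reduces to comparing paired scores against a tolerance. The sketch below assumes integer scores and a hypothetical tolerance of one scale point. The record layout and function names are illustrative, not an OCTAAR interface.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DoubleScored:
    observation_id: str
    evaluator: str      # evaluator who gave the primary score
    primary_score: int
    second_score: int   # score from the second evaluator

    @property
    def gap(self) -> int:
        return abs(self.primary_score - self.second_score)


def mean_gap(sample: List[DoubleScored]) -> float:
    """Average absolute difference between primary and second scores."""
    gaps = [s.gap for s in sample]
    return sum(gaps) / len(gaps)


def out_of_tolerance(sample: List[DoubleScored], tolerance: int = 1) -> List[DoubleScored]:
    """Observations whose two scores differ by more than the tolerance."""
    return [s for s in sample if s.gap > tolerance]


# Hypothetical sample of second-scored observations.
sample = [
    DoubleScored("obs-1", "evaluator-a", 3, 3),
    DoubleScored("obs-2", "evaluator-a", 2, 4),
    DoubleScored("obs-3", "evaluator-b", 4, 3),
]

flagged = out_of_tolerance(sample)   # only obs-2, whose gap is 2
average = mean_gap(sample)           # (0 + 2 + 1) / 3 == 1.0
```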

Post-cycle calibration review: aggregate inter-rater reliability (IRR) is examined. Drift events at the evaluator-pool level become findings. The rubric itself can become a finding: if scoring divergence concentrates on a specific item, the item may need recalibration or revision.
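One way to examine aggregate inter-rater reliability and to see whether divergence concentrates on one item is sketched below. Cohen's kappa is a standard agreement statistic for two raters. Its use here, the 0.7 agreement threshold, and the record layout are assumptions for illustration, not a statement of which statistic OCTAAR computes.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple

Record = Tuple[str, int, int]  # (item_id, first_score, second_score)


def cohen_kappa(pairs: List[Tuple[int, int]]) -> float:
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    n = len(pairs)
    if n == 0:
        raise ValueError("no score pairs")
    observed = sum(a == b for a, b in pairs) / n
    first = Counter(a for a, _ in pairs)
    second = Counter(b for _, b in pairs)
    expected = sum(first[c] * second[c] for c in first) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical category throughout
    return (observed - expected) / (1.0 - expected)


def agreement_by_item(records: List[Record]) -> Dict[str, float]:
    """Share of exact-match score pairs for each rubric item."""
    totals: Dict[str, int] = defaultdict(int)
    agreed: Dict[str, int] = defaultdict(int)
    for item, a, b in records:
        totals[item] += 1
        agreed[item] += int(a == b)
    return {item: agreed[item] / totals[item] for item in totals}


def items_needing_review(records: List[Record], min_agreement: float = 0.7) -> List[str]:
    """Items whose agreement falls below a (hypothetical) threshold."""
    rates = agreement_by_item(records)
    return sorted(item for item, rate in rates.items() if rate < min_agreement)


# Hypothetical double-scored records from one cycle.
records = [
    ("item-1", 3, 3), ("item-1", 2, 2), ("item-1", 4, 4),
    ("item-2", 2, 4), ("item-2", 3, 1), ("item-2", 3, 3),
]

kappa = cohen_kappa([(a, b) for _, a, b in records])  # 7/13, about 0.54
candidates = items_needing_review(records)            # ["item-2"]
```

In this sample every disagreement falls on `item-2`. That pattern suggests the item, rather than the evaluators, is what needs recalibration or revision.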

// READ NEXT

Evaluator calibration guide

The other half of the instrument.

Methodology overview

All six pillars.

Glossary: calibrated rubric

Canonical definition.

Audit-defensible assessment

Why versioned rubrics matter under audit.

// Last updated · May 19, 2026 · OCTAAR Methodology Team

// FAQ

Direct answers.

How many scoring levels should a rubric have?

Domain-dependent. Three to five anchored levels typically outperform seven or more. The constraint is observer reliability under field conditions, not theoretical resolution.

Can we use a rubric we already wrote?

Yes. The first action OCTAAR runs on an inherited rubric is calibration: review the language for anchoring, the scale for outcome-vs-effort framing, the items for length, and the ownership for accountability. Most inherited rubrics survive that review with edits, not replacement.

Who owns a rubric inside OCTAAR?

A named person designated by the operator. Ownership includes responsibility for calibration cadence, version review, and finding-triggered revision. The owner is part of the rubric's metadata.

What happens to old scores when the rubric is revised?

They stay attached to the version that produced them. Cross-version comparison is possible where the items map cleanly; where they do not, the comparison is flagged. Versioning is what preserves defensibility under audit.
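The behavior described in this answer can be sketched as follows. It assumes each score record carries the rubric version that produced it and that the owner maintains an explicit item mapping between versions. The `Score` fields, the mapping table, and the `compare` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Score:
    subject: str
    rubric_version: int  # the version that produced this score
    item_id: str
    value: int


# Hypothetical mapping from version 1 items to version 2 items.
# None means the item has no clean equivalent in the newer version.
ITEM_MAP_V1_TO_V2: Dict[str, Optional[str]] = {
    "item-1": "item-1",
    "item-2": None,
}


def compare(old: Score, new: Score, item_map: Dict[str, Optional[str]]) -> dict:
    """Compare scores across versions; flag pairs whose items do not map."""
    mapped = item_map.get(old.item_id)
    comparable = mapped is not None and mapped == new.item_id
    return {
        "delta": new.value - old.value if comparable else None,
        "flagged": not comparable,
    }


clean = compare(
    Score("trainee-7", 1, "item-1", 2),
    Score("trainee-7", 2, "item-1", 3),
    ITEM_MAP_V1_TO_V2,
)  # {"delta": 1, "flagged": False}

unmapped = compare(
    Score("trainee-7", 1, "item-2", 2),
    Score("trainee-7", 2, "item-2", 3),
    ITEM_MAP_V1_TO_V2,
)  # {"delta": None, "flagged": True}
```

The old score is never rewritten to fit the new version. The comparison either uses the mapping or reports that none exists.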
