Inter-rater reliability (IRR) measures how consistently different evaluators score the same performance. High IRR means the rubric and the calibration practice yield the same score for the same performance, regardless of who is observing.
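
To make the definition concrete, here is a minimal sketch of one common IRR statistic, Cohen's kappa, which corrects the raw agreement between two raters for the agreement expected by chance. The text does not say which statistic OCTAAR uses, so the choice of kappa, the function name `cohen_kappa`, and the sample scores are assumptions for illustration only.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same performances.

    Illustrative only: the source does not specify which IRR statistic is used.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two equal-length, non-empty score lists")
    n = len(rater_a)
    # Observed agreement: share of performances given identical scores.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's marginal score distribution.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if p_e == 1.0:
        # Both raters used a single identical category throughout.
        return 1.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical rubric scores from two evaluators for four performances.
scores_a = [3, 3, 2, 2]
scores_b = [3, 3, 2, 3]
kappa = cohen_kappa(scores_a, scores_b)
```

In this made-up sample the raters agree on three of four performances, yet kappa is lower than the raw 75% agreement because some of that agreement would be expected by chance. Kappa treats scores as unordered categories; a weighted variant may suit ordinal rubric levels better.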
OCTAAR treats IRR as an operational metric, not a research artifact. It is monitored throughout the cycle, surfaced as a leading indicator, and used to trigger evaluator recalibration before scoring drift becomes a finding.
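
The monitoring idea can be sketched as a simple check over IRR values recorded at intervals during the cycle. Everything specific here is hypothetical: the function name `flag_recalibration`, the floor of 0.70, the maximum tolerated drop of 0.05, and the sample history are placeholders, not values taken from OCTAAR.

```python
def flag_recalibration(irr_by_period, floor=0.70, max_drop=0.05):
    """Return the periods whose IRR suggests evaluators need recalibration.

    A period is flagged when IRR is below `floor` or has fallen by more than
    `max_drop` since the previous period. Both thresholds are assumed values.
    """
    flagged = []
    previous = None
    for period, irr in irr_by_period:
        below_floor = irr < floor
        sharp_drop = previous is not None and (previous - irr) > max_drop
        if below_floor or sharp_drop:
            flagged.append(period)
        previous = irr
    return flagged


# Hypothetical IRR readings across one cycle.
history = [("Q1", 0.82), ("Q2", 0.80), ("Q3", 0.73), ("Q4", 0.68)]
to_recalibrate = flag_recalibration(history)
```

Flagging on the drop as well as the floor is what makes the metric a leading indicator in this sketch: Q3 is flagged while IRR is still above the floor, one period before it falls below it.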