13 Jan 2026
Inter-Rater Variability in CTCAE Grading
Introduction: The Invisible Source of Noise
Inter-rater variability is one of the least discussed yet most influential factors in oncology safety data. Two experienced clinicians can interpret the same clinical scenario differently, particularly at the boundaries between CTCAE grades.
These differences rarely reflect negligence. They emerge from how symptoms are elicited, documented, and interpreted under real-world constraints.
Where Variability Comes From
CTCAE definitions often hinge on functional impact, such as interference with activities of daily living. Patients describe limitations differently, clinicians probe with varying depth, and documentation styles differ across sites. Small differences early in this chain lead to divergent grades later.
Local practice norms also shape grading culture. What one site records as "moderate" (Grade 2) may be recorded as "severe" (Grade 3) at another.
Why Variability Matters
Inter-rater variability introduces noise into trial safety data. It can exaggerate or mask toxicity signals, complicate cross-site comparisons, and increase the burden of data queries. Importantly, variability is rarely random; it clusters by site, role, or experience level.
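One rough way to see whether grading clusters by site is to compare, for a single adverse event term, the share of reports each site grades at 3 or higher. The sketch below assumes records are available as simple (site, grade) pairs. The function name, the site labels, and the data are all hypothetical and made up for illustration. A gap between sites is only a prompt for review, since it can also reflect real differences in patient mix.

```python
from collections import defaultdict


def high_grade_share_by_site(records, threshold=3):
    """Return, per site, the fraction of reports graded at or above `threshold`.

    `records` is an iterable of (site, grade) pairs for one adverse event term.
    """
    totals = defaultdict(int)
    high = defaultdict(int)
    for site, grade in records:
        totals[site] += 1
        if grade >= threshold:
            high[site] += 1
    return {site: high[site] / totals[site] for site in totals}


# Made-up values for illustration only, not real trial data.
example_records = [
    ("Site A", 2), ("Site A", 2), ("Site A", 3), ("Site A", 1),
    ("Site B", 3), ("Site B", 3), ("Site B", 2), ("Site B", 3),
]

shares = high_grade_share_by_site(example_records)
for site, share in sorted(shares.items()):
    print(f"{site}: {share:.0%} of reports graded {3} or higher")
```

The same grouping could be repeated by role or experience level if those fields are recorded.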
Measuring Rather Than Ignoring
Treating inter-rater variability as inevitable is a missed opportunity. Agreement metrics and structured review processes can identify where grading diverges most, and targeted training can then address those areas. Visibility is the first step toward improvement.
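As one example of an agreement metric, Cohen's kappa compares how often two raters agree with how often they would agree by chance. The sketch below is a minimal pure-Python version, assuming two clinicians have independently graded the same set of events on the CTCAE 1 to 5 scale. The function name and the sample grades are hypothetical. The optional quadratic weighting penalizes a Grade 1 versus Grade 3 disagreement more than a Grade 2 versus Grade 3 one, which suits an ordinal scale.

```python
def cohen_kappa(rater_a, rater_b, categories, weighted=False):
    """Cohen's kappa for two raters grading the same items.

    `categories` lists every possible grade in order. With weighted=True,
    disagreements are penalized by squared distance between grades
    (quadratic weights); otherwise every disagreement counts equally.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must grade the same non-empty set of items")
    n = len(rater_a)
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}

    # Cross-tabulate the two raters' grades.
    observed = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        if weighted:
            return ((i - j) / (k - 1)) ** 2
        return 0.0 if i == j else 1.0

    # Observed disagreement, and the disagreement expected by chance
    # given each rater's own grade distribution.
    obs = sum(weight(i, j) * observed[i][j] for i in range(k) for j in range(k)) / n
    exp = sum(weight(i, j) * row[i] * col[j] for i in range(k) for j in range(k)) / n**2
    if exp == 0:
        return 1.0  # both raters used one identical grade throughout
    return 1.0 - obs / exp


# Hypothetical grades from two clinicians for the same eight events.
grades_a = [1, 2, 2, 3, 3, 1, 2, 3]
grades_b = [1, 2, 3, 3, 2, 1, 2, 3]
ctcae_grades = [1, 2, 3, 4, 5]

kappa = cohen_kappa(grades_a, grades_b, ctcae_grades)
kappa_w = cohen_kappa(grades_a, grades_b, ctcae_grades, weighted=True)
print(f"unweighted kappa: {kappa:.3f}, quadratic-weighted kappa: {kappa_w:.3f}")
```

Computing the metric separately per adverse event term or per site shows where disagreement concentrates, rather than collapsing everything into a single trial-wide number.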
The Role of Decision Support
Human-in-the-loop tools can reduce variability by consistently surfacing definitions, evidence, and comparable cases. They do not eliminate judgment, but they create a more standardized starting point. This shifts variability from hidden noise to explicit, reviewable decisions.