Large sample standard errors of kappa and weighted kappa.
Abstract
The statistics kappa (Cohen, 1960) and weighted kappa (Cohen, 1968) were introduced to provide coefficients of agreement between two raters for nominal scales. Kappa is appropriate when all disagreements may be considered equally serious, and weighted kappa is appropriate when the relative seriousness of the different possible disagreements can be specified. The papers describing these two statistics also present expressions for their standard errors. These expressions are incorrect, having been derived from the contradictory assumptions of fixed marginal totals and binomial variation of cell frequencies. Everitt (1968) derived the exact variances of weighted and unweighted kappa when the parameters are zero by assuming a generalized hypergeometric distribution. He found these expressions to be far too complicated for routine use, and offered, as alternatives, expressions derived by assuming binomial distributions. These alternative expressions are incorrect, essentially for the same reason as above. Assume that N subjects are distributed into k² cells by each of them being assigned to one of k categories by one rater and, independently, to one of the same k categories by a second rater.
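
For readers unfamiliar with the two coefficients, the following is a minimal sketch of how kappa and weighted kappa can be computed from a k-by-k table of counts (rows for the first rater, columns for the second). It uses the standard textbook definitions, with weighted kappa written in terms of disagreement weights. The function names, the example table, and the weights are illustrative assumptions and do not come from the paper. The sketch covers only the point estimates and does not attempt to reproduce the large-sample standard errors that the paper derives.

```python
def _proportions(table):
    """Return cell proportions plus row and column marginal proportions."""
    k = len(table)
    n = sum(sum(row) for row in table)
    p = [[table[i][j] / n for j in range(k)] for i in range(k)]
    row_marg = [sum(p[i]) for i in range(k)]
    col_marg = [sum(p[i][j] for i in range(k)) for j in range(k)]
    return p, row_marg, col_marg


def cohen_kappa(table):
    """Unweighted kappa: (p_o - p_e) / (1 - p_e)."""
    p, row_marg, col_marg = _proportions(table)
    k = len(table)
    p_o = sum(p[i][i] for i in range(k))                    # observed agreement
    p_e = sum(row_marg[i] * col_marg[i] for i in range(k))  # chance agreement
    return (p_o - p_e) / (1 - p_e)


def weighted_kappa(table, disagreement):
    """Weighted kappa from disagreement weights (0 on the diagonal).

    Computed as 1 - (weighted observed disagreement) /
    (weighted disagreement expected by chance).
    """
    p, row_marg, col_marg = _proportions(table)
    k = len(table)
    observed = sum(disagreement[i][j] * p[i][j]
                   for i in range(k) for j in range(k))
    expected = sum(disagreement[i][j] * row_marg[i] * col_marg[j]
                   for i in range(k) for j in range(k))
    return 1 - observed / expected


# Hypothetical 2x2 table of counts, N = 50 subjects.
example_table = [[20, 5],
                 [10, 15]]

# Equal weights for every disagreement: weighted kappa reduces to kappa.
equal_weights = [[0, 1],
                 [1, 0]]

kappa = cohen_kappa(example_table)
kappa_w = weighted_kappa(example_table, equal_weights)
```

When every disagreement receives the same weight, as in `equal_weights`, the weighted coefficient coincides with unweighted kappa, which matches the abstract's description of kappa as the case where all disagreements are equally serious.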