Goal: quantify agreement between two raters beyond chance.
Example (eight items):
Rater 1: yes no yes yes no yes no yes
Rater 2: yes no no yes no yes no yes
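For reference, unweighted Cohen's kappa is kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and p_e is the agreement expected by chance from the two raters' marginal label frequencies. A minimal Python sketch applied to the example above (the function name cohen_kappa is illustrative, not necessarily this page's API):

    from collections import Counter

    def cohen_kappa(ratings1, ratings2):
        """Unweighted Cohen's kappa for two equal-length label sequences."""
        if len(ratings1) != len(ratings2) or not ratings1:
            raise ValueError("need two non-empty sequences of equal length")
        n = len(ratings1)
        # Observed agreement: fraction of items labelled identically.
        p_o = sum(a == b for a, b in zip(ratings1, ratings2)) / n
        # Chance agreement: sum over categories of the product of the two
        # raters' marginal proportions.
        m1, m2 = Counter(ratings1), Counter(ratings2)
        p_e = sum(m1[c] * m2[c] for c in set(m1) | set(m2)) / (n * n)
        if p_e == 1.0:
            # Both raters used a single, identical category; treat as perfect agreement.
            return 1.0
        return (p_o - p_e) / (1 - p_e)

    rater1 = ["yes", "no", "yes", "yes", "no", "yes", "no", "yes"]
    rater2 = ["yes", "no", "no", "yes", "no", "yes", "no", "yes"]
    print(cohen_kappa(rater1, rater2))  # 0.75

Worked by hand for the eight items above: p_o = 7/8 = 0.875, p_e = (5*4 + 3*4)/64 = 0.5, so kappa = (0.875 - 0.5) / (1 - 0.5) = 0.75.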
Use Cohen's kappa for categorical labels assigned by two raters (or two classification methods) to the same set of items.
Kappa is sensitive to category prevalence and to the raters' marginal distributions: two datasets with the same observed agreement can yield very different kappa values. Consider reporting the confusion table alongside the statistic.
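A sketch of printing the 2x2 confusion table for the example ratings (the layout and variable names are illustrative):

    from collections import Counter

    rater1 = ["yes", "no", "yes", "yes", "no", "yes", "no", "yes"]
    rater2 = ["yes", "no", "no", "yes", "no", "yes", "no", "yes"]

    # Cross-tabulate: rows are Rater 1's labels, columns are Rater 2's.
    counts = Counter(zip(rater1, rater2))
    labels = ["yes", "no"]
    print("R1\\R2  " + "  ".join(f"{c:>3}" for c in labels))
    for r in labels:
        print(f"{r:>5}  " + "  ".join(f"{counts.get((r, c), 0):>3}" for c in labels))

The row and column totals are the marginals; they show each rater's label prevalence, which is what the chance-agreement term p_e depends on.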
This page implements unweighted Cohen's kappa. Weighted kappa, which gives partial credit for near-misses between ordered categories, can be added later.