Crosstabs: Kappa

Reading: SPSS Base 8.0 User's Guide: Chapter 13, Crosstabs
         SPSS Base 8.0 User's Guide: Chapter 6, File Handling and File Transformations - Weight Cases (pp. 111-112)
Download: kappa.sav

  1. Overview
  2. Define the Weight Variable
  3. Select Kappa in the Crosstabs Procedure
  4. The Output
  5. Computation and Interpretation of the Kappa Statistic
  6. References

See also: Crosstabs: Measures for Nominal Data
          Crosstabs: Measures for Ordinal Data

1. Overview

Kappa is a measure of agreement. It is currently gaining popularity as a measure of scorer reliability. The following example is based on Cohen (1960). Suppose that you ask 200 sets of fathers and mothers to identify which of three personality descriptions best describes their oldest child. The data are saved in the file kappa.sav. The variables in kappa.sav are given in Table 1.

Table 1. The variables in kappa.sav

    Variable   Variable Label / Value Labels
    fathers    Father's description of the personality of their oldest child
               (1 'Personality #1', 2 'Personality #2', 3 'Personality #3')
    mothers    Mother's description of the personality of their oldest child
               (1 'Personality #1', 2 'Personality #2', 3 'Personality #3')
    count      Weighting variable (the count for each cell)

This data file uses a weight variable, count, to specify the number of cases in each cell of the 3 (fathers' descriptions) x 3 (mothers' descriptions) contingency table. The values in kappa.sav are shown in Table 2.

Table 2. The values in kappa.sav

    fathers   mothers   count
       1         1        88
       1         2        10
       1         3         2
       2         1        14
       2         2        40
       2         3         6
       3         1        18
       3         2        10
       3         3        12

The count value of 88 in the first row indicates that there are 88 cases in cell 11 (fathers = 1, mothers = 1); that is, those fathers and mothers agree that personality description #1 best describes their oldest child. There are 10 cases in cell 12 (fathers = 1, mothers = 2), 2 cases in cell 13 (fathers = 1, mothers = 3), 14 cases in cell 21 (fathers = 2, mothers = 1), and so forth. Adding all the values of count gives the total number of cases, N = 200.

Note that normally you would enter the data for each case; if the weight variable were not used, the dataset would need to have 200 cases. The kappa.sav file uses only 9 rows to represent the entire set of data points. The weight variable is especially useful if you wish to reanalyze a published set of data: as in this example, you can enter the cell frequencies as a weight variable rather than entering all the individual cases.
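If you were building kappa.sav from syntax rather than the Data Editor, a minimal sketch of the data step might look like the following (variable names as in Table 1; the labels shown are one reasonable choice, not the only one):

    * Read the nine aggregated rows of the 3 x 3 table.
    DATA LIST FREE / fathers mothers count.
    BEGIN DATA
    1 1 88
    1 2 10
    1 3  2
    2 1 14
    2 2 40
    2 3  6
    3 1 18
    3 2 10
    3 3 12
    END DATA.
    VALUE LABELS fathers mothers
      1 'Personality #1' 2 'Personality #2' 3 'Personality #3'.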

The kappa statistic is found in the Crosstabs procedure, but one additional step must be taken before you open the Crosstabs dialog box: you need to tell SPSS that the variable count is to be used as a weighting variable.



2. Define the Weight Variable

Select

Data
    Weight Cases

In the dialog box select the Weight cases by radio button, move the count variable to the Frequency Variable: box, and then press OK to run the command.
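The same step can be done in syntax with a single command (and turned off again later with WEIGHT OFF.):

    * Weight each row of the file by its cell frequency.
    WEIGHT BY count.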



3. Select Kappa in the Crosstabs Procedure

Kappa is found in the Statistics option box of the Crosstabs procedure. Click on

Statistics
    Summarize
        Crosstabs...
            Statistics...
                Kappa

In this discussion of kappa the variable fathers is designated as the row variable and the variable mothers as the column variable. The cell options selected are observed counts, expected counts, and row (father) percentages. All statistics other than kappa have been suppressed.
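In syntax form, the weighting (repeated from step 2) and the crosstabulation together might look like this, with the cell and statistics keywords chosen to match the options just described:

    WEIGHT BY count.
    CROSSTABS
      /TABLES=fathers BY mothers
      /CELLS=COUNT EXPECTED ROW
      /STATISTICS=KAPPA.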



4. The Output

The frequencies table for the kappa data is shown in Table 3 and the kappa statistic output is shown in Table 4.

Table 3. Father's description of oldest child * Mother's description of oldest child Crosstabulation

                                                 Mother's description of oldest child
                                             Personality #1  Personality #2  Personality #3    Total
    Father's     Personality #1  Count             88              10               2           100
    description                  Expected          60.0            30.0            10.0         100.0
    of oldest                    Row %             88.0%           10.0%            2.0%        100.0%
    child        Personality #2  Count             14              40               6            60
                                 Expected          36.0            18.0             6.0          60.0
                                 Row %             23.3%           66.7%           10.0%        100.0%
                 Personality #3  Count             18              10              12            40
                                 Expected          24.0            12.0             4.0          40.0
                                 Row %             45.0%           25.0%           30.0%        100.0%
    Total                        Count            120              60              20           200
                                 Expected         120.0            60.0            20.0         200.0
                                 Row %             60.0%           30.0%           10.0%        100.0%

(Expected = expected count; Row % = % within father's description.)


Table 4. Kappa (Symmetric Measures)

                                       Value   Asymp. Std. Error(a)   Approx. T(b)   Approx. Sig.
    Measure of Agreement   Kappa       .492           .051                9.456           .000
    N of Valid Cases                    200

    a. Not assuming the null hypothesis.
    b. Using the asymptotic standard error assuming the null hypothesis.

The proportion of agreement after chance agreement has been excluded is about 49%: kappa (N = 200) = .492, p < .0005.



5. Computation and Interpretation of the Kappa Statistic

One way to compute an index of agreement between the parents is to find the percentage of times that they agree on the personality description. Agreements are shown in the diagonal cells (cells 11, 22, and 33) of Table 3. In this example 88 couples agree on description #1, 40 agree on #2, and 12 agree on #3, so 140 of the 200 couples, or 70%, agree on the description. But we don't know whether this is good or not, because we don't know what the level of agreement would be "by chance." The "by chance" levels of agreement are given by the expected counts for those cells. The expected counts are found in the same manner as the expected frequencies for chi-square,

E = (row total × column total) / N.

For cell 11, for example, E = (100 × 120) / 200 = 60. In the above example the sum of the expected counts in the diagonal cells (cells 11, 22, and 33) gives the expected frequency of agreement: 60 + 18 + 4 = 82. So now we can compare the observed level of agreement with the level of agreement expected by chance. Kappa provides the means for doing this. The formula for kappa is

k = (Oa - Ea)/(N - Ea)

where Oa is the observed count of agreement, Ea is the expected count of agreement, and N is the total number of respondent pairs.

k = (Oa - Ea)/(N - Ea)
  = (140 - 82)/(200 - 82)
  = 58/118
  = .492

Kappa is the proportion of agreements after chance agreement has been excluded. Its upper limit is +1.00 (total agreement). If judges agree at a chance level, k = 0.00. The lower limit of kappa depends on the distribution of row and column marginals and can fall between 0 and -1.00. (Normally we are interested in levels of agreement greater than chance rather than smaller than chance.)

If you are interested in additional reading about kappa see Cohen (1960) or Kraemer (1982).



6. References

Cohen, J. (1960). A coefficient of agreement for nominal scales. Educational and Psychological Measurement, 20, 37-46.

Kraemer, H. C. (1982). Kappa coefficient. In S. Kotz & N. L. Johnson (Eds.), Encyclopedia of Statistical Sciences. New York: John Wiley & Sons.



Lee A. Becker, 1997; revised 1998.