Datafile: matching.sav (Download Tips) 
I. Overview
The purpose of the blocking, matching, analysis of covariance, and gain score procedures is to control for unwanted differences in pretreatment scores. These procedures are typically used in quasi-experimental designs because pretest differences are more likely to occur when participants are not randomly assigned to conditions. However, these procedures can also be used in experimental designs, where participants have been randomly assigned to conditions.
When there is a correlation between pretest and posttest scores, all of these procedures reduce error variance. The consequence is that these procedures will increase the power of your statistical test.
The blocking procedure selects treatment and control participants who are similar on their pretest scores and then analyzes the posttest scores of those selected in a between-subjects design (e.g., an independent t test or an analysis of variance with group (experimental vs. control) as the independent variable).
The matching procedure uses the pretest scores to pair an experimental participant with a control participant. The posttest scores of the matched pairs are analyzed in a within-subjects design (e.g., a dependent t test or an analysis of variance with group (experimental vs. control) as a repeated measures independent variable).
II. How to Match
A. Exact Matching
1. The first step is to rank order the participants according to their pretest scores. You can rank in either ascending (lowest to highest) or descending (highest to lowest) order. The ranking should be done within each of the conditions in the study.
Example Data
The data used in this example are stored in the file matching.dat. If you wish to download the data file, see the download instructions in the outline at the beginning of this set of notes. The ranked data from the experimental group are shown in Table 1; the data from the control group are shown in Table 2. In both tables the pretest scores have been ranked in ascending order, from the lowest to the highest score.
Table 1. Experimental (treatment) Participants

Table 2. Control Participants

2. Take the person with the lowest score in the experimental group and match that person with someone from the control group who has an identical pretest score. If more than one person in the control group has an identical score, randomly select one of them. Then take the experimental person with the next lowest score and match that person with someone from the control group who has an identical pretest score. Continue this process until all experimental participants have been matched with a control person or no identical control score remains.
In our data example the experimental scores begin at 9. None of the control participants with pretest scores less than 9 will be used in this matching procedure.
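The two steps above can be sketched in a few lines of Python. This is only an illustration, not the procedure used to produce the tables in these notes. The treatment pretest scores are those listed in Table 3. The control pool is an assumption: it contains only the 11 control participants whose scores appear in Tables 3 and 4, so it is an incomplete stand-in for Table 2. The function name exact_match and the fixed random seed are hypothetical choices made for the example.

```python
import random
from collections import defaultdict

# (ID, pretest score) pairs. Treatment values are taken from Table 3.
# The control pool lists only the control participants that appear in
# Tables 3 and 4 (an assumption; it is not the full control group).
treatment = [(17, 9), (20, 9), (21, 9), (24, 9), (25, 9), (26, 11), (28, 11),
             (18, 12), (22, 12), (23, 12), (27, 12), (30, 13), (19, 14),
             (16, 15), (29, 15)]
control = [(1, 10), (2, 11), (3, 11), (5, 8), (9, 11), (10, 8), (11, 9),
           (12, 9), (13, 9), (14, 11), (15, 9)]


def exact_match(treatment, control, rng):
    """Pair each treatment participant with a random, unused control
    participant who has the same pretest score (None if no one is left)."""
    pool = defaultdict(list)
    for cid, score in control:
        pool[score].append(cid)
    for ids in pool.values():
        rng.shuffle(ids)  # shuffled order gives random selection among ties

    pairs = []
    # Step 1: work through the treatment group in ascending pretest order.
    for tid, score in sorted(treatment, key=lambda p: p[1]):
        # Step 2: take one unused control with an identical pretest score.
        candidates = pool.get(score)
        cid = candidates.pop() if candidates else None
        pairs.append((tid, cid, score))
    return pairs


pairs = exact_match(treatment, control, random.Random(0))
matched = [p for p in pairs if p[1] is not None]
print(f"{len(matched)} of {len(pairs)} treatment participants matched")
# prints: 6 of 15 treatment participants matched
```

Which control participant ends up in each pair depends on the random draw, but the number of exact matches does not: it is fixed by how many identical pretest scores the two groups share.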
You could make a table to keep track of the matching process:
Table 3. Exact Matching
Control participants were randomly chosen from the available data.

Treatment   Control     Treatment   Treatment   Control
ID #        ID #        pretest     posttest    posttest
---------   ---------   ---------   ---------   --------
17          12          9           13          9
20          11          9           10          13
21          13          9           11          12
24          15          9           11          6
25          no match    9           7
26          2           11          15          13
28          3           11          8           8
18          no match    12          11
22          no match    12          15
23          no match    12          9
27          no match    12          9
30          no match    13          10
19          no match    14          14
16          no match    15          12
29          no match    15          15
In the experimental group there are 5 pretest scores of "9." In the control group there are only 4 pretest scores of "9." Therefore only four of the five "9"s will be matched. In the experimental group there are 2 pretest scores of "11." There are 4 pretest scores of "11" in the control group, but only two will be randomly chosen as matches. Posttest scores of treatment participants with no match are not used in the data analysis.
The scores that will be entered into the computer program to be analyzed will be the two posttest scores. Data from cases without a match will not be entered into the analysis. You can use a paired t test or a repeated measures analysis of variance to analyze the posttest scores. The results could be reported as follows:
Pretest differences between the treatment and control groups were controlled by an exact matching procedure. Exact matching was accomplished for 40% of the participants (6 of the 15 possible pairs). After matching there was no significant difference between the posttest scores of the experimental group (M = 11.33, SD = 2.42) and the control group (M = 10.17, SD = 2.93), t(5) = 0.93, p = .393. This result should be treated with caution due to the large number of participants who could not be matched.
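As a check on the report above, the paired t statistic can be recomputed from the six matched pairs in Table 3. The sketch below uses only the Python standard library, so it gives the t value and degrees of freedom but not the p value; a statistics package (for example scipy.stats.ttest_rel, if SciPy is available) would be needed for that. The function name paired_t is a hypothetical helper.

```python
import math
import statistics

# Posttest scores of the six exactly matched pairs, in Table 3 row order.
treatment_post = [13, 10, 11, 11, 15, 8]
control_post = [9, 13, 12, 6, 13, 8]


def paired_t(x, y):
    """Dependent (paired) t test: mean difference over its standard error."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)  # stdev uses n - 1
    return statistics.mean(diffs) / se, n - 1


t, df = paired_t(treatment_post, control_post)
print(f"t({df}) = {t:.2f}")
# prints: t(5) = 0.93
```

The test works on the difference within each pair, which is why the order of the two lists must follow the pairing in the table.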
Q: What happens if you have more participants in one group than the other?
A: The "extra" participants are dropped from consideration. If you have 10 people in one group and 15 people in the other, then, at most, you can have only 10 pairs of data to analyze.
Q: What happens if there are not exact matches for everyone?
A: In reality, it is unlikely that you would find an exact match for everyone. How likely is it that the distribution of scores for one group will exactly match the distribution of scores for the other group? The probability is near zero. Let's look at several possibilities.
First, consider the situation where the distributions have little overlap. There will be some cases in each group that are out of range of the cases from the other group. Those cases will have no matches and they will be discarded from the analysis.
In this example, the pretest score overlap between the two groups is not very large. Only 6 of the 15 scores can be matched. The data loss is so large that exact matching does not seem like a reasonable approach. The loss of power is too high, and any "unusual" data points could have a large influence on the statistical test.
Second, consider those cases where the distributions have moderate overlap. Even where the distributions overlap it is unlikely that you will be able to find an exact match for everyone. One alternative is to use caliper matching.
B. Caliper Matching
In caliper matching you establish a range of scores that you are willing to consider as "close enough for a match." For example, if you are matching IQ scores, there is not much difference between a score of 105 and a score of 104, so you would probably be willing to say that those two scores are about equally matched. For IQ you might be willing to match any score that is within plus or minus 5 points of any given score.
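A minimal sketch of caliper matching in Python follows. It is one simple way to do it (a single pass through the treatment group in ascending pretest order, drawing a random unused control within the caliper), not necessarily how the tables in these notes were produced. The caliper of 1 pretest point, the function name caliper_match, and the random seed are assumptions for the example. As before, the control pool contains only the control participants whose scores appear in Tables 3 and 4.

```python
import random

# (ID, pretest score) pairs. Treatment values are taken from the notes.
# The control pool is incomplete (an assumption): only the control
# participants that appear in Tables 3 and 4 are listed.
treatment = [(17, 9), (20, 9), (21, 9), (24, 9), (25, 9), (26, 11), (28, 11),
             (18, 12), (22, 12), (23, 12), (27, 12), (30, 13), (19, 14),
             (16, 15), (29, 15)]
control = [(1, 10), (2, 11), (3, 11), (5, 8), (9, 11), (10, 8), (11, 9),
           (12, 9), (13, 9), (14, 11), (15, 9)]


def caliper_match(treatment, control, caliper, rng):
    """Pair each treatment participant with a random, unused control
    participant whose pretest score is within +/- caliper points."""
    available = dict(control)  # control ID -> pretest score, still unused
    pairs = []
    for tid, score in sorted(treatment, key=lambda p: p[1]):
        candidates = [cid for cid, c in available.items()
                      if abs(c - score) <= caliper]
        cid = rng.choice(candidates) if candidates else None
        if cid is not None:
            del available[cid]  # each control can be used only once
        pairs.append((tid, cid))
    return pairs


pairs = caliper_match(treatment, control, caliper=1, rng=random.Random(0))
matched = [p for p in pairs if p[1] is not None]
print(f"{len(matched)} of {len(pairs)} treatment participants matched")
```

Setting caliper=0 reduces this to exact matching. Because controls are drawn at random and each one can be used only once, the number of pairs can differ slightly from run to run, and a single pass like this does not guarantee the largest possible number of pairs.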
Let's try caliper matching for this set of data where the width of the caliper is defined as the pretest score ± 1. The experimental data are reproduced in Table 4.
Table 4. Caliper Matching
Control participants were randomly chosen; the control participant's pretest score is shown in parentheses.

Treatment   Control ID #   Treatment   Caliper range   Treatment   Control
ID #        (pretest)      pretest     (score ± 1)     posttest    posttest
---------   ------------   ---------   -------------   ---------   --------
17          12 (9)         9           8-10            13          9
20          10 (8)         9           8-10            10          9
21          13 (9)         9           8-10            11          12
24          11 (9)         9           8-10            11          13
25          5 (8)          9           8-10            7           9
26          14 (11)        11          10-12           15          8
28          1 (10)         11          10-12           8           13
18          2 (11)         12          11-13           11          13
22          3 (11)         12          11-13           15          8
23          9 (11)         12          11-13           9           8
27          no match       12          11-13           9
30          no match       13          12-14           10
19          no match       14          13-15           14
16          no match       15          14-16           12
29          no match       15          14-16           15
Caliper matching with a range of ± 1 produced matched scores for 10 of the 15 experimental participants, compared with 6 of 15 for exact matching. Note that cases without a match (ID #s 27, 30, 19, 16, and 29 in Table 4) are not used in the statistical analysis.
The results of the analysis might be reported as follows:
Preexisting differences between the experimental and control groups were controlled by using caliper matching with a caliper width equal to the pretest score ± 1. Ten of the 15 possible pairs of data (67%) remained after matching. After caliper matching there was no difference between the posttest scores of the experimental group (M = 11.00, SD = 2.71) and the control group (M = 10.20, SD = 2.25), t(9) = 0.63, p = .548.
Matching can reduce the error variance, making your test more sensitive to differences between the groups. The magnitude of the reduction in error variance is partly a function of (a) the width of the caliper you use and (b) the correlation between the pretest and posttest scores. The narrower the caliper, the greater the reduction in error variance, but a narrower caliper also leaves fewer matched pairs. So you should choose the smallest caliper that still gives you an adequate number of pairs. The larger the correlation between the pretest and posttest scores, the smaller the error variance after matching.
1998, 1999, Lee A. Becker  03/15/99