Extending Statistical Packages to Perform Optimal Matching
For case-control studies, prospective studies with a treatment and a
control group, and studies of a few other types, a statistician
may wish to match individuals from one predesignated group, the cases
or the treatment group, to similar members of another group of control
subjects. This task is known as bipartite matching.
Although computer algorithms to solve bipartite matching problems
optimally have been available for some time, relatively few software
packages implement these algorithms in a manner readily adaptable to
statistical uses. I am aware of two: Eric Bergstralh and Jon
Kosanke's SAS macro, ``vmatch,'' which supports optimal matching with a
variable number of controls; and my add-on package to R, ``optmatch,''
which supports matching with a variable number of controls and full
matching.
vmatch and related macros for SAS
A routine to match cases (treated units) to some number of controls,
which may vary in a fashion specified by the user. Requires SAS software.
Please visit the
Statistical Software
section of the Mayo Clinic's Web Site
for more information and for the macros. Bergstralh and Kosanke's
SAS macros for optimal matching have been available for some years
now, and they appear to lack the usage restrictions of my package; on
the other hand, they handle a somewhat narrower range of problems
than does my package.
Optmatch add-on package for R
This package
contains a function
that groups treatment units (cases) and controls into matched sets
that may contain multiple controls and one treatment unit (case), a
treatment-control (case-control) pair, or one control and several
treatment units or cases, with optional restrictions on the balance
between units of the two types in the matched sets to be formed. By
setting these restrictions appropriately, the function can be made to
perform full matching, matching with a varying number of controls,
matching with multiple controls, or pair matching. Requires
R software (which is free).
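As a rough illustration of how the balance restrictions select among these matching structures, the sketch below assumes a reasonably recent release of optmatch in which match_on(), fullmatch() and pairmatch() are available, along with the nuclearplants example data whose pr column marks the treated units. The choice of covariates (cap and t1) and of the bounds 1 and 3 is arbitrary and only for demonstration.

```r
library(optmatch)
data(nuclearplants)  # example data shipped with optmatch; pr = 1 marks treated units

# Treated-by-control distance matrix on two covariates (arbitrary choice).
dist <- match_on(pr ~ cap + t1, data = nuclearplants)

# Full matching: no restriction on the makeup of the matched sets.
fm <- fullmatch(dist, data = nuclearplants)

# Matching with a varying number of controls: each treated unit
# receives between 1 and 3 controls.
vm <- fullmatch(dist, min.controls = 1, max.controls = 3,
                data = nuclearplants)

# Pair matching: exactly one control per treated unit.
pm <- pairmatch(dist, data = nuclearplants)

# Each result is a factor naming the matched set of every unit;
# units left unmatched are NA.
summary(fm)
```

The three calls differ only in the bounds placed on the number of controls per treated unit, which is the sense in which one function covers several kinds of matching.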
The package solves matching problems by
translating them into
minimum-cost flow problems, which are in turn solved optimally by the
RELAX-IV codes
of Bertsekas and Tseng. Bertsekas and Tseng permit their codes to be
used freely for research, but ask that agreement be secured with them
before using the codes for commercial or other purposes; this requirement
extends to users of the optmatch R package.
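To make concrete what solving a matching problem "optimally" buys, here is a toy comparison in base R, with made-up covariate values for three treated units and three controls. It is not how the package works internally: exhaustive enumeration stands in for the minimum-cost flow solver, which is only feasible at toy sizes, and the helper names (greedy_match, all_perms, optimal_match, total_cost) are invented for this example.

```r
# Made-up covariate values for 3 treated units and 3 controls.
treated  <- c(t1 = 5, t2 = 3, t3 = 10)
controls <- c(c1 = 4, c2 = 7, c3 = 12)

# Distance matrix: rows are treated units, columns are controls.
dist_mat <- abs(outer(treated, controls, "-"))

# Greedy: each treated unit, in order, takes its nearest unused control.
greedy_match <- function(d) {
  used <- logical(ncol(d))
  pick <- integer(nrow(d))
  for (i in seq_len(nrow(d))) {
    costs <- d[i, ]
    costs[used] <- Inf
    pick[i] <- which.min(costs)
    used[pick[i]] <- TRUE
  }
  pick
}

# All orderings of a vector, built recursively (toy sizes only).
all_perms <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in all_perms(v[-i])) out[[length(out) + 1]] <- c(v[i], p)
  }
  out
}

# Total distance of a pairing, where pick[i] is the control given to treated unit i.
total_cost <- function(d, pick) sum(d[cbind(seq_len(nrow(d)), pick)])

# Exhaustive search over all pairings; assumes a square distance matrix.
optimal_match <- function(d) {
  perms  <- all_perms(seq_len(ncol(d)))
  totals <- vapply(perms, function(p) total_cost(d, p), numeric(1))
  perms[[which.min(totals)]]
}

greedy_cost  <- total_cost(dist_mat, greedy_match(dist_mat))   # 7
optimal_pick <- optimal_match(dist_mat)                        # controls 2, 1, 3
optimal_cost <- total_cost(dist_mat, optimal_pick)             # 5
```

The greedy rule gives t1 its closest control, c1, which forces t2 onto a poor match; the optimal pairing accepts a slightly worse match for t1 and lowers the total distance from 7 to 5.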
Extending R to include the optmatch package
It's easiest to
install if you have admin access to the machine, in which case it
should do the trick to log in as root (Linux, Solaris or OS X) or as a user
with administrative privileges (Windows), start R, then enter
> install.packages("optmatch")
at the R command line. If you don't have
an account with administrative privileges, you'll need to set up a
temporary or user-specific library tree and invoke install.packages with a
``lib='' argument. (The R GUI for Mac OS X does this automatically if you
install using the Installation
Manager tool.) This method of installation retrieves the package from
CRAN, where both source codes for the package and precompiled binary
versions (suitable for recent versions of R) are kept; source codes for
current and earlier versions of the package are also kept
here.
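For the case without administrative privileges, a minimal sketch of the user-specific library approach might look like the following. The directory name is a hypothetical choice; any location you can write to should serve.

```r
# Hypothetical user-specific library directory; any writable path will do.
user_lib <- file.path(path.expand("~"), "R", "library")
dir.create(user_lib, recursive = TRUE, showWarnings = FALSE)

# Install into that directory rather than the system-wide library tree.
install.packages("optmatch", lib = user_lib)

# Tell R where to look when loading the package.
library(optmatch, lib.loc = user_lib)
```

Adding the same directory to the library search path with .libPaths(user_lib) at the start of a session should make the lib.loc argument unnecessary in later calls to library().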
Should you find a bug, please record
it in an email containing as much relevant detail as you can muster
and send it to
optmatch -at- ctools.umich.edu.