Extending Statistical Packages to Perform Optimal Matching

For case-control studies, prospective studies with a treatment and a control group, and a few other types of study, a statistician may wish to match individuals from one predesignated group (the cases or the treatment group) to similar members of another group of control subjects. This is bipartite matching. Although computer algorithms that solve bipartite matching problems optimally have been available for some time, relatively few software packages implement them in a manner readily adaptable to statistical uses. I am aware of two: Eric Bergstralh and Jon Kosanke's SAS macro, "vmatch," which supports optimal matching with a variable number of controls; and my add-on package to R, "optmatch," which supports matching with a variable number of controls and full matching.

vmatch and related macros for SAS

vmatch is a routine that matches cases (treated units) to some number of controls, which may vary in a fashion specified by the user. It requires SAS software. Please visit the Statistical Software section of the Mayo Clinic's Web Site for more information and for the macros. Bergstralh and Kosanke's SAS macros for optimal matching have been available for some years now, and they appear to lack the usage restrictions of my package; on the other hand, they handle a somewhat narrower range of problems than does my package.

Optmatch add-on package for R

This package contains a function that groups treatment units (cases) and controls into matched sets. Each matched set may contain one treatment unit (case) and multiple controls, a treatment-control (case-control) pair, or one control and several treatment units (cases). Optional restrictions govern the balance between units of the two types within the matched sets to be formed. By setting these restrictions appropriately, the function can be made to perform full matching, matching with a varying number of controls, matching with multiple controls, or pair matching. Requires R software (which is free).
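As a rough illustration of how those restrictions select among the four kinds of matching, here is a minimal sketch. It assumes a reasonably recent version of optmatch, in which fullmatch() and pairmatch() accept a formula and a data argument, and it uses the nuclearplants example data shipped with the package; the choice of matching variables (cap and t1) is arbitrary and made only for illustration. Older versions of the package may expect a precomputed distance matrix instead.

```r
library(optmatch)

# Example data distributed with optmatch: `pr` marks the 10 "treated"
# plants (pr == 1) among 32; the other 22 serve as controls.
data(nuclearplants)

# Pair matching: each treated unit gets exactly one control.
pm1 <- pairmatch(pr ~ cap + t1, data = nuclearplants)

# Matching with multiple controls: exactly two controls per treated unit.
pm2 <- pairmatch(pr ~ cap + t1, controls = 2, data = nuclearplants)

# Matching with a varying number of controls: between one and three
# controls per treated unit.
vm <- fullmatch(pr ~ cap + t1, min.controls = 1, max.controls = 3,
                data = nuclearplants)

# Full matching: no restrictions, so a matched set may hold one treated
# unit with several controls or one control with several treated units.
fm <- fullmatch(pr ~ cap + t1, data = nuclearplants)

# Each result is a factor naming the matched set of every unit
# (NA for units left unmatched).
summary(vm)
```

The returned factor can be passed to later analyses as a stratifying variable, for example as a blocking term in a regression.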
The package solves matching problems by translating them into minimum-cost flow problems, which are in turn solved optimally by the RELAX-IV codes of Bertsekas and Tseng. Bertsekas and Tseng permit their codes to be used freely for research, but ask that agreement be secured with them before using the codes for commercial or other purposes; this requirement extends to users of the optmatch R package.
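To make concrete what solving a matching problem "optimally" buys, the sketch below compares a greedy nearest-available pair match with an optimal one on a toy problem. This is base R only and is not how optmatch works internally: it finds the optimum by brute-force enumeration, which is feasible only for tiny problems, whereas the package relies on a minimum-cost flow solver. The distance matrix and the helper functions (perms, greedy_match, optimal_match, total_distance) are hypothetical and written for this illustration; a square matrix (as many controls as treated units) is assumed.

```r
# Hypothetical distances: rows are treated units, columns are controls.
d <- matrix(c(1, 2, 8,
              2, 9, 9,
              7, 3, 4),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("T1", "T2", "T3"), c("C1", "C2", "C3")))

# All orderings of a vector, built recursively.
perms <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in perms(v[-i])) out[[length(out) + 1]] <- c(v[i], p)
  }
  out
}

# Sum of distances for an assignment in which picks[i] is the control
# given to the i-th treated unit.
total_distance <- function(d, picks) {
  sum(d[cbind(seq_len(nrow(d)), match(picks, colnames(d)))])
}

# Greedy: each treated unit, in row order, takes its nearest control
# among those still available.
greedy_match <- function(d) {
  available <- colnames(d)
  picks <- character(nrow(d))
  for (i in seq_len(nrow(d))) {
    best <- available[which.min(d[i, available])]
    picks[i] <- best
    available <- setdiff(available, best)
  }
  setNames(picks, rownames(d))
}

# Optimal: try every assignment and keep the one with the smallest total.
optimal_match <- function(d) {
  candidates <- perms(colnames(d))
  totals <- vapply(candidates, function(p) total_distance(d, p), numeric(1))
  setNames(candidates[[which.min(totals)]], rownames(d))
}

total_distance(d, greedy_match(d))   # 14
total_distance(d, optimal_match(d))  # 8
```

Here the greedy procedure gives T1 its closest control, C1, which leaves T2 with only poor options; the optimal assignment pairs T1 with C2 and T2 with C1 and so achieves a much smaller total distance.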

Extending R to include the optmatch package

Installation is easiest if you have administrative access to the machine. In that case, log in as root (Linux, Solaris or OS X) or as a user with administrative privileges (Windows), start R, then enter
> install.packages("optmatch")

at the R command line. If you don't have an account with administrative privileges, you'll need to set up a temporary or user-specific library tree and invoke install.packages with a "lib=" argument. (The R GUI for Mac OS X does this automatically if you install using the Installation Manager tool.) This method of installation retrieves the package from CRAN, where both source codes for the package and precompiled binary versions (suitable for recent versions of R) are kept; source codes for current and earlier versions of the package are also kept here. Should you find a bug, please record it in an email containing as much relevant detail as you can muster and send it to optmatch -at- ctools.umich.edu.
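For the case without administrative privileges described above, a user-specific library might be set up along the following lines. This is a sketch only: the library location (an R/library directory under the home directory) and the choice of CRAN mirror are assumptions made for illustration, and any writable directory and any mirror would do.

```r
# Hypothetical location for a personal library tree; any writable
# directory would serve.
user_lib <- file.path(Sys.getenv("HOME"), "R", "library")
dir.create(user_lib, recursive = TRUE, showWarnings = FALSE)

# Install into the personal library rather than the system-wide one.
# The mirror URL is one possible choice of CRAN mirror.
install.packages("optmatch", lib = user_lib,
                 repos = "https://cloud.r-project.org")

# Tell R where to look when loading the package.
library(optmatch, lib.loc = user_lib)
```

To avoid passing lib.loc each session, the directory can instead be added to the search path with .libPaths(c(user_lib, .libPaths())).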