Statistical dataset formats and file conversion

Recommended software includes SAS (in both Windows and Sun versions), SPSSX (ditto; is a Mac version available?), the friendly and simple DOS package Arcus Pro Stat, and the excellent public domain DOS program Epi Info.

Data can be transferred between statistics packages as simple ASCII files, which most packages can read and write (or be persuaded to read and write). However, this can lose helpful information such as variable names, labels and formats, and value labels; above all, it can lose crucial missing value labels. Using dBase III or IV formats for interchange will generally be faster and easier, and is often possible with the built-in file exchange capabilities of the packages. Unfortunately, this will always lose missing value information, which will have to be reconstructed manually. Forgetting this can produce obviously incorrect results or, much more seriously, hours of analyses that look fine but are wrong.

The STW program on the network (in the "Utilities" section of Artemis) will translate between a number of statistical file formats, including some quite esoteric ones such as Gauss and the more obvious minor-league ones such as Systat and Stata. However, its handling of SAS is restricted to version 5 transport files, so it is not compatible with current Windows or Sun versions of SAS. Its handling of SPSS is restricted to export files, which means native SPSS system files have to be exported by SPSS before STW can translate them. Subject to those limits, this very simple program may save you time translating between formats. Epi Info is also theoretically capable of exporting data in the formats of a number of other statistics packages, but it can only import ASCII, Lotus 1-2-3 and dBase formats.
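The missing value problem is easiest to see with a small example. The following is only a minimal sketch in Python, not a feature of any of the packages named above: the data, the variable names (`age`, `sbp`), the missing value code `-9` and the helper functions are all hypothetical. It shows how a missing value code that has lost its label is silently treated as a real measurement, and how a hand-written codebook can restore it.

```python
import csv
import io
from statistics import mean

# Hypothetical ASCII export: the source package wrote its missing values
# as the numeric code -9, and the label saying so did not survive.
ascii_export = """id,age,sbp
1,34,120
2,-9,135
3,51,-9
4,46,142
"""

# Assumed codebook, reconstructed by hand from the original dataset.
MISSING_CODES = {"age": {-9}, "sbp": {-9}}


def read_with_missing(text, missing_codes):
    """Read CSV text, turning declared missing codes into None."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        row = {}
        for name, raw in record.items():
            value = int(raw)
            # Columns absent from the codebook have no missing codes.
            row[name] = None if value in missing_codes.get(name, ()) else value
        rows.append(row)
    return rows


def column_mean(rows, name):
    """Mean of a column, skipping values marked missing."""
    return mean(r[name] for r in rows if r[name] is not None)


# Without the codebook, -9 is averaged in as if it were a real age.
naive = read_with_missing(ascii_export, {})
# With the codebook, the missing value is excluded.
recoded = read_with_missing(ascii_export, MISSING_CODES)

print("mean age, codes ignored:", column_mean(naive, "age"))
print("mean age, codes applied:", column_mean(recoded, "age"))
```

Neither run reports an error, which is the danger: the first mean is simply wrong. Comparing the count of missing values per variable before and after an exchange is a cheap way to catch this.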

The moral is to minimise the use of multiple packages and exchanges between formats, and to pay particular attention to missing value labelling whenever an exchange has to be made.