File formats

Information in computers is stored in files. Every file is just a collection of zeros and ones ("bits"); these become useful as a word-processing document, a picture, a statistical dataset and so on only because the program operating on the file uses a dictionary to decode the bits. The idea of a "file format" is that data is stored using a dictionary that is known, so that it can be understood by someone else who wants to use the information.
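As a small illustration of the "dictionary" idea, the Python sketch below reads the same four bytes in three different ways. The byte values and the function name interpret are made up for the example; they do not come from any particular package or file format.

```python
import struct

def interpret(raw: bytes) -> dict:
    """Decode the same four bytes using three different 'dictionaries'."""
    return {
        # The raw zeros and ones, eight per byte.
        "bits": " ".join(f"{b:08b}" for b in raw),
        # Dictionary 1: the first three bytes are ASCII characters.
        "text": raw[:3].decode("ascii"),
        # Dictionary 2: all four bytes form one little-endian unsigned integer.
        "number": struct.unpack("<I", raw)[0],
    }

# Hypothetical file contents: four bytes chosen for the example.
example = bytes([0x48, 0x69, 0x21, 0x00])
result = interpret(example)
print(result["bits"])
print(result["text"])
print(hex(result["number"]))
```

Nothing in the bytes themselves says which reading is the intended one. That is exactly the information a file format supplies.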

Formats depend on the "operating system": the program that the computer runs to handle basic things like saving files to disc and reading them back. They also depend on the particular application package (word processor, graphics package or statistics package) that you are using. Sometimes you may be presented with information from packages you do not use regularly, and problems can arise. These take two forms: either your software simply won't read the data at all, or it reads the data but does so incorrectly. Very occasionally, but particularly irritatingly, the data in the file looks fine, yet after working with it for some time you find it is impossible to save the file.
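The two kinds of failure can be reproduced in miniature with text encodings, which are used here only as a stand-in for file formats in general. In this Python sketch the word, the choice of encodings and the helper try_decode are all assumptions made for the example.

```python
def try_decode(data: bytes, encoding: str):
    """Return the decoded text, or None if this encoding cannot read the data."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return None

# The "file": one word saved using the UTF-8 dictionary.
stored = "café".encode("utf-8")

refused = try_decode(stored, "ascii")    # failure 1: not read at all
garbled = try_decode(stored, "latin-1")  # failure 2: read, but incorrectly
correct = try_decode(stored, "utf-8")    # the dictionary used when saving

print(refused, garbled, correct)
```

The second failure is the more dangerous one, because no error is reported and the damage is only noticed if someone looks closely at the text.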

The basic recipe for avoiding file format incompatibilities is to use only software recommended by the Information Policy Committee/Computer Unit, and then to use only the formats preferred by those packages (their "native" formats). Sometimes, though, it is essential to move material between formats: you may collaborate with people whose institution has chosen different recommended packages, or you may find, particularly in statistical and graphical areas, that the recommended packages are fine for many things but won't do what you need. The following documents should help minimise problems.