The C Preprocessor: In depth

What is the C Preprocessor?

The C preprocessor is a tool which filters your source code before it is compiled. The preprocessor allows constants to be named using the #define notation. The preprocessor provides several other facilities which will be described here. It is particularly useful for selecting machine dependent pieces of code for different computer types, allowing a single program to be compiled and run on several different computers.
The C preprocessor isn’t restricted to use with C programs, and programmers who use other languages may also find it useful. However, it is tuned to recognise features of the C language such as comments and strings, so its usefulness elsewhere may be limited.
The preprocessor is called cpp. The compiler calls it automatically, so you will not need to call it yourself while programming in C.

Using #define to Implement Constants

We have already met this facility. In its simplest form it allows us to define textual substitutions as follows.
#define MAXSIZE 256
This causes 256 to be substituted for each later occurrence of the name MAXSIZE in the file, except inside string literals and comments.

 

Using #define to Create Functional Macros

#define can also be given arguments which are used in its replacement. The definitions are then called macros. Macros work rather like functions, but with the following differences.
Since macros are implemented as a textual substitution, they add no call overhead at run time (as functions may).
Recursive macros are not a good idea: the preprocessor will not expand a macro’s name again inside its own expansion.
Macros don’t care about the type of their arguments. Hence macros are a good choice where we might want to operate on reals, integers or a mixture of the two. Programmers sometimes call such type flexibility polymorphism.
Macros are generally fairly small.
Macros are full of traps for the unwary programmer. In particular, the textual substitution means that arithmetic expressions are liable to be corrupted by the operator precedence rules.
Here is an example of a macro which won’t work.
#define DOUBLE(x) x+x
Now if we have a statement
a = DOUBLE(b) * c;
This will be expanded to
a = b+b * c;
And since * has a higher priority than +, the compiler will treat it as
a = b + (b * c);
The problem can be solved using a more robust definition of DOUBLE:
#define DOUBLE(x) ((x)+(x))
Here the brackets around the whole definition force the expression to be evaluated before any surrounding operators are applied, and the brackets around each x protect arguments that are themselves expressions. This should make the macro more reliable.
In general it is better to write a C function than risk using a macro.

 

Reading in Other Files using #include

The preprocessor directive #include is an instruction to read in the entire contents of another file at that point. This is generally used to read in header files for library functions. Header files contain details of functions and types used within the library. They must be included before the program can make use of the library functions.
Library header file names are enclosed in angle brackets, < >. These tell the preprocessor to look for the header file in the standard location for library definitions. This is /usr/include for most UNIX systems.
For example
#include <stdio.h>
Programmers also use #include when writing multi-file programs. Certain information is required at the beginning of each program file. This can be put into a file called globals.h and included in each program file. Local header file names are usually enclosed by double quotes, " ". It is conventional to give header files a name which ends in .h to distinguish them from other types of file.
Our globals.h file would be included by the following line.
#include "globals.h"

Conditional selection of code using #ifdef

The preprocessor has a conditional statement similar to C’s if else. It can be used to selectively include statements in a program. This is often used where two different computer types implement a feature in different ways. It allows the programmer to produce a program which will run on either type.
The keywords for conditional selection are #ifdef, #else and #endif.
#ifdef
takes a name as an argument, and returns true if the name has a current definition. The name may be defined using a #define, by the -D option of the compiler, or automatically by the UNIX environment.
#else
is optional and ends the block beginning with #ifdef. It is used to create a two-way selection.
#endif
ends the block started by #ifdef or #else.
Where the #ifdef is true, statements between it and a following #else or #endif are included in the program. Where it is false, and there is a following #else, statements between the #else and the following #endif are included.
This is best illustrated by an example.

Using #ifdef for Different Computer Types

Conditional selection is more often performed on names defined automatically by the environment than on names the programmer has #defined. A simple application using machine-dependent names is illustrated below.

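#include <stdio.h>

int main(void)
{
#ifdef vax
    printf("This is a VAX\n");   /* compiled only where vax is defined */
#endif
#ifdef sun
    printf("This is a SUN\n");   /* compiled only where sun is defined */
#endif
    return 0;
}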

 


sun is defined automatically on SUN computers. vax is defined automatically on VAX computers.

Using #ifdef to Temporarily Remove Program Statements

#ifdef also provides a useful means of temporarily ‘blanking out’ lines of a program. The lines in question are preceded by #ifdef NEVER and followed by #endif. Of course you should ensure that the name NEVER isn’t defined anywhere.
The preprocessor has several other useful facilities. If you are interested in these you can read more by typing
man cpp
