
19 The C Preprocessor

Before your program gets compiled, it actually runs through a phase called preprocessing. It’s almost like there’s a language on top of the C language that runs first. And it outputs the C code, which then gets compiled.

We’ve already seen this to an extent with #include! That’s the C Preprocessor! Where it sees that directive, it includes the named file right there, just as if you’d typed it in there. And then the compiler builds the whole thing.

But it turns out it’s a lot more powerful than just being able to include things. You can define macros that are substituted… and even macros that take arguments!

19.1 #include

Let’s start with the one we’ve already seen a bunch. This is, of course, a way to include other sources in your source. Very commonly used with header files.

While the spec allows for all kinds of behavior with #include, we’re going to take a more pragmatic approach and talk about the way it works on every system I’ve ever seen.

We can split header files into two categories: system and local. Things that are built-in, like stdio.h, stdlib.h, math.h, and so on, you can include with angle brackets:

#include <stdio.h>
#include <stdlib.h>

The angle brackets tell C, “Hey, don’t look in the current directory for this header file—look in the system-wide include directory instead.”

Which, of course, implies that there must be a way to include local files from the current directory. And there is: with double quotes:

#include "myheader.h"

Or you can very probably look in relative directories using forward slashes and dots, like this:

#include "mydir/myheader.h"
#include "../someheader.py"

Don’t use a backslash (\) for your path separators in your #include! It’s undefined behavior! Use forward slash (/) only, even on Windows.

In summary, use angle brackets (< and >) for the system includes, and use double quotes (") for your personal includes.

19.2 Simple Macros

A macro is an identifier that gets expanded to another piece of code before the compiler even sees it. Think of it like a placeholder—when the preprocessor sees one of those identifiers, it replaces it with another value that you’ve defined.

We do this with #define (often read “pound define”). Here’s an example:

#include <stdio.h>

#define HELLO "Hello, world"
#define PI 3.14159

int main(void)
{
    printf("%s, %f\n", HELLO, PI);
}

On lines 3 and 4 we defined a couple macros. Wherever these appear elsewhere in the code (line 8), they’ll be substituted with the defined values.

From the C compiler’s perspective, it’s exactly as if we’d written this, instead:

#include <stdio.h>

int main(void)
{
    printf("%s, %f\n", "Hello, world", 3.14159);
}

See how HELLO was replaced with "Hello, world" and PI was replaced with 3.14159? From the compiler’s perspective, it’s just like those values had appeared right there in the code.

Note that the macros don’t have a specific type, per se. Really all that happens is they get replaced wholesale with whatever they’re #defined as. If the resulting C code is invalid, the compiler will puke.

You can also define a macro with no value:

#define EXTRA_HAPPY

in that case, the macro exists and is defined, but is defined to be nothing. So anyplace it occurs in the text will just be replaced with nothing. We’ll see a use for this later.

It’s conventional to write macro names in ALL_CAPS even though that’s not technically required.

Overall, this gives you a way to define constant values that are effectively global and can be used any place. Even in those places where a const variable won’t work, e.g. in switch cases and fixed array lengths.

That said, the debate rages online whether a typed const variable is better than a #define macro in the general case.

It can also be used to replace or modify keywords, a concept completely foreign to const, though this practice should be used sparingly.

19.3 Conditional Compilation

It’s possible to get the preprocessor to decide whether or not to present certain blocks of code to the compiler, or just remove them entirely before compilation.

We do that by basically wrapping up the code in conditional blocks, similar to if-else statements.

19.3.1 If Defined, #ifdef and #endif

First of all, let’s try to compile specific code depending on whether or not a macro is even defined.

#include <stdio.h>

#define EXTRA_HAPPY

int main(void)
{

#ifdef EXTRA_HAPPY
    printf("I'm extra happy!\n");
#endif

    printf("OK!\n");
}

In that example, we define EXTRA_HAPPY (to be nothing, but it is defined), then on line 8 we check to see if it is defined with an #ifdef directive. If it is defined, the subsequent code will be included up until the #endif.

So because it is defined, the code will be included for compilation and the output will be:

I'm extra happy!
OK!

If we were to comment out the #define, like so:

//#define EXTRA_HAPPY

then it wouldn’t be defined, and the code wouldn’t be included in compilation. And the output would just be:

OK!

It’s important to remember that these decisions happen at compile time! The code actually gets compiled or removed depending on the condition. This is in contrast to a standard if statement that gets evaluated while the program is running.

19.3.2 If Not Defined, #ifndef

There’s also the negative sense of “if defined”: “if not defined”, or #ifndef. We could change the previous example to output different things based on whether or not something was defined:

#ifdef EXTRA_HAPPY
    printf("I'm extra happy!\n");
#endif

#ifndef EXTRA_HAPPY
    printf("I'm just regular\n");
#endif

We’ll see a cleaner way to do that in the next section.

Tying it all back in to header files, we’ve seen how we can cause header files to only be included one time by wrapping them in preprocessor directives like this:

#ifndef MYHEADER_H  // First line of myheader.h
#define MYHEADER_H

int x = 12;

#endif  // Last line of myheader.h

This demonstrates how a macro persists across files and multiple #includes. If it’s not yet defined, let’s define it and compile the whole header file.

But the next time it’s included, we see that MYHEADER_H is defined, so we don’t send the header file to the compiler—it gets effectively removed.

19.3.3 #else

But that’s not all we can do! There’s also an #else that we can throw in the mix.

Let’s mod the previous example:

#ifdef EXTRA_HAPPY
    printf("I'm extra happy!\n");
#else
    printf("I'm just regular\n");
#endif

Now if EXTRA_HAPPY is not defined, it’ll hit the #else clause and print:

I'm just regular

19.3.4 Else-If: #elifdef, #elifndef

This feature is new in C23!

What if you want something more complex, though? Perhaps you need an if-else cascade structure to get your code built right?

Luckily we have these directives at our disposal. We can use #elifdef for “else if defined”:

#ifdef MODE_1
    printf("This is mode 1\n");
#elifdef MODE_2
    printf("This is mode 2\n");
#elifdef MODE_3
    printf("This is mode 3\n");
#else
    printf("This is some other mode\n");
#endif

On the flipside, you can use #elifndef for “else if not defined”.

19.3.5 General Conditional: #if, #elif

This works very much like the #ifdef and #ifndef directives in that you can also have an #else and the whole thing wraps up with #endif.

The only difference is that the constant expression after the #if must evaluate to true (non-zero) for the code in the #if to be compiled. So instead of whether or not something is defined, we want an expression that evaluates to true.

#include <stdio.h>

#define HAPPY_FACTOR 1

int main(void)
{

#if HAPPY_FACTOR == 0
    printf("I'm not happy!\n");
#elif HAPPY_FACTOR == 1
    printf("I'm just regular\n");
#else
    printf("I'm extra happy!\n");
#endif

    printf("OK!\n");
}

Again, for the unmatched #if clauses, the compiler won’t even see those lines. For the above code, after the preprocessor gets finished with it, all the compiler sees is:

#include <stdio.h>

int main(void)
{

    printf("I'm just regular\n");

    printf("OK!\n");
}

One hackish thing this is used for is to comment out large numbers of lines quickly.

If you put an #if 0 (“if false”) at the front of the block to be commented out and an #endif at the end, you can get this effect:

#if 0
    printf("All this code"); /* is effectively */
    printf("commented out"); // by the #if 0
#endif

What if you’re on a pre-C23 compiler and you don’t have #elifdef or #elifndef directive support? How can we get the same effect with #if? That is, what if I wanted this:

#ifdef FOO
    x = 2;
#elifdef BAR  // POTENTIAL ERROR: Not supported before C23
    x = 3;
#endif

How could I do it?

Turns out there’s a preprocessor operator called defined that we can use with an #if statement.

These are equivalent:

#ifdef FOO
#if defined FOO
#if defined(FOO)   // Parentheses optional

As are these:

#ifndef FOO
#if !defined FOO
#if !defined(FOO)   // Parentheses optional

Notice how we can use the standard logical NOT operator (!) for “not defined”.

So now we’re back in #if land and we can use #elif with impunity!

This broken code:

#ifdef FOO
    x = 2;
#elifdef BAR  // POTENTIAL ERROR: Not supported before C23
    x = 3;
#endif

can be replaced with:

#if defined FOO
    x = 2;
#elif defined BAR
    x = 3;
#endif

19.3.6 Losing a Macro: #undef

If you’ve defined something but you don’t need it any longer, you can undefine it with #undef.

#include <stdio.h>

int main(void)
{
#define GOATS

#ifdef GOATS
    printf("Goats detected!\n");  // prints
#endif

#undef GOATS  // Make GOATS no longer defined

#ifdef GOATS
    printf("Goats detected, again!\n"); // doesn't print
#endif
}

19.4 Built-in Macros

The standard defines a lot of built-in macros that you can test and use for conditional compilation. Let’s look at those here.

19.4.1 Mandatory Macros

These are all defined:

Macro               Description
__DATE__            The date of compilation (that is, when you’re compiling this file) in Mmm dd yyyy format
__TIME__            The time of compilation in hh:mm:ss format
__FILE__            A string containing this file’s name
__LINE__            The line number of the file this macro appears on
__func__            The name of the function this appears in, as a string
__STDC__            Defined with 1 if this is a standard C compiler
__STDC_HOSTED__     This will be 1 if the compiler is a hosted implementation, otherwise 0
__STDC_VERSION__    This version of C, a constant long int in the form yyyymmL, e.g. 201710L

Let’s put these together.

#include <stdio.h>

int main(void)
{
    printf("This function: %s\n", __func__);
    printf("This file: %s\n", __FILE__);
    printf("This line: %d\n", __LINE__);
    printf("Compiled on: %s %s\n", __DATE__, __TIME__);
    printf("C Version: %ld\n", __STDC_VERSION__);
}

The output on my system is:

This function: main
This file: foo.c
This line: 7
Compiled on: Nov 23 2020 17:16:27
C Version: 201710

__FILE__, __func__ and __LINE__ are particularly useful to report error conditions in messages to developers. The assert() macro in <assert.h> uses these to call out where in the code the assertion failed.

19.4.1.1 __STDC_VERSION__s

In case you’re wondering, here are the version numbers for different major releases of the C Language Spec:

Release    ISO/IEC version                     __STDC_VERSION__
C89        ISO/IEC 9899:1990                   undefined
C89        ISO/IEC 9899:1990/Amd.1:1995        199409L
C99        ISO/IEC 9899:1999                   199901L
C11        ISO/IEC 9899:2011/Amd.1:2012        201112L

Note the macro did not exist originally in C89.

Also note that the plan is that the version numbers will strictly increase, so you could always check for, say, “at least C99” with:

#if __STDC_VERSION__ >= 199901L

19.4.2 Optional Macros

Your implementation might define these, as well. Or it might not.

Macro                       Description
__STDC_ISO_10646__          If defined, wchar_t holds Unicode values, otherwise something else
__STDC_MB_MIGHT_NEQ_WC__    A 1 indicates that the values in multibyte characters might not map equally to values in wide characters
__STDC_UTF_16__             A 1 indicates that the system uses UTF-16 encoding in type char16_t
__STDC_UTF_32__             A 1 indicates that the system uses UTF-32 encoding in type char32_t
__STDC_ANALYZABLE__         A 1 indicates the code is analyzable
__STDC_IEC_559__            1 if IEEE-754 (aka IEC 60559) floating point is supported
__STDC_IEC_559_COMPLEX__    1 if IEC 60559 complex floating point is supported
__STDC_LIB_EXT1__           1 if this implementation supports a variety of “safe” alternate standard library functions (they have _s suffixes on the name)
__STDC_NO_ATOMICS__         1 if this implementation does not support _Atomic or <stdatomic.h>
__STDC_NO_COMPLEX__         1 if this implementation does not support complex types or <complex.h>
__STDC_NO_THREADS__         1 if this implementation does not support <threads.h>
__STDC_NO_VLA__             1 if this implementation does not support variable-length arrays

19.5 Macros with Arguments

Macros are more powerful than simple substitution, though. You can set them up to take arguments that are substituted in, as well.

A question often arises for when to use parameterized macros versus functions. Short answer: use functions. But you’ll see lots of macros in the wild and in the standard library. People tend to use them for short, mathy things, and also for features that might change from platform to platform. You can define different keywords for one platform or another.

19.5.1 Macros with One Argument

Let’s start with a simple one that squares a number:

#include <stdio.h>

#define SQR(x) x * x  // Not quite right, but bear with me

int main(void)
{
    printf("%d\n", SQR(12));  // 144
}

What that’s saying is “everywhere you see SQR with some value, replace it with that value times itself”.

So line 7 will be changed to:

    printf("%d\n", 12 * 12);  // 144

which C comfortably converts to 144.

But we’ve made an elementary error in that macro, one that we need to avoid.

Let’s check it out. What if we wanted to compute SQR(3 + 4)? Well, \(3+4=7\), so we must want to compute \(7^2=49\). That’s it; 49—final answer.

Let’s drop it in our code and see that we get… 19?

    printf("%d\n", SQR(3 + 4));  // 19!!??

What happened?

If we follow the macro expansion, we get

    printf("%d\n", 3 + 4 * 3 + 4);  // 19!

Oops! Since multiplication takes precedence, we do the \(4\times3=12\) first, and get \(3+12+4=19\). Not what we were after.

So we have to fix this to make it right.

This is so common that you should automatically do it every time you make a parameterized math macro!

The fix is easy: just add some parentheses!

#define SQR(x) (x) * (x)   // Better... but still not quite good enough!

And now our macro expands to:

    printf("%d\n", (3 + 4) * (3 + 4));  // 49! Woo hoo!

But we actually still have the same problem, which might manifest if we have an operator with the same or higher precedence as multiply (*) nearby.

So the safe, proper way to put the macro together is to wrap the whole thing in additional parentheses, like so:

#define SQR(x) ((x) * (x))   // Good!

Just make it a habit to do that when you make a math macro and you can’t go wrong.

19.5.2 Macros with More than One Argument

You can stack these things up as much as you want:

#define TRIANGLE_AREA(w, h) (0.5 * (w) * (h))

Let’s do some macros that solve for \(x\) using the quadratic formula. Just in case you don’t have it on the top of your head, it says for equations of the form:

\(ax^2+bx+c=0\)

you can solve for \(x\) with the quadratic formula:

\(x=\displaystyle\frac{-b\pm\sqrt{b^2-4ac}}{2a}\)

Which is crazy. Also notice the plus-or-minus (\(\pm\)) in there, indicating that there are actually two solutions.

So let’s make macros for both:

#define QUADP(a, b, c) ((-(b) + sqrt((b) * (b) - 4 * (a) * (c))) / (2 * (a)))
#define QUADM(a, b, c) ((-(b) - sqrt((b) * (b) - 4 * (a) * (c))) / (2 * (a)))

So that gets us some math. But let’s define one more that we can use as arguments to printf() to print both answers.

//          macro              replacement
//      |-----------| |----------------------------|
#define QUAD(a, b, c) QUADP(a, b, c), QUADM(a, b, c)

That’s just a couple values separated by a comma—and we can use that as a “combined” argument of sorts to printf() like this:

printf("x = %f or x = %f\n", QUAD(2, 10, 5));

Let’s put it together into some code:

#include <stdio.h>
#include <math.h>  // For sqrt()

#define QUADP(a, b, c) ((-(b) + sqrt((b) * (b) - 4 * (a) * (c))) / (2 * (a)))
#define QUADM(a, b, c) ((-(b) - sqrt((b) * (b) - 4 * (a) * (c))) / (2 * (a)))
#define QUAD(a, b, c) QUADP(a, b, c), QUADM(a, b, c)

int main(void)
{
    printf("2*x^2 + 10*x + 5 = 0\n");
    printf("x = %f or x = %f\n", QUAD(2, 10, 5));
}

And this gives us the output:

2*x^2 + 10*x + 5 = 0
x = -0.563508 or x = -4.436492

Plugging in either of those values gives us roughly zero (a bit off because the numbers aren’t exact):

\(2\times(-0.563508)^2+10\times(-0.563508)+5\approx0.000003\)

19.5.3 Macros with Variable Arguments

There’s also a way to have a variable number of arguments passed to a macro, using ellipses (...) after the known, named arguments. When the macro is expanded, all of the extra arguments will be in a comma-separated list in the __VA_ARGS__ macro, and can be replaced from there:

#include <stdio.h>

// Combine the first two arguments to a single number,
// then have a commalist of the rest of them:

#define X(a, b, ...) (10*(a) + 20*(b)), __VA_ARGS__

int main(void)
{
    printf("%d %f %s %d\n", X(5, 4, 3.14, "Hi!", 12));
}

The substitution that takes place on line 10 would be:

    printf("%d %f %s %d\n", (10*(5) + 20*(4)), 3.14, "Hi!", 12);

for output:

130 3.140000 Hi! 12

You can also “stringify” __VA_ARGS__ by putting a # in front of it:

#define X(...) #__VA_ARGS__

printf("%s\n", X(1,2,3));  // Prints "1,2,3"

19.5.4 Stringification

As already mentioned just above, you can turn any argument into a string by preceding it with a # in the replacement text.

For example, we could print anything as a string with this macro and printf():

#define STR(x) #x

printf("%s\n", STR(3.14159));

In that case, the substitution leads to:

printf("%s\n", "3.14159");

Let’s see if we can use this to greater effect so that we can pass any int variable name into a macro, and have it print out its name and value.

#include <stdio.h>

#define PRINT_INT_VAL(x) printf("%s = %d\n", #x, x)

int main(void)
{
    int a = 5;

    PRINT_INT_VAL(a);  // prints "a = 5"
}

On line 9, we get the following macro replacement:

    printf("%s = %d\n", "a", a);

19.5.5 Concatenation

We can concatenate two arguments together with ##, as well. Fun times!

#define CAT(a, b) a ## b

printf("%f\n", CAT(3.14, 1592));   // 3.141592

19.6 Multiline Macros

It’s possible to continue a macro to multiple lines if you escape the newline with a backslash (\).

Let’s write a multiline macro that prints numbers from 0 to the product of the two arguments passed in.

#include <stdio.h>

#define PRINT_NUMS_TO_PRODUCT(a, b) do { \
    int product = (a) * (b); \
    for (int i = 0; i < product; i++) { \
        printf("%d\n", i); \
    } \
} while(0)

int main(void)
{
    PRINT_NUMS_TO_PRODUCT(2, 4);  // Outputs numbers from 0 to 7
}

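A couple things to note there:

  1. There’s a backslash escape at the end of every line except the last one, to indicate that the macro continues.
  2. The whole thing is wrapped in a do-while(0) loop with squirrely braces.

The latter point might be a little weird, but it’s all about absorbing the trailing ; the coder drops after the macro.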

At first I thought that just using squirrely braces would be enough, but there’s a case where it fails if the coder puts a semicolon after the macro. Here’s that case:

#include <stdio.h>

#define FOO(x) { (x)++; }

int main(void)
{
    int i = 0;

    if (i == 0)
        FOO(i);
    else
        printf(":-(\n");

    printf("%d\n", i);
}

Looks simple enough, but it won’t build without a syntax error:

foo.c:11:5: error: ‘else’ without a previous ‘if’  

Do you see it?

Let’s look at the expansion:


    if (i == 0) {
        (i)++;
    };             // <-- Trouble with a capital-T!

    else
        printf(":-(\n");

The ; puts an end to the if statement, so the else is just floating out there illegally.

So wrap that multiline macro with a do-while(0).

19.7 Example: An Assert Macro

Adding asserts to your code is a good way to catch conditions that you think shouldn’t happen. C provides assert() functionality. It checks a condition, and if it’s false, the program bombs out telling you the file and line number on which the assertion failed.

But this is wanting.

  1. First of all, you can’t specify an additional message with the assert.
  2. Secondly, there’s no easy on-off switch for all the asserts.

We can address the first with macros.

Basically, when I have this code:

ASSERT(x < 20, "x must be under 20");

I want something like this to happen (assuming the ASSERT() is on line 220 of foo.c):

if (!(x < 20)) {
    fprintf(stderr, "foo.c:220: assertion x < 20 failed: ");
    fprintf(stderr, "x must be under 20\n");
    exit(1);
}

We can get the filename out of the __FILE__ macro, and the line number from __LINE__. The message is already a string, but x < 20 is not, so we’ll have to stringify it with #. We can make a multiline macro by using backslash escapes at the end of the line.

#define ASSERT(c, m) \
do { \
    if (!(c)) { \
        fprintf(stderr, __FILE__ ":%d: assertion %s failed: %s\n", \
                        __LINE__, #c, m); \
        exit(1); \
    } \
} while(0)

(It looks a little weird with __FILE__ out front like that, but remember it is a string literal, and string literals next to each other are automagically concatenated. __LINE__, on the other hand, is just an int.)

And that works! If I run this:

int x = 30;

ASSERT(x < 20, "x must be under 20");

I get this output:

foo.c:23: assertion x < 20 failed: x must be under 20

Very nice!

The only thing left is a way to turn it on and off, and we could do that with conditional compilation.

Here’s the complete example:

#include <stdio.h>
#include <stdlib.h>

#define ASSERT_ENABLED 1

#if ASSERT_ENABLED
#define ASSERT(c, m) \
do { \
    if (!(c)) { \
        fprintf(stderr, __FILE__ ":%d: assertion %s failed: %s\n", \
                        __LINE__, #c, m); \
        exit(1); \
    } \
} while(0)
#else
#define ASSERT(c, m)  // Empty macro if not enabled
#endif

int main(void)
{
    int x = 30;

    ASSERT(x < 20, "x must be under 20");
}

This has the output:

foo.c:23: assertion x < 20 failed: x must be under 20

19.8 The #error Directive

This directive causes the compiler to error out as soon as it sees it.

Commonly, this is used inside a conditional to prevent compilation unless some prerequisites are met:

#ifndef __STDC_IEC_559__
    #error I really need IEEE-754 floating point to compile. Sorry!
#endif

Some compilers have a non-standard complementary #warning directive that will output a warning but not stop compilation, but this is not in the C11 spec.

19.9 The #embed Directive

New in C23!

And currently not yet working with any of my compilers, so take this section with a grain of salt!

The gist of this is that you can include bytes of a file as integer constants as if you’d typed them in.

For example, if you have a binary file named foo.bin that contains four bytes with decimal values 11, 22, 33, and 44, and you do this:

int a[] = {
#embed "foo.bin"
};

It’ll be just as if you’d typed this:

int a[] = {11,22,33,44};

This is a really powerful way to initialize an array with binary data without needing to convert it all to code first—the preprocessor does it for you!

A more typical use case might be a file containing a small image to be displayed that you don’t want to load at runtime.

Here’s another example:

int a[] = {
#embed <foo.bin>
};

If you use angle brackets, the preprocessor looks in a series of implementation-defined places to locate the file, just like #include would do. If you use double quotes and the resource is not found, the compiler will try it as if you’d used angle brackets in a last desperate attempt to find the file.

#embed works like #include in that it effectively pastes values in before the compiler sees them. This means you can use it in all kinds of places:

return
#embed "somevalue.dat"
;

or

int x =
#embed "xvalue.dat"
;

Now—are these always bytes? Meaning they’ll have values from 0 to 255, inclusive? The answer is definitely by default “yes”, except when it is “no”.

Technically, the elements will be CHAR_BIT bits wide. And this is very likely 8 on your system, so you’d get that 0-255 range in your values. (They’ll always be non-negative.)

Also, it’s possible that an implementation might allow this to be overridden in some way, e.g. on the command line or with parameters.

The size of the file in bits must be a multiple of the element size. That is, if each element is 8 bits, the file size (in bits) must be a multiple of 8. In regular everyday usage, this is a confusing way of saying that each file needs to be an integer number of bytes… which of course it is. Honestly, I’m not even sure why I bothered with this paragraph. Read the spec if you’re really that curious.

19.9.1 #embed Parameters

There are all kinds of parameters you can specify to the #embed directive. Here’s an example with the yet-unintroduced limit() parameter:

int a[] = {
#embed "/dev/random" limit(5)
};

But what if you already have limit defined somewhere else? Luckily you can put __ around the keyword and it will work the same way:

int a[] = {
#embed "/dev/random" __limit__(5)
};

Now… what’s this limit thing?

19.9.2 The limit() Parameter

You can specify a limit on the number of elements to embed with this parameter.

This is a maximum value, not an absolute value. If the file that’s embedded is shorter than the specified limit, only that many bytes will be imported.

The /dev/random example above is an example of the motivation for this—in Unix, that’s a character device file that will return an infinite stream of pretty-random numbers.

Embedding an infinite number of bytes is hard on your RAM, so the limit parameter gives you a way to stop after a certain number.

Finally, you are allowed to use #define macros in your limit, in case you were curious.

19.9.3 The if_empty Parameter

This parameter defines what the embed result should be if the file exists but contains no data. Let’s say that the file foo.dat contains a single byte with the value 123. If we do this:

int x = 
#embed "foo.dat" if_empty(999)
;

we’ll get:

int x = 123;   // When foo.dat contains a 123 byte

But what if the file foo.dat is zero bytes long (i.e. contains no data and is empty)? If that’s the case, it would expand to:

int x = 999;   // When foo.dat is empty

Notably if the limit is set to 0, then the if_empty will always be substituted. That is, a zero limit effectively means the file is empty.

This will always emit x = 999 no matter what’s in foo.dat:

int x = 
#embed "foo.dat" limit(0) if_empty(999)
;

19.9.4 The prefix() and suffix() Parameters

This is a way to prepend or append some data to the embed.

Note that these only affect non-empty data! If the file is empty, neither prefix nor suffix has any effect.

Here’s an example where we embed three random numbers, but prefix the numbers with 11, and suffix them with ,99:

int x[] = {
#embed "/dev/urandom" limit(3) prefix(11,) suffix(,99)
};

Example result:

int x[] = {11,135,116,220,99};

There’s no requirement that you use both prefix and suffix. You can use both, one, the other, or neither.

We can make use of the characteristic that these are only applied to non-empty files to neat effect, as shown in the following example shamelessly stolen from the spec.

Let’s say we have a file foo.dat that has some data in it. And we want to use this to initialize an array, and then we want a suffix on the array that is a zero element.

No problem, right?

int x[] = {
#embed "foo.dat" suffix(,0)
};

If foo.dat has 11, 22, and 33 in it, we’d get:

int x[] = {11,22,33,0};

But wait! What if foo.dat is empty? Then we get:

int x[] = {};

and that’s not good.

But we can fix it like this:

int x[] = {
#embed "foo.dat" suffix(,)
    0
};

Since the suffix parameter is omitted if the file is empty, this would just turn into:

int x[] = {0};

which is fine.

19.9.5 The __has_embed() Identifier

This is a great way to test to see if a particular file is available to be embedded, and also whether or not it’s empty.

You use it with the #if directive.

Here’s a chunk of code that will get 5 random numbers from the random number generator character device. If that doesn’t exist, it tries to get them from a file myrandoms.dat. If that doesn’t exist, it uses some hard-coded values:

    int random_nums[] = {
#if __has_embed("/dev/urandom")
    #embed "/dev/urandom" limit(5)
#elif __has_embed("myrandoms.dat")
    #embed "myrandoms.dat" limit(5)
#else
    140,178,92,167,120
#endif
    };

Technically, the __has_embed() identifier resolves to one of three values:

__has_embed() Result          Description
__STDC_EMBED_NOT_FOUND__      If the file isn’t found.
__STDC_EMBED_FOUND__          If the file is found and is not empty.
__STDC_EMBED_EMPTY__          If the file is found and is empty.

I have good reason to believe that __STDC_EMBED_NOT_FOUND__ is 0 and the others aren’t zero (because it’s implied in the proposal and it makes logical sense), but I’m having trouble finding that in this version of the draft spec.


19.9.6 Other Parameters

A compiler implementation can define other embed parameters all it wants—look for these non-standard parameters in your compiler’s documentation.

For instance:

#embed "foo.bin" limit(12) frotz(lamp)

These might commonly have a prefix on them to help with namespacing:

#embed "foo.bin" limit(12) fmc::frotz(lamp)

It might be sensible to try to detect if these are available before you use them, and luckily we can use __has_embed to help us here.

Normally, __has_embed() will just tell us if the file is there or not. But—and here’s the fun bit—it will also return false if any additional parameters are also not supported!

So if we give it a file that we know exists as well as a parameter that we want to test for the existence of, it will effectively tell us if that parameter is supported.

What file always exists, though? Turns out we can use the __FILE__ macro, which expands to the name of the source file that references it! That file must exist, or something is seriously wrong in the chicken-and-egg department.

Let’s test that frotz parameter to see if we can use it:

#if __has_embed(__FILE__ fmc::frotz(lamp))
    puts("fmc::frotz(lamp) is supported!");
#else
    puts("fmc::frotz(lamp) is NOT supported!");
#endif

19.9.7 Embedding Multi-Byte Values

What about getting some ints in there instead of individual bytes? What about multi-byte values in the embedded file?

This is not something supported by the C23 standard, but there could be implementation extensions defined for it in the future.

19.10 The #pragma Directive

This is one funky directive, short for “pragmatic”. You can use it to do… well, anything your compiler supports you doing with it.

Basically the only time you’re going to add this to your code is if some documentation tells you to do so.

19.10.1 Non-Standard Pragmas

Here’s one non-standard example of using #pragma to cause the compiler to execute a for loop in parallel with multiple threads (if the compiler supports the OpenMP extension):

#pragma omp parallel for
for (int i = 0; i < 10; i++) { ... }

There are all kinds of #pragma directives documented across all four corners of the globe.

All unrecognized #pragmas are ignored by the compiler.

19.10.2 Standard Pragmas

There are also a few standard ones, and these start with STDC, and follow the same form:

#pragma STDC pragma_name on-off

The on-off portion can be either ON, OFF, or DEFAULT.

And the pragma_name can be one of these:

Pragma Name         Description
FP_CONTRACT         Allow floating point expressions to be contracted into a single operation to avoid rounding errors that might occur from multiple operations.
FENV_ACCESS         Set to ON if you plan to access the floating point status flags. If OFF, the compiler might perform optimizations that cause the values in the flags to be inconsistent or invalid.
CX_LIMITED_RANGE    Set to ON to allow the compiler to skip overflow checks when performing complex arithmetic. Defaults to OFF.

For example:

#pragma STDC FP_CONTRACT OFF
#pragma STDC CX_LIMITED_RANGE ON

As for CX_LIMITED_RANGE, the spec points out:

The purpose of the pragma is to allow the implementation to use the formulas:

\((x+iy)\times(u+iv) = (xu-yv)+i(yu+xv)\)

\((x+iy)/(u+iv) = [(xu+yv)+i(yu-xv)]/(u^2+v^2)\)

\(|x+iy|=\sqrt{x^2+y^2}\)

where the programmer can determine they are safe.

19.10.3 _Pragma Operator

This is another way to declare a pragma that you could use in a macro.

These are equivalent:

#pragma "Unnecessary" quotes
_Pragma("\"Unnecessary\" quotes")

This can be used in a macro, if need be:

#define PRAGMA(x) _Pragma(#x)

19.11 The #line Directive

This allows you to override the values for __LINE__ and __FILE__. If you want.

I’ve never wanted to do this, but in K&R2, they write:

For the benefit of other preprocessors that generate C programs […]

So maybe there’s that.

To override the line number to, say 300:

#line 300

and __LINE__ will keep counting up from there.

To override the line number and the filename:

#line 300 "newfilename"

19.12 The Null Directive

A # on a line by itself is ignored by the preprocessor. Now, to be entirely honest, I don’t know what the use case is for this.

I’ve seen examples like this:

#ifdef FOO
    #
#else
    printf("Something");
#endif

which is just cosmetic; the line with the solitary # can be deleted with no ill effect.

Or maybe for cosmetic consistency, like this:

#
#ifdef FOO
    x = 2;
#endif
#
#if BAR == 17
    x = 12;
#endif
#

But, with respect to cosmetics, that’s just ugly.

Another post mentions elimination of comments—that in GCC, a comment after a # will not be seen by the compiler. Which I don’t doubt, but the specification doesn’t seem to say this is standard behavior.

My searches for rationale aren’t bearing much fruit. So I’m going to just say this is some good ol’ fashioned C esoterica.

