2015年11月26日 星期四

C / C++ Macro Tips

Stringification

Sometimes you may want to convert a macro argument into a string constant. Parameters are not replaced inside string constants, but you can use the ‘#’ preprocessing operator instead. When a macro parameter is used with a leading ‘#’, the preprocessor replaces it with the literal text of the actual argument, converted to a string constant. Unlike normal parameter replacement, the argument is not macro-expanded first. This is called stringification.
There is no way to combine an argument with surrounding text and stringify it all together. Instead, you can write a series of adjacent string constants and stringified arguments. The preprocessor will replace the stringified arguments with string constants. The C compiler will then combine all the adjacent string constants into one long string.
Here is an example of a macro definition that uses stringification:
     #define WARN_IF(EXP) \
     do { if (EXP) \
             fprintf (stderr, "Warning: " #EXP "\n"); } \
     while (0)
     WARN_IF (x == 0);
          ==> do { if (x == 0)
                fprintf (stderr, "Warning: " "x == 0" "\n"); } while (0);
The argument for EXP is substituted once, as-is, into the if statement, and once, stringified, into the argument to fprintf. If x were a macro, it would be expanded in the if statement, but not in the string.
The do and while (0) are a kludge to make it possible to write WARN_IF (arg);, which the resemblance of WARN_IF to a function would make C programmers want to do; see Swallowing the Semicolon.
Stringification in C involves more than putting double-quote characters around the fragment. The preprocessor backslash-escapes the quotes surrounding embedded string constants, and all backslashes within string and character constants, in order to get a valid C string constant with the proper contents. Thus, stringifying p = "foo\n"; results in "p = \"foo\\n\";". However, backslashes that are not inside string or character constants are not duplicated: ‘\n’ by itself stringifies to "\n".
All leading and trailing whitespace in text being stringified is ignored. Any sequence of whitespace in the middle of the text is converted to a single space in the stringified result. Comments are replaced by whitespace long before stringification happens, so they never appear in stringified text.
There is no way to convert a macro argument into a character constant.
If you want to stringify the result of expansion of a macro argument, you have to use two levels of macros.
     #define xstr(s) str(s)
     #define str(s) #s
     #define foo 4
     str (foo)
          ==> "foo"
     xstr (foo)
          ==> xstr (4)
          ==> str (4)
          ==> "4"
s is stringified when it is used in str, so it is not macro-expanded first. But s is an ordinary argument to xstr, so it is completely macro-expanded before xstr itself is expanded (see Argument Prescan). Therefore, by the time str gets to its argument, it has already been macro-expanded.

Concatenation

It is often useful to merge two tokens into one while expanding macros. This is called token pasting or token concatenation. The ‘##’ preprocessing operator performs token pasting. When a macro is expanded, the two tokens on either side of each ‘##’ operator are combined into a single token, which then replaces the ‘##’ and the two original tokens in the macro expansion. Usually both will be identifiers, or one will be an identifier and the other a preprocessing number. When pasted, they make a longer identifier. This isn't the only valid case. It is also possible to concatenate two numbers (or a number and a name, such as 1.5 and e3) into a number. Also, multi-character operators such as += can be formed by token pasting.
However, two tokens that don't together form a valid token cannot be pasted together. For example, you cannot concatenate x with + in either order. If you try, the preprocessor issues a warning and emits the two tokens. Whether it puts white space between the tokens is undefined. It is common to find unnecessary uses of ‘##’ in complex macros. If you get this warning, it is likely that you can simply remove the ‘##’.
Both the tokens combined by ‘##’ could come from the macro body, but you could just as well write them as one token in the first place. Token pasting is most useful when one or both of the tokens comes from a macro argument. If either of the tokens next to an ‘##’ is a parameter name, it is replaced by its actual argument before ‘##’ executes. As with stringification, the actual argument is not macro-expanded first. If the argument is empty, that ‘##’ has no effect.
Keep in mind that the C preprocessor converts comments to whitespace before macros are even considered. Therefore, you cannot create a comment by concatenating ‘/’ and ‘*’. You can put as much whitespace between ‘##’ and its operands as you like, including comments, and you can put comments in arguments that will be concatenated. However, it is an error if ‘##’ appears at either end of a macro body.
Consider a C program that interprets named commands. There probably needs to be a table of commands, perhaps an array of structures declared as follows:
     struct command
     {
       char *name;
       void (*function) (void);
     };
     
     struct command commands[] =
     {
       { "quit", quit_command },
       { "help", help_command },
       ...
     };
It would be cleaner not to have to give each command name twice, once in the string constant and once in the function name. A macro which takes the name of a command as an argument can make this unnecessary. The string constant can be created with stringification, and the function name by concatenating the argument with ‘_command’. Here is how it is done:
     #define COMMAND(NAME)  { #NAME, NAME ## _command }
     
     struct command commands[] =
     {
       COMMAND (quit),
       COMMAND (help),
       ...
     };

Variadic Macros

A macro can be declared to accept a variable number of arguments much as a function can. The syntax for defining the macro is similar to that of a function. Here is an example:
#define eprintf(...) fprintf (stderr, __VA_ARGS__) This kind of macro is called variadic. When the macro is invoked, all the tokens in its argument list after the last named argument (this macro has none), including any commas, become the variable argument. This sequence of tokens replaces the identifier __VA_ARGS__ in the macro body wherever it appears. Thus, we have this expansion:
eprintf ("%s:%d: ", input_file, lineno) ==> fprintf (stderr, "%s:%d: ", input_file, lineno) The variable argument is completely macro-expanded before it is inserted into the macro expansion, just like an ordinary argument. You may use the ‘#’ and ‘##’ operators to stringify the variable argument or to paste its leading or trailing token with another token. (But see below for an important special case for ‘##’.)
If your macro is complicated, you may want a more descriptive name for the variable argument than __VA_ARGS__. CPP permits this, as an extension. You may write an argument name immediately before the ‘...’; that name is used for the variable argument. The eprintf macro above could be written
#define eprintf(args...) fprintf (stderr, args)
using this extension. You cannot use __VA_ARGS__ and this extension in the same macro.
You can have named arguments as well as variable arguments in a variadic macro. We could define eprintf like this, instead:
#define eprintf(format, ...) fprintf (stderr, format, __VA_ARGS__)
This formulation looks more descriptive, but unfortunately it is less flexible: you must now supply at least one argument after the format string. In standard C, you cannot omit the comma separating the named argument from the variable arguments. Furthermore, if you leave the variable argument empty, you will get a syntax error, because there will be an extra comma after the format string.
eprintf("success!\n", ); ==> fprintf(stderr, "success!\n", ); GNU CPP has a pair of extensions which deal with this problem. First, you are allowed to leave the variable argument out entirely:
eprintf ("success!\n") ==> fprintf(stderr, "success!\n", );
Second, the ‘##’ token paste operator has a special meaning when placed between a comma and a variable argument. If you write
#define eprintf(format, ...) fprintf (stderr, format, ##__VA_ARGS__)
and the variable argument is left out when the eprintf macro is used, then the comma before the ‘##’ will be deleted. This does not happen if you pass an empty argument, nor does it happen if the token preceding ‘##’ is anything other than a comma.
eprintf ("success!\n") ==> fprintf(stderr, "success!\n");
The above explanation is ambiguous about the case where the only macro parameter is a variable arguments parameter, as it is meaningless to try to distinguish whether no argument at all is an empty argument or a missing argument. In this case the C99 standard is clear that the comma must remain, however the existing GCC extension used to swallow the comma. So CPP retains the comma when conforming to a specific C standard, and drops it otherwise.
C99 mandates that the only place the identifier __VA_ARGS__ can appear is in the replacement list of a variadic macro. It may not be used as a macro name, macro argument name, or within a different type of macro. It may also be forbidden in open text; the standard is ambiguous. We recommend you avoid using it except for its defined purpose.
Variadic macros are a new feature in C99. GNU CPP has supported them for a long time, but only with a named variable argument (‘args...’, not ‘...’ and __VA_ARGS__). If you are concerned with portability to previous versions of GCC, you should use only named variable arguments. On the other hand, if you are concerned with portability to other conforming implementations of C99, you should use only __VA_ARGS__.
Previous versions of CPP implemented the comma-deletion extension much more generally. We have restricted it in this release to minimize the differences from C99. To get the same effect with both this and previous versions of GCC, the token preceding the special ‘##’ must be a comma, and there must be white space between that comma and whatever comes immediately before it:
#define eprintf(format, args...) fprintf (stderr, format , ##args)
See Differences from previous versions, for the gory details.


About Double Parentheses:
If you see something like "SOME_MACRO (("%s, %d, %u", arg1, arg2, arg3));", you could probably define the "SOME_MACRO" as the following to use on it functions like printf():
#define SOME_MACRO(x) \
    printf x;
Otherwise, use the "Variadic Macros" described above.

Reference:

https://gcc.gnu.org/onlinedocs/cpp/Macros.html#Macros
http://www.tutorialspoint.com/cprogramming/c_preprocessors.htm
http://stackoverflow.com/questions/5812877/why-one-needs-two-brackets-to-use-macros-in-c-c

沒有留言:

張貼留言