When I was a callow young programmer learning the ropes and enjoying the sheer power of the C language, I got a bit carried away with the macro pre-processor’s ability to let you redefine the language.

As much fun as that is, and as much as it can make your source code look cool, it’s a really bad idea. At some point the folks in comp.lang.c read me the riot act about it, and they were right.

So don’t do most of this (some of it is okay). This is a lesson in an Anti-Pattern.

Back in the early 1990s I had a C header file, std.h (“standard definitions header”), that I included in every project I wrote. At this point I was a solo programmer, so I could get away with my bad habits.

Of course, the whole file is properly wrapped:

#ifndef STD_H
#define STD_H

/* ... file contents ... */

#endif

To make sure it’s only included once. This is especially important in “standard definition” files that many other files might include. All the header files you create should use this mechanism. (Obviously each defines its own symbol; only this file defines STD_H. The usual convention is the file name with symbol characters replaced by underbars, so std.h becomes STD_H.)
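To see what the guard buys you, here is a minimal single-file sketch that pastes the same guarded contents twice, as if two other headers had both included std.h. The STD_VERSION constant is a hypothetical stand-in for the real file contents, not something from the original header.

```c
/* First "inclusion" of std.h: STD_H is not yet defined, so this is compiled. */
#ifndef STD_H
#define STD_H

typedef unsigned char byte;
enum { STD_VERSION = 1 };    /* hypothetical stand-in for the file contents */

#endif

/* Second "inclusion": STD_H is now defined, so the preprocessor skips all of
   this. Without the guard, the repeated enum would be a redefinition error. */
#ifndef STD_H
#define STD_H

typedef unsigned char byte;
enum { STD_VERSION = 1 };

#endif

/* Uses the definitions, which exist exactly once. */
static int std_version(void)
{
    byte v = STD_VERSION;
    return v;
}
```

Remove the two #ifndef/#endif pairs and the compiler should complain about STD_VERSION being declared twice.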

So far, so good. In fact, so far, so required for good style.

As for the file contents, they begin with this next bit (this and its follow-up, or something like it, is likely to be found in just about any C standard definitions header):

#ifdef TRUE
#undef TRUE
#endif
#ifdef FALSE
#undef FALSE
#endif

It does something very important — not defining anything, but making sure the definitions TRUE and FALSE don’t already exist. They’ll get defined in the next bit.

Most software corporations have something like this section. These definitions create new datatype names that are synonyms for combinations of native ones. They allow the source code to be clearer about what it’s doing (and remember, clarity trumps everything).

typedef  unsigned char   byte;
typedef           short  data;
typedef  unsigned short  word;
typedef           long   bint;
typedef  unsigned long   bnum;
typedef           char*  cptr;
typedef           short  flag;
typedef  unsigned short  cnfg;
typedef  unsigned long   indx;
typedef           char   text[];

typedef enum {  TRUE = (1),  FALSE = (0) } boolean;
typedef enum { OFF = (0), ON = (1) } on_off;

/***/

The boolean enum near the end defines the TRUE and FALSE symbols and gives them a datatype.
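The reason the #undef step matters is easiest to see in a small sketch. Assume some earlier header had already defined TRUE and FALSE as macros (simulated here by the two #define lines, which are not part of std.h); the is_even helper is likewise hypothetical.

```c
/* Simulate an earlier header that defined TRUE and FALSE as macros. */
#define TRUE  1
#define FALSE 0

/* Without these lines the enum below would be preprocessed into
   enum { 1 = (1), 0 = (0) }, which does not compile. */
#ifdef TRUE
#undef TRUE
#endif
#ifdef FALSE
#undef FALSE
#endif

typedef enum { TRUE = (1), FALSE = (0) } boolean;

/* Hypothetical helper that returns the new boolean type. */
static boolean is_even(int n)
{
    return (n % 2 == 0) ? TRUE : FALSE;
}
```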

Most of the others, or something very similar (all CAPS in some cases), are common enough in standard definitions files. The cnfg and indx types are a bit suspect, and I’m not sure I used them very much. (One issue with lots of cool definitions is you forget some and never use them.)
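One caveat worth keeping in mind: a typedef only creates a synonym, not a distinct type. In the sketch below (the scale and misuse functions are hypothetical), data and flag are both plain short underneath, so the compiler accepts a flag wherever a data is expected. The names help the reader, not the type checker.

```c
typedef short data;
typedef short flag;

/* Intended to take a data value. */
static data scale(data value)
{
    return (data)(value * 2);
}

/* Passes a flag where a data is expected; this compiles without complaint
   because both names are synonyms for short. */
static data misuse(void)
{
    flag ready = 1;
    return scale(ready);
}
```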

The OFF and ON symbols, along with their on_off datatype, are a bit silly, too, but I did use them. All in all, this part of the file is okay; the next part, not so much. Try not to laugh.

§

In my defense, I never went this far:

#define BEGIN {
#define END   }

Which I actually saw in the standard definitions file for some BBS software. (I mean software to run a BBS, not software I saw on a BBS.) They apparently wanted their C code to look like Pascal.

Not that I can point fingers (this section of the file begins with the comment “Various new defines which also make the code easier to read.” — the “also” referring to an earlier comment):

#define OKAY            (0)
#define IS              ==
#define ISNT            !=
#define NOT             !
#define LT              <
#define ELT             <=
#define GT              >
#define EGT             >=
#define AND             &&
#define OR              ||
#define SHR             >>
#define SHL             <<
#define M_CAPS          (0x20)
#define MATCH(s,t)      (!strcmp(((char*)(s)),((char*)(t))))
#define NOMATCH(s,t)    (strcmp(((char*)(s)),((char*)(t))))

Defining AND and OR is pretty typical, and SHR and SHL aren’t too bad, but redefining the relational operators is a definite sin (pardon the vague pun). It needlessly and pointlessly redefines the language. (Worse, to my modern eye, that UPPERCASE stuff is ugly, but that was the convention for #define symbols then.)
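For what it's worth, standard C has shipped word spellings for the logical operators since the 1995 amendment, in the header <iso646.h> (and, or, not, not_eq, and a few others). It offers nothing for the relational operators. A minimal sketch, with hypothetical in_range and outside helpers:

```c
#include <iso646.h>   /* standard header: and, or, not, not_eq, ... */

/* The logical operators use the standard word spellings; the relational
   operators stay as plain C, since the header has no words for those. */
static int in_range(int x, int lo, int hi)
{
    return x >= lo and x <= hi;
}

static int outside(int x, int lo, int hi)
{
    return not in_range(x, lo, hi);
}
```

Whether the lowercase words read better than && and || is a matter of taste, but at least another C programmer can look them up in the standard.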

It gets better (or worse, I mean):

#define Forever         for(;;)
#define ifNOT(x)        if((x)==0)
#define isOK(x)         ((x)==(E_OKAY))
#define notOK(x)        ((x)!=(E_OKAY))
#define isGOOD(x)       ((x)>=(0))
#define isBAD(x)        ((x)<(0))
#define isNULL(x)       ((x)==(NULL))
#define notNULL(x)      ((x)!=(NULL))
#define onNULL(x,y)     if(((x)=(y))==(NULL))
#define onBIT(a,b)      if(((unsigned)(a))&((unsigned)(b)))
#define isBIT(a,b)      (((unsigned)(a))&((unsigned)(b)))
#define noBIT(a,b)      (!(((unsigned)(a))&((unsigned)(b))))
#define BIT_ON(a,b)     ((a)|=((unsigned)(b)))
#define BIT_OFF(a,b)    ((a)&=((unsigned)(~((unsigned)(b)))))

This is the bit that got me yelled at by the folks in comp.lang.c, and they were right: it’s sin without any redeeming qualities.

The Forever definition isn’t too bad, but the rest are crimes. You may well wonder what I was thinking. I’ll come back to that in a minute.
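One concrete hazard with the macros that hide an if (ifNOT, onNULL, onBIT) is that a later else can bind to the hidden if instead of the one the reader sees. A minimal sketch, using the onBIT definition from the listing and a hypothetical classify function:

```c
#define onBIT(a,b)      if(((unsigned)(a))&((unsigned)(b)))

/* The indentation suggests the else pairs with "if (enabled)". It does not:
   it pairs with the if hidden inside onBIT. Many compilers warn about this
   ambiguous else. */
static int classify(int enabled, unsigned mask, unsigned flag)
{
    int result = 0;

    if (enabled)
        onBIT(mask, flag)
            result = 1;
    else
        result = 2;

    return result;
}
```

With enabled set and the bit clear, classify returns 2; with enabled clear it returns 0, because the else never runs at all. A reader who trusts the indentation would expect the opposite.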

§

The rest of the file, two sections, isn’t too bad. The first section defines all the ASCII control characters, the most important of which are:

#define CTRL_H      (0x08)
#define BKSPC       (0x08)    /* backspace char */
#define CTRL_I      (0x09)
#define TAB         ('\t')    /* tab char */
#define CTRL_J      (0x0A)
#define LF          ('\n')    /* linefeed char */
#define NL          ('\n')    /* newline char */
#define CTRL_K      (0x0B)
#define VT          ('\v')    /* vertical tab char */
#define CTRL_L      (0x0C)
#define FF          ('\f')    /* form feed char */
#define CTRL_M      (0x0D)
#define CR          ('\r')    /* carriage return char */

But there were symbols for CTRL_A through CTRL_Z. As with CTRL_J having synonyms, I included many of the other well-known ASCII control code synonyms. For example, ACK (CTRL_F) and NACK (CTRL_U).
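The CTRL_x values aren't arbitrary: in ASCII, a control code is the corresponding letter with the top bits masked off. A short sketch of that relationship, assuming an ASCII execution character set; the CTRL macro and ctrl_code function are hypothetical and were not part of std.h:

```c
/* In ASCII, Ctrl-<letter> is the letter's code with only the low five bits kept. */
#define CTRL(c)  ((c) & 0x1F)

/* Returns the control code for an uppercase letter, e.g. 'H' gives 0x08. */
static int ctrl_code(char letter)
{
    return CTRL(letter);
}
```

So ctrl_code('J') gives 0x0A, the same value as '\n' on an ASCII system, and ctrl_code('F') gives 0x06, the ACK code.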

The LF, CR, and TAB symbols got the most use in my code, along with a few other important ones:

#define ESC         (0x1B)    /* escape char */
#define SPC         (0x20)    /* space char */
#define DQT         (0x22)    /* double-quote char */
#define SQT         (0x27)    /* single-quote char */

The final section defines global error codes and gives them a datatype, ecode.

I won’t list these all, either, but the key ones are:

typedef enum
{
    E_ERROR   = (-1)    ,   /* Basic error code (ec<1).     */
    E_CLOUDY  = (-1)    ,
    E_OKAY    = (OKAY)  ,   /* Standard "A-Okay" return.    */
    E_CLEAR   = (1)     ,
    E_BLUE    = (39)    ,   /* BLUE is a positive value >1. */
 
    E_HELP   = (-0x7FFF),   /* Request for "Help" return.   */
    E_SYSTEM = (-0x6FFF),   /* Not part of the program.     */
    E_PROGRAM           ,   /* Program run-time errors.     */
    E_PROCESS           ,   /* Invalid input or output.     */
    E_MALLOC            ,   /* Global malloc() failure.     */
    E_NOTFOUND          ,   /* Global NOT FOUND error.      */
    E_OVRFLO            ,   /* Global OVERFLOW error.       */
    E_UNDRFLO           ,   /* Global UNDERFLOW error.      */
    E_FORMAT            ,   /* Global FORMAT error.         */
    E_SYNTAX            ,   /* Global SYNTAX error.         */
    E_BADVALUE          ,   /* Global BAD VALUE error.      */
    /* ... many more ... */
    E_SWX = (-0x57FF)   ,   /* SWITCHES error code base.    */
    E_NOARGS            ,   /* No names on command line.    */
    E_NBRARGS           ,   /* Too many names on cmnd line. */
    E_UNKSWX            ,   /* Unknown switch.              */
    E_SWXVAL            ,   /* Invalid value for a switch.  */
 
    E__USER = (-0x1FFF) ,   /* USER error code base.        */
}
    ecode;

Those first codes reveal, once again, my fundamental silliness. The ones at the end are for handling command line syntax errors.
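The enum relies on two C rules: an enumerator without an explicit value is one more than the previous one, and two enumerators may share a value. A cut-down sketch using a few of the names from the listing (OKAY is written out as 0, and the is_failure helper is hypothetical):

```c
typedef enum
{
    E_ERROR   = (-1)     ,   /* explicit value                        */
    E_CLOUDY  = (-1)     ,   /* same value as E_ERROR, which is legal */
    E_OKAY    = (0)      ,
    E_SYSTEM  = (-0x6FFF),   /* base of a block of error codes        */
    E_PROGRAM            ,   /* implicitly E_SYSTEM + 1 = -0x6FFE     */
    E_PROCESS            ,   /* -0x6FFD                               */
    E_MALLOC                 /* -0x6FFC                               */
}
    ecode;

/* Hypothetical helper: every code below E_OKAY counts as a failure. */
static int is_failure(ecode ec)
{
    return ec < E_OKAY;
}
```

Each explicit base value starts a new block, and the codes after it count upward from there while still staying negative.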

§ §

My intentions were good. Even back then I wanted to write readable, expressive code. I loved that I could write near-English C code:

Forever {
    onBIT(file_mask, save_flag) {
        if (isOK(SaveFile(file_name))) {
            /* ... do stuff ... */
        }
        if (isNULL(file_handle)) {
            if (x IS 0 AND y LT 0) {
                /* ... do stuff ... */
            }
        }
    }
}

You get the idea. I still think it looks kinda cool, but it doesn’t look much like C. That’s the problem. Any other programmer had to learn this new language to understand my code. I could claim a kind of generic readability, but one man’s “Duh!” is another man’s “Huh?” One size never fits all.
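For comparison, here is roughly the same control flow in plain C. This is only a sketch: SaveFile is a hypothetical stub, E_OKAY is written as a plain constant, the "do stuff" bodies are replaced by a counter, and the loop breaks after one pass so the function can return.

```c
#include <stddef.h>

#define E_OKAY 0

/* Hypothetical stand-in for the real SaveFile(): succeeds for any non-NULL name. */
static int SaveFile(const char *name)
{
    return (name != NULL) ? E_OKAY : -1;
}

/* Counts how many of the "do stuff" blocks would have run in one pass. */
static int run_once(unsigned file_mask, unsigned save_flag,
                    const char *file_name, const void *file_handle,
                    int x, int y)
{
    int steps = 0;

    for (;;) {                                   /* Forever */
        if (file_mask & save_flag) {             /* onBIT(file_mask, save_flag) */
            if (SaveFile(file_name) == E_OKAY) { /* isOK(...) */
                steps++;
            }
            if (file_handle == NULL) {           /* isNULL(file_handle) */
                if (x == 0 && y < 0) {           /* x IS 0 AND y LT 0 */
                    steps++;
                }
            }
        }
        break;   /* the original loops forever; this sketch stops after one pass */
    }
    return steps;
}
```

It is not much longer, and any C programmer can read it without a glossary.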

It did make for some fun, both in writing the code and in how it looked. I don’t exactly regret it, but my grin has a lot of chagrin. I always did like inventing new computer languages (see many other posts here), but it’s best confined to truly new languages, not reinventing the wheel.

Ø