Along with “always use less-than” comes another great piece of “always” advice, except this one is a prohibition, a “never” rather than an “always” (same thing, really, from a programmer’s point of view).
It has to do with never having literal values embedded in your code!
The less-than advice brooks no exceptions, but there are two exceptions to this one: the values zero and one.
But let’s back up and talk about literal values (usually called constants), why they don’t belong in your code (except, usually, zero and one), and what to do instead.
A constant, a literal value, is a hard-coded number or string that appears in your source code.
As an example, imagine we’re writing code that models a chessboard, which usually has eight rows and eight columns. It’s not uncommon when modeling a space to need to test for the edge.
Knowing the usual size of chessboards, it’s tempting to sprinkle occurrences of the number “8” throughout the code when testing for the greater edge (and presumably the lesser edge likewise involves the number “0”).
For instance, to test whether a legal move exists along a row:
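    def move_right(piece):
        # Assumes columns are numbered 0..7, so the next square is col + 1.
        return piece.col + 1 < 8

    def move_left(piece):
        return 0 < piece.col

Likewise, to test for legal moves along a column:

    def move_down(piece):
        # Same assumption for rows: the next square down is row + 1.
        return piece.row + 1 < 8

    def move_up(piece):
        return 0 < piece.row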
These return true if there is a free board square in the direction desired.
And they work fine so long as the board actually is 8×8. But what happens when the client decides to use a 10×10 board?
You have to find all the occurrences of “8” and change them. And you need to be sure you only change the ones that refer to row or column limits, not just any “8” you happen to find. Not all “8” values necessarily have to do with row or column limits!
(And you definitely don’t want to do simple search-and-replace, because that will affect numbers such as “18” and “88”!)
The use of “0” as the lesser limit seems reasonable, and it may well be. Zero is one of the allowed exceptions (see below for why). But what if the client decides they want a board with arbitrary borders?
Suppose they want a huge virtual board, like 500×500 or something, and they want a much smaller — size adjustable — “window” to that big board. So now our data object is a matrix of arbitrary size with arbitrary row and column limits!
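To make that concrete, here is a minimal sketch of such a window in Python. The `Window` class, its field names, and the sample numbers are all made up for illustration, and it assumes the right and bottom limits are exclusive. The point is only that every edge lives in a named field rather than in the test itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """Visible part of the board: top <= row < bottom, left <= col < right."""
    left: int
    right: int
    top: int
    bottom: int

    def contains(self, row: int, col: int) -> bool:
        # All four edges come from named fields, never from literals.
        return self.top <= row < self.bottom and self.left <= col < self.right


# Hypothetical 20x20 window onto a much larger board.
view = Window(left=100, right=120, top=240, bottom=260)

print(view.contains(row=250, col=119))  # True: the last visible column
print(view.contains(row=250, col=120))  # False: one past the right edge
```

Moving or resizing the window means building a new `Window`; none of the edge tests change.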
It’s clear that dynamic values need to be kept in some sort of variable. The trap is that it’s not always clear that seemingly static values also demand a variable (or, rather, a name).
But they do need a name, and ideally one whose value can’t be altered by running code.
There are two basic flavors that accomplish that:

- Compiler-level definitions, where the value is kept by the compiler and plugged in where needed.
- Actual variables with allocated space, but to which the compiler doesn’t allow writing.

Nearly all languages have a way to do one or the other. Some can do both.
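Python, for what it’s worth, has neither flavor in the strict sense, but it can approximate both. The sketch below is one way to do it, not the only one. The names `BOARD_SIZE`, `BoardLimits`, and `LIMITS` are invented for the example. `typing.Final` is enforced only by static type checkers, while a frozen dataclass rejects writes at run time.

```python
from dataclasses import dataclass, FrozenInstanceError
from typing import Final

# Approximates flavor 1: a name marked Final. A static type checker flags
# any reassignment; the interpreter itself does not stop it.
BOARD_SIZE: Final = 8


# Approximates flavor 2: real storage that refuses writes at run time.
@dataclass(frozen=True)
class BoardLimits:
    rows: int = BOARD_SIZE
    cols: int = BOARD_SIZE


LIMITS = BoardLimits()

try:
    LIMITS.rows = 10  # attempted write to a frozen instance
except FrozenInstanceError:
    print("LIMITS is read-only")
```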
§
The use of definitions or read-only variables requires naming the value they represent, and this can also be a trap for the unwary.
For example, never do this:
define EIGHT = 8
This will really suck when that 8 changes to a 10! Always provide a name that says what the constant is. In the case of our chessboard, as a first cut, perhaps something like:
    define MIN_ROW = 0
    define MAX_ROW = 8
    define MIN_COL = 0
    define MAX_COL = 8
Except that “row” could refer to how long a row is or to how many rows there are (which is how long a column can be). Something like this might be better:
    define LIMIT_LEFT = 0
    define LIMIT_RIGHT = 8
    define LIMIT_TOP = 0
    define LIMIT_BOTTOM = 8
Or you might define them in terms of west, east, north, south. The point is that the names are specific and unambiguous.
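Here is roughly how the edge tests from the start of the article might look once they use those names. It is a sketch under two assumptions of mine: squares are numbered from zero, and the right and bottom limits are exclusive. The `Piece` class and the `can_move_*` function names are hypothetical.

```python
from dataclasses import dataclass

# The only place the board's edges appear as numbers.
LIMIT_LEFT = 0
LIMIT_RIGHT = 8    # exclusive: valid columns are 0..7 (assumption)
LIMIT_TOP = 0
LIMIT_BOTTOM = 8   # exclusive: valid rows are 0..7 (assumption)


@dataclass
class Piece:
    row: int
    col: int


def can_move_right(piece: Piece) -> bool:
    return piece.col + 1 < LIMIT_RIGHT


def can_move_left(piece: Piece) -> bool:
    return LIMIT_LEFT < piece.col


def can_move_down(piece: Piece) -> bool:
    return piece.row + 1 < LIMIT_BOTTOM


def can_move_up(piece: Piece) -> bool:
    return LIMIT_TOP < piece.row


corner = Piece(row=0, col=7)  # top-right corner square
print(can_move_right(corner), can_move_left(corner))  # False True
print(can_move_down(corner), can_move_up(corner))     # True False
```

Switching to a 10×10 board now means editing two lines at the top, and no stray “8” elsewhere gets touched by accident.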
Note that, even if the values never change, there is great value to naming them anyway. This improves the semantics of your code, because it says what the values are for. This is the second reason to never use literals in code.
BTW: On the topic of Bad Definitions, never, never do this:
    #define BEGIN {
    #define END }
In, say, an attempt to turn a C-like language into a Pascal-like language. Never use definitions to change the nature or syntax of a language; that way lies madness.
§
The canonical advice, which I first heard from P.J. Plauger, is:
There shall be no literals in your code save 0 and 1 (or the empty string, “”) and you should view those with suspicion.
The empty string (“”) is the string equivalent of zero.
Why are zero and one allowed (but viewed with suspicion)? Because zero is so often used as a starting point:
    for (int ix=0; ix < LIMIT_RIGHT; ix++) {
        // ... stuff ...
    }
There are many other places where zero is a natural constant that will not vary. For example, when comparing two numbers, they are equal if their difference is zero. That will never change.
The value one is also allowed in many cases because it’s the natural increment. Counting often proceeds by +1 or -1. (But be aware of cases where the algorithm might want an adjustable increment, a “step-by” value.)
Note how in our chessboard example it’s possible the lower limit, which is naturally zero in most cases, might need to be adjustable. Even a zero sometimes should be a named value.
Essentially you need to consider the semantics of the value. Is it an arbitrary value that just happens to be zero or one? If so, make it a named value.
Or do you really, truly, honest-to-gosh, no fooling, literally, genuinely mean the one and only, authentic and original zero? If so, then use it proudly.
Likewise one. The question to ask is whether you can conceive of that value ever not being one. If you can’t, which is generally the case when iterating over every item of some container, then the literal is fine.
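A small Python sketch of the difference, with made-up names (`every_item`, `sampled`, `SAMPLE_STRIDE`): visiting every item is a case where the increment really is one, while sampling every n-th item is a step-by value that deserves a name.

```python
def every_item(items):
    # Visiting every item: the start is the true zero, the step the true one.
    return [items[ix] for ix in range(0, len(items), 1)]


# Hypothetical "step-by" value: arbitrary, so it gets a name.
SAMPLE_STRIDE = 3


def sampled(items, stride=SAMPLE_STRIDE):
    # Visits every stride-th item, starting from the first.
    return [items[ix] for ix in range(0, len(items), stride)]


squares = list(range(10))
print(every_item(squares))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(sampled(squares))     # [0, 3, 6, 9]
```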
§
It should be clear that this applies hugely to string values. Porting software from one human language to another (say, English to French) usually requires changing all the user-visible strings. The use of string bundles to make that easy is a whole other topic.
Suffice to say, define your strings in one place as variables or defined constants and use the names throughout your code.
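As a minimal illustration in Python (the message names and wording are invented for the example), the strings sit together at the top and the rest of the code refers to them only by name:

```python
# Hypothetical message definitions: the one place user-visible text lives.
MSG_ILLEGAL_MOVE = "That move is not allowed."
MSG_YOUR_TURN = "{player}, it is your turn."


def turn_prompt(player_name: str) -> str:
    # The code below the definitions mentions names, never the text itself.
    return MSG_YOUR_TURN.format(player=player_name)


print(turn_prompt("White"))  # White, it is your turn.
print(MSG_ILLEGAL_MOVE)      # That move is not allowed.
```

Translating or rewording a message then touches only the definitions.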
The bottom line is that, as you look through your source, all you should see (for the most part) is keywords, punctuation symbols, and names. There should be no numbers or strings anywhere but in their definition place.
All the values you use should have names describing what they do.
Your code will be more readable and more robust.