Programmers, like carpenters, are builders — we make things. The work can be for pay, but carpenters, for example, can build their own bookshelves and doghouses. Programmers also make software for themselves, sometimes to amuse, sometimes to provide a useful function.
A few of the apps I created for myself over the years turned out to be major workhorses for me — tools I used frequently. One of the earliest was PF.EXE, my Process File utility.
This is back in the late 1980s and early 1990s. I used, both at work and at home, PCs with MS-DOS. Windows 3.x happened in the early 1990s; by 1995 it was Windows 95. Especially as a programmer, a lot of the work I was doing was at the MS-DOS level, not the Windows level.
Text has always figured heavily in both my hobby and professional work (especially the latter), so the need to manipulate text files in various ways was important. It became even more important when I began using Unix at work a lot.
One bane of computer life is how text files have different line-end protocols among the three major platforms. On Unix platforms, a line-end is a LINEFEED (LF) character (code value 10). On Apple platforms, a line-end is (or was) a CARRIAGE RETURN (CR) character (code value 13). And on MS-DOS, it’s both: a CR followed by an LF.
[The modern equivalent, based on what I see around the web, is probably the mismatch between the old CP-1252 text encoding and Unicode. I see a lot of â€™ and â€” sequences where there should be ’ and — characters.]
Another common programmer annoyance involves TAB characters. Some (like me) avoid them like the plague; some deluded fools use them (some are oblivious to the difference and wonder why their indenting looks crazy sometimes). In any event, it’s nice to have a utility that converts TAB characters to some number of spaces. (A smart one can convert groups of leading spaces to TAB chars!)
§
The point is, programmers need to work like this with text files a lot. Sometimes we just want to count the words or lines of a file; sometimes we want to make changes. It’s an everyday thing.
In the Unix world, there are lots of standard utilities that accomplish a lot of this. The awk and sed utilities, to name just two, offer a huge amount of power. (The wc utility exists solely and explicitly to count the characters, words, and lines in a file.)
But the MS-DOS world doesn’t have any of those tools, so I had to “roll my own” file processing tools. Since the need for them is so great, it was something I began doing early. By the 1990s it was standard practice.
Which brings me to PF.EXE, version 2.30, from 1992.
(I found an old ZIP file in an archive. I’m amazed I found this later version. The earliest versions are lost in the dusts of 3.5″ floppy history.)
§
Here’s the help screen you’d get with PF /?…
PF v2.30, May 92 == usage: PF infilename [outfilename] [switch(es)]
Output defaults to stdout if outfilename not specified.
For switches, the first two characters are required (switch names are not case-sensitive).
Switches may appear anywhere on the on the command line, but processing follows command line order.
Use / or - as switch char.
Values can be: /SWX={number}, /SWX@{character}, /SWX:{string}
Numbers can be: 0xhh (hex), ddd (decimal) or 0ooo (octal).

    -HXdump      Hexidecimal Display
    -LNumbr      Add Line Numbers
    -CCtr        Count Characters
    -XTabs[=X]   Expand {TAB} to SPACES
    -XCr         Expand {CR} to {CR}{LF}
    -XLf         Expand {LF} to {CR}{LF}
    -LFonly      Reduce {CR}{LF} to {LF}
    -CRonly      Reduce {CR}{LF} to {CR}
    -EOl=X       Convert {X} to {CR}{LF}
    -CTrl=M      Control Chars -> ASCII
    -GRfx=M      IBM GFX Chars -> ASCII
    -TRan=X,Y    Translate {X} to {Y}
    -STrip=X     Strip Char {X}
    -ESc@C       Convert 'C' to {ESC}
    -NAnsi       Remove ANSI sequences
    -UCase       Convert to UPPER CASE
    -LCase       Convert to lower case
    -EBcdic      IBM EBCDIC to ASCII
    -RCase       Reverse Case of Alphas
    -ROT13       Rotate Alphas 13 Chars
    -WRap=L      Wrap at L Columns
    -BLock       80 Character Blocks
    -CLip=F,T    Clip from F to T cols
    -CUt=F,T     Cut from F to T lines
    -COmpress    Remove Extra Spaces (>1)
    -7Bit        Make Bit-8 Always 0
    -ANd=X       AND With {X}
    -OR=X        OR With {X}
    -XOr=X       XOR With {X}
    -ENcode:S    Encode Using String S
    -FASC=L      Find ASCII Strings
    -QUotes      Extract Quoted Text
    -REms=m      Extract Code Comments
    -SCan        Scan for () {} [] pairs
    -CHar@C      Test For Char 'C'
    -TEst=X      BIT TEST With {X}

    /HDR         Add header/trailer
    /OWrite      Overwrite output file
    /APpend      Append to output file
    /INline      Take input from stdin
    /?Ctrl       Ctrl Modes Help Display
    /DOts        Show 'work' dots
    /IBuff=B     Set input buffer size
    /OBuff=B     Set output buffer size
    /?Grfx       Grfx Modes Help Display
So by version 2.30, it was a pretty full-featured little utility. I’d been adding capabilities as I needed them for several years.
§
Most of the actual processing routines were written in 808x assembler. I did a fair amount of that back in those days — it was the best way to have full command of the system. Only the program shell was in C.
Object-oriented C (and object-oriented assembler), no less! By the 1990s I was pretty serious about object-oriented design, and had found ways to use its techniques in plain old C (and in assembler).
FWIW, here’s the PF.C file in all its glory:
/****************************************************************************
*****                         PROCESS FILE II                           *****
*****                          version 2.30                             *****
*****                           May 3, 1992                             *****
****************************************************************************/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <std.h>
#include <abend.h>
#include <inpargs.h>
#include "pf.h"

extern o_JobList  jobList ;
extern o_Bridge   charBridge ;
extern swx        aJobSwitches [] ;
extern emsg       aErrorMsgs [] ;
extern char*      linef [] ;
extern swxjob     SwxHelp ;

#define HDR_DLINE linef [0]
#define HDR_SLINE linef [1]
#define HDR_TEXT  linef [2]
#define TLR_TEXT  linef [3]

/****************************************************************************
*****                              DATA                                 *****
****************************************************************************/
ABEND*   pErrs = NULL;                  /* handle->oErrHandler */
inpargs* pCmndLine = NULL;              /* handle->oCommandLine */

static cptr aFileNames [MAX_ARGNMS];    /* Table of Cmd Line Names */
static cptr pInFileName = NULL;
static cptr pOutFileName = NULL;

void main(int argc, char *argv [])
{
    indx dots=0;                        /* Next show is at: xxx */

    /*-------------------------------------------------------*/
    /* Initialize the ErrorHandler and InputArgs objects.    */
    /*-------------------------------------------------------*/
    pErrs = INIT_ABEND(NULL, aErrorMsgs, NULL);
    pCmndLine = INIT_inpargs(NULL, aJobSwitches, MAX_ARGNMS, aFileNames);

    /*-------------------------------------------------------*/
    /* Process the command line (input args).                */
    /*-------------------------------------------------------*/
    pCmndLine->GetArgs(pCmndLine, argc, argv);
    if(isBAD(pCmndLine->status))
        ERR_EXIT(pCmndLine->status, pCmndLine->pErrText);
    if(aFileNames[0][0] EQU '?')
        SwxHelp(pCmndLine, X_HELP);

    /*-------------------------------------------------------*/
    /* Open Input and Output and initialize buffers          */
    /*-------------------------------------------------------*/
    if(SWXX_(pCmndLine, X_INLINE)) {
        pInFileName = NULL;
        if(pCmndLine->iNbrNames EQU 0)
            pOutFileName = NULL;
        else
            pOutFileName = aFileNames[0];
    }
    else switch(pCmndLine->iNbrNames) {
        case 0:
            ERR_EXIT( E_NOARGS, NULL );
            break;
        case 1:
            pInFileName = aFileNames[0];
            pOutFileName = NULL;
            break;
        default:
            pInFileName = aFileNames[0];
            pOutFileName = aFileNames[1];
            break;
    }
    ifNOT(pOutFileName) {
        ifNOT(SWXX_(pCmndLine, X_OBUFF))
            charBridge.outBufBlk.iLength = 20L;
        ++SWXX_(pCmndLine, X_PAUSE);
    }
    else {
        ++SWXX_(pCmndLine, X_HDR);
        ++SWXX_(pCmndLine, X_DOTS);
    }
    charBridge.Open(pInFileName, pOutFileName);
    jobList.Open();
    if(isBAD(jobList.status))
        ERR_EXIT(jobList.status, NULL);

    /*-------------------------------------------------------*/
    /* If /HDR switch is on, print the Header.               */
    /*-------------------------------------------------------*/
    if(SWXX_(pCmndLine, X_HDR)) {
        fputs(HDR_DLINE, stderr);
        fprintf(stderr, HDR_TEXT, (pInFileName?strupr(pInFileName):"stdin"));
        jobList.Print(stderr);
        fputs(HDR_SLINE, stderr);
    }

    /*-------------------------------------------------------*/
    Forever {
        /*---------------------------------------------------*/
        charBridge.Input();
        if(isBAD(charBridge.status))
            if(charBridge.status EQU E_EOF)
                break;
            else
                ERR_EXIT(charBridge.status, aFileNames[0]);
        /*---------------------------------------------------*/
        jobList.Exec(&(charBridge.pChr));
        if(isBAD(jobList.status))
            ERR_EXIT(jobList.status, NULL);
        /*---------------------------------------------------*/
        charBridge.Output(jobList.status);
        if(isBAD(charBridge.status))
            ERR_EXIT(charBridge.status, aFileNames[1]);
        /*---------------------------------------------------*/
        if(SWXX_(pCmndLine, X_DOTS) AND (++dots>255)) {
            fputc('.', stderr);
            dots=0;
        }
        /*---------------------------------------------------*/
        if(SWXX_(pCmndLine, X_PAUSE) AND (SpaceBarPause() EQU ESC))
            break;
        /*---------------------------------------------------*/
    }

    /*-------------------------------------------------------*/
    jobList.Close();
    if(isBAD(jobList.status))
        ERR_EXIT(jobList.status, NULL);
    charBridge.Close( );
    if(isBAD(charBridge.status))
        ERR_EXIT(charBridge.status, aFileNames[1]);

    /*-------------------------------------------------------*/
    /* If /HDR switch is on, print the Trailer.              */
    /*-------------------------------------------------------*/
    if(SWXX_(pCmndLine, X_HDR)) {
        fputs(HDR_SLINE, stderr);
        fprintf(stderr, TLR_TEXT, charBridge.inBufBlk.nChars
                                , charBridge.outBufBlk.nChars
                                , charBridge.inBufBlk.nLines
                                , charBridge.outBufBlk.nLines
                                );
        fputs(HDR_DLINE, stderr);
    }

    /*-------------------------------------------------------*/
    if(SWXX_(pCmndLine, X_DEBUG)) {
        fprintf(stderr, "Input buffer: @%p, %4u bytes \n",
                charBridge.inBufBlk.pBuff, charBridge.inBufBlk.iLength);
        fprintf(stderr, "Output buffer: @%p, %4u bytes \n",
                charBridge.outBufBlk.pBuff, charBridge.outBufBlk.iLength);
        fprintf(stderr, "PAUSE: %i, INLINE: %i, DOTS: %i, HDR: %i, OWR: %i\n",
                aJobSwitches[X_PAUSE ].xToggle
              , aJobSwitches[X_INLINE].xToggle
              , aJobSwitches[X_DOTS ].xToggle
              , aJobSwitches[X_HDR ].xToggle
              , aJobSwitches[X_OWRITE].xToggle
              );
    }
    exit(0);
}
/*eof*/
I had some weird ideas about macros and Hungarian Notation back in those days. I eventually outgrew those insanities.
Also, FWIW, here’s one of the assembly routines:
        NAME    PFLCASE
        PAGE    60,132
        %OUT    Assembling: PF_LCASE.ASM
        INCLUDE pf.hdr

@CODE
ExecLCase       LABEL   NEAR
        push    si              ; ---\
        mov     si, sp          ; SI -> stack_frame
        mov     si, [si+4]      ; Point to The Byte.
        mov     al, [si]        ; Get The Byte.
        cmp     al, 'A'         ; If it's below an 'A':
        jb      LC_1            ; . skipit-->
        cmp     al, 'Z'         ; Or above a 'Z':
        ja      LC_1            ; . skipit-->
        xor     al, CAPS_M      ; Else: Toggle the "CAPS" bit.
        mov     [si], al        ; And re-write The Byte.
LC_1:   @Zero   ax              ; EXIT (WR).
        pop     si              ; ---/
        ret                     ; ***** RETURN *****
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
PostLCase       LABEL   NEAR
        lea     ax, sLCase      ; Point to id text string.
        jmp     PostJob         ; GO POST IT==>
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
_SwxLCase       LABEL   NEAR
        lea     ax, LCase       ; Point to our job.
        jmp     AddJob          ; GO ADD IT==>
        PUBLIC  _SwxLCase
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
@DATA
        EVEN
LCase   job     <NULL, ExecLCase, PostLCase, NULL, FALSE, 0>
sLCase  db      " lcase",0
;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
@END
I kinda miss those days a little (but I mostly really don’t).
§
What’s most notable about this version of PF is that it abstracts the process and uses an object-oriented plug-in approach to processing files. There’s an overall framework that reads and writes the file, but what happens to the file data is dispatched to the selected routines.
I’ll pick up that idea in a future post.
∅
You might notice that, at one point early in my career, I actually had a #define that set EQU to == (and, yes, more experienced programmers absolutely read me the riot act on that one).
There are a number of other macro definitions in that C source code that I’m embarrassed about, too. The isBAD() macro, for instance.
What can I say. I got carried away with the possibilities of macro substitution and the ability to re-write the language. At least I never did this:
Which, IIRC, was part of the FIDONET code.