To make this blog useful (I hope), I include tips and bits of advice. But the initial motivation was wanting to write about programming, or software in general: topics too technical for my regular blog. In particular, I wanted a place to document past projects, ideas and personal (but sharable) thoughts about coding. I wanted a place to tell stories.

Today I have a story to tell about a programming project I’m working on.

I’ve been idly toying with the design of a new computer language I call BOOL (Basic Object-Oriented Language). This past week or so I’ve been working on what you might call an assembler — an app that reads input describing how to build some objects, builds them and then writes an output of the build.

The objects the app builds fall into nine or so meta-types (implemented as classes). Each object, regardless of meta-type, is unique due to the data populating it.

It takes at least two lines (commands) of input (which is text and line-based) to build an object: one to begin its creation; one to end it and emit the object to the output. Other commands configure the object in various ways during its construction.

The commands are not contiguous in the input; object building is nested. A given object may have sub-objects that must be built before the parent object can be completed.
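
To make the nesting concrete, here is a minimal Python sketch. The command names (`begin`, `set`, `end`), the meta-type names and the dict-based objects are all invented for illustration; they are not BOOL's actual input format. A stack tracks the objects under construction, so a sub-object is always finished and emitted before its parent.

```python
def assemble(lines):
    """Build nested objects from line-based commands; return them in emit order."""
    stack = []      # objects currently under construction, innermost last
    emitted = []    # finished objects, in the order they were completed
    for line in lines:
        parts = line.split()
        if not parts:
            continue                      # skip blank lines
        command, args = parts[0], parts[1:]
        if command == "begin":            # hypothetical: begin <meta-type> <name>
            obj = {"type": args[0], "name": args[1], "props": {}, "children": []}
            if stack:
                # Record the new object as a sub-object of its parent.
                stack[-1]["children"].append(obj["name"])
            stack.append(obj)
        elif command == "set":            # hypothetical: set <key> <value>
            stack[-1]["props"][args[0]] = args[1]
        elif command == "end":            # finish the innermost object and emit it
            emitted.append(stack.pop())
        else:
            raise ValueError(f"unknown command: {command}")
    if stack:
        raise ValueError("input ended with unfinished objects")
    return emitted


source = """
begin model Point
set doc 2D-point
begin member x
set kind integer
end
end
"""
for obj in assemble(source.splitlines()):
    print(obj["type"], obj["name"])
```

In this sketch the member `x` is emitted before the model `Point`, because the parent cannot be completed until its sub-object has been.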

The assembler design (in Python) consists basically of two classes: a Lexer class that tokenizes the input file, and a Builder class that takes a Lexer as input and emits the build. The Builder has methods implementing “agents” responsible for building each class of object.
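
A rough sketch of that two-class shape might look like the following. Only the names `Lexer` and `Builder` come from the real design; the token format, the `agent_` naming scheme, the `model` and `action` meta-types, and the one-line commands are assumptions made to keep the example short.

```python
class Lexer:
    """Turns line-based source text into (command, args) tokens."""

    def __init__(self, text):
        self.text = text

    def __iter__(self):
        for line in self.text.splitlines():
            parts = line.split()
            if parts:                     # ignore blank lines
                yield parts[0], parts[1:]


class Builder:
    """Consumes a Lexer and emits finished objects."""

    def __init__(self, lexer):
        self.lexer = lexer
        self.output = []

    def build(self):
        for command, args in self.lexer:
            # Find the agent method for this command, e.g. "model" -> agent_model.
            agent = getattr(self, "agent_" + command, None)
            if agent is None:
                raise ValueError(f"no agent for command: {command}")
            agent(args)
        return self.output

    # Hypothetical agents, one per meta-type.
    def agent_model(self, args):
        self.output.append(("model", args[0]))

    def agent_action(self, args):
        self.output.append(("action", args[0]))


build = Builder(Lexer("model Point\naction move\n")).build()
print(build)
```

The point of the shape is that the agents are plain methods on one object: in goes a Lexer, out comes a build.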

And here’s where I took a wrong turn.

I had a working app that I considered nearly done when it struck me: each agent builds an object of a specific class, so wouldn’t it be cool if each agent were a sub-class of a base Agent class? Perhaps agents could sub-class the very meta-type classes for the objects they were building! Or, at the least, the Agent class hierarchy could parallel the meta-type hierarchy and isolate it from the Builder.

All great ideas. Under other circumstances, in fact, this would probably be a very solid design decision. Specialization, encapsulation, isolation: these are usually very good design goals. It really all boils down to that last word: isolation.

The more you isolate parts of your code from other parts, the more you can mess with parts without affecting other parts. That’s one of the huge deals behind object-oriented design: isolate the data and its operations inside an object.

The Builder itself was nicely isolated: in goes a Lexer (which had been fed input files), out comes a build of objects. It worked really well, and there are times when you should leave well enough alone and move on. (And times when you shouldn’t; it all depends.)

I also had nice isolation among the agent methods; there was a minimum of sharing of object data. The sharing was little enough that I thought I could do a design where each agent was an object of an Agent sub-class to further isolate object construction.

The punchline is that it wasn’t little enough. The agents needed too much state data about the over-all build. In particular, two agents built objects that many of the other agents needed to know about. (And those two primary agents needed to interact as well.)

The final design was offensive. I ended up passing in a big object (containing the shared state data) to the base class Agent. Each sub-class used that “frame” to access the shared data. Plus, two sub-classed Agents were special; those two primary agents mentioned above served other sub-classed Agents.
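
Roughly, the shape was something like this. It is a reconstruction for illustration, not the real code: `Frame`, `ModelAgent`, `MemberAgent` and the attribute names are all made up, and only one of the two primary agents is shown.

```python
class Frame:
    """Shared build state that every agent needs access to."""

    def __init__(self):
        self.current_model = None   # published by a "primary" agent
        self.output = []


class Agent:
    """Base class: every agent is handed the same frame."""

    def __init__(self, frame):
        self.frame = frame


class ModelAgent(Agent):
    """A primary agent: other agents depend on what it builds."""

    def build(self, name):
        self.frame.current_model = name
        self.frame.output.append(("model", name))


class MemberAgent(Agent):
    """Needs to know its owning model, so it reaches through the frame."""

    def build(self, name):
        owner = self.frame.current_model
        if owner is None:
            raise RuntimeError("member built outside of a model")
        self.frame.output.append(("member", owner, name))


frame = Frame()
ModelAgent(frame).build("Point")
MemberAgent(frame).build("x")
print(frame.output)
```

Even in this toy version, the agents are not really isolated: each one only works because it knows what another agent left in the frame.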

It was ugly. A kludge.

One clue was that the sub-classes ended up being more complicated than the original Builder methods. The new isolation among agents required extra effort in sharing data, even with the shared frame among them. And there was the ugliness of inter-class awareness.

In the original design, within the context of a single object instance, the different methods meshed smoothly. So what if two of those methods were a bit meatier and did more than the others? Those two methods just used object variables to publish the data other methods needed.
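
As a sketch of what that looks like (with invented method and attribute names, not the real ones), sharing state inside one object needs no extra machinery at all:

```python
class Builder:
    """All agents are methods on one object, so shared state is just `self`."""

    def __init__(self):
        self.current_model = None   # object variable published by a primary method
        self.output = []

    def build_model(self, name):
        # A primary method: it records what later methods need to know.
        self.current_model = name
        self.output.append(("model", name))

    def build_member(self, name):
        # An ordinary method: it simply reads the published variable.
        if self.current_model is None:
            raise RuntimeError("member built outside of a model")
        self.output.append(("member", self.current_model, name))


builder = Builder()
builder.build_model("Point")
builder.build_member("x")
print(builder.output)
```

There is no frame to pass around and no inter-class awareness; the object instance itself is the shared context.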

That design was a good compromise between isolation and clarity, efficiency and operation. Each method was fairly short (even the primary ones), so the Builder code was self-contained and not over-long. (The entire Builder class is just over 500 lines including blank and comment lines.)

In contrast, the class code for just the Agents was up to 260 lines before I realized my mistake (it was close to, but not completely, done at that point). The agent methods they were replacing amounted to 230 lines, so code size increased with no gain in function (or anything else).

The bottom line is that isolation does have a trade-off: the need to communicate with the outside world via some interface. At the least, that requires some extra machinery and complexity.

You have to balance the benefit of isolation with the cost of isolation.