I recently discovered something I should have realized a long time ago: layers are less about modularity and more about isolation.
I've been working on a compiler that I'll describe as four different layers. It could be depicted graphically as follows:
Here's a rough and simplified description of each layer:
- Lexer and Parser - The lexer converts human-readable source text into tokens, which are then processed by the parser. The parser checks that the tokens conform to the specified context-free grammar and builds an abstract syntax tree (AST) used by later phases.
- Semantic Analysis - Examines the AST and confirms that variables are not used without being declared and that the operators and functions used exist for the specified types.
- Type Checker - Verifies that types are used appropriately and as declared, e.g., that strings are not compared to integers.
- Code Generator - Processes the AST and generates the appropriate machine code.
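As a rough illustration, the four layers above can be sketched as a chain of interfaces. Every name below (`Frontend`, `SemanticAnalyzer`, `TypeChecker`, `CodeGenerator`, `LayeredCompiler`, `Ast`, `MachineCode` as defined here) is a hypothetical stand-in, not the actual compiler's code, and the toy implementations do no real analysis.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-ins for the compiler's data types.
final class Ast {
    final List<String> nodes;
    Ast(List<String> nodes) { this.nodes = nodes; }
}

final class MachineCode {
    final List<String> instructions;
    MachineCode(List<String> instructions) { this.instructions = instructions; }
}

// One interface per layer described in the list above.
interface Frontend { Ast parse(String source); }            // lexer and parser
interface SemanticAnalyzer { Ast analyze(Ast ast); }        // declarations, operators
interface TypeChecker { Ast check(Ast ast); }               // type compatibility
interface CodeGenerator { MachineCode generate(Ast ast); }  // AST to machine code

// The layers are chained: each one consumes the output of the one before it.
final class LayeredCompiler {
    private final Frontend frontend;
    private final SemanticAnalyzer analyzer;
    private final TypeChecker typeChecker;
    private final CodeGenerator generator;

    LayeredCompiler(Frontend frontend, SemanticAnalyzer analyzer,
                    TypeChecker typeChecker, CodeGenerator generator) {
        this.frontend = frontend;
        this.analyzer = analyzer;
        this.typeChecker = typeChecker;
        this.generator = generator;
    }

    MachineCode toCode(String input) {
        Ast parsed = frontend.parse(input);
        Ast analyzed = analyzer.analyze(parsed);
        Ast checked = typeChecker.check(analyzed);
        return generator.generate(checked);
    }
}

final class PipelineDemo {
    // Toy layers: split on whitespace, accept everything, echo nodes as instructions.
    static LayeredCompiler toyCompiler() {
        return new LayeredCompiler(
            source -> new Ast(Arrays.asList(source.trim().split("\\s+"))),
            ast -> ast,
            ast -> ast,
            ast -> new MachineCode(ast.nodes));
    }

    public static void main(String[] args) {
        System.out.println(toyCompiler().toCode("push 1 push 2 add").instructions);
    }
}
```

Note that the only public entry point in this sketch is `toCode`, so reaching the code generator means passing through every earlier layer first.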
And there you have it -- a nice layered architecture, or so I thought. What I had really created was a layered mountain, one that required me to scale each prior layer in order to work with the next. Each layer was its own module, but it was mildly coupled to the next or, worse, to both the next and the previous layers.
Layering Is Not Enough
TDD alone didn't solve the problem. In this case it was quite easy to write tests, but they were really acceptance tests rather than unit tests.
Yeah, I thought layers were simple. I made sure each layer was its own module. My code was even easy to test:
    @Test
    void generatedCodeContainsExpectedPattern()
    {
        String input = /* ... */;
        MachineCode code = compiler.toCode(input);
        assertThat(code, behavesAsExpected());
    }
But I had failed. As time progressed, it became harder and harder to identify exactly where an error occurred. For a while I made excuses; after all, it wasn't my fault that I needed a compiler generator. Eventually I had to admit my failure.
Create Isolated Layers
I needed isolated layers, layers that permitted me to work independently of the others, layers that protected me from change, layers that could be examined individually. The result didn't look much different, but was considerably easier to work with:
Between each pair of layers that previously existed now stands an isolation layer with a single purpose: to isolate behavior and functionality.
For example, the foundation of my first code generator was a set of helper functions used to generate the correct code. These helper functions have now been simplified and augmented with a builder class. The builder allows me to test just the specific pieces of the code generation, without regard for my input. The helper functions now truly have a single responsibility. As long as the code generator calls the builder in the same way as my unit tests do, I am now guaranteed that the generated machine code will be correct.
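A minimal sketch of what such a builder might look like follows. The names `CodeBuilder`, `StackMachineBuilder`, and `SumGenerator`, along with the textual `PUSH`/`ADD` instruction set, are invented for illustration and are assumptions, not the real builder.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical isolation layer: the code generator talks only to this interface.
interface CodeBuilder {
    CodeBuilder loadConstant(int value);
    CodeBuilder add();
    List<String> build();
}

// One possible implementation, emitting a made-up textual instruction set.
final class StackMachineBuilder implements CodeBuilder {
    private final List<String> instructions = new ArrayList<>();

    @Override
    public CodeBuilder loadConstant(int value) {
        instructions.add("PUSH " + value);
        return this;
    }

    @Override
    public CodeBuilder add() {
        instructions.add("ADD");
        return this;
    }

    @Override
    public List<String> build() {
        // Return a copy so callers cannot mutate the builder's internal state.
        return new ArrayList<>(instructions);
    }
}

// Stand-in for the code generator: it only decides which builder calls to make.
final class SumGenerator {
    static List<String> generateSum(CodeBuilder builder, int left, int right) {
        return builder.loadConstant(left).loadConstant(right).add().build();
    }
}

final class BuilderDemo {
    public static void main(String[] args) {
        // The builder is exercised directly: no lexer, parser, or AST is involved.
        List<String> code = new StackMachineBuilder()
            .loadConstant(1)
            .loadConstant(2)
            .add()
            .build();
        System.out.println(code);
    }
}
```

In this sketch a unit test can drive `StackMachineBuilder` with exactly the calls the generator would make, so a failure points at the builder rather than at some earlier layer.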
My newly inserted isolation code was dead simple and almost trivial to understand, yet I reaped huge paybacks. Although I failed at first, in the end I learned something -- isolate my layers -- and that's a win.