Kaleb Pederson's Blog: 2009

Tuesday, December 15, 2009

TDD: the Failed Panacea

Yes, I admit it. TDD is not a panacea; it fails as a cure-all. Some bigots might claim it is (and, yes, I probably sound like a bigot every now and then), but TDD isn't perfect.

Over the last few months I've been working on a compiler at Soph-Ware Associates. The compiler supports annotations mostly identical in syntax to Java's annotations. After a few hours using acceptance test driven development (ATDD), I had developed a nice structure for defining annotations using builders:

builder.withAnnotationName("AnnotationName")
    .addRequiredArgument("attributeId", DataTypes.STRING)
    .addOptionalArgument("answerId", DataTypes.INT, 42)
    .build();
// arguments can also be created with builders if desired

My next step was to query various data from the annotation to be used in the compilation process, but I couldn't find a way to refactor my previous definition and validation work to retrieve the needed data. All the information was there, sort of, but not in a way that was amenable to refactoring.

TDD Failure

TDD had failed me. Okay, unlikely. TDD cannot force me to take into consideration everything I need to consider. Had I spent a bit more time thinking about the information I would need later, and less about up-front validation, I might have realized that a different approach would have let me accomplish both quite easily. TDD didn't force me to use the validation algorithm I did. In fact, the more I think about it, the more I believe I failed to adhere closely to the single responsibility principle, as applied to functions.
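As a rough illustration of that separation (not the actual compiler code), the definition can be kept as plain, queryable data while validation lives in its own function that merely reads it. Every name below (ArgumentSpec, AnnotationDefinition, AnnotationValidator) is hypothetical and chosen only for this sketch:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: names and structure are assumptions, not the original design.

/** One declared argument: data only, no validation logic. */
final class ArgumentSpec {
    final String name;
    final String type;          // e.g. "STRING" or "INT"
    final boolean required;
    final Object defaultValue;  // null for required arguments

    ArgumentSpec(String name, String type, boolean required, Object defaultValue) {
        this.name = name;
        this.type = type;
        this.required = required;
        this.defaultValue = defaultValue;
    }
}

/** The definition only stores and exposes what was declared. */
final class AnnotationDefinition {
    final String name;
    private final Map<String, ArgumentSpec> arguments = new LinkedHashMap<>();

    AnnotationDefinition(String name) {
        this.name = name;
    }

    AnnotationDefinition add(ArgumentSpec spec) {
        arguments.put(spec.name, spec);
        return this;
    }

    /** Returns the declared argument, or null if none has that name. */
    ArgumentSpec argument(String argumentName) {
        return arguments.get(argumentName);
    }

    Collection<ArgumentSpec> arguments() {
        return arguments.values();
    }
}

/** Validation is a separate responsibility that reads the definition. */
final class AnnotationValidator {
    static List<String> validate(AnnotationDefinition definition, Set<String> supplied) {
        List<String> errors = new ArrayList<>();
        for (ArgumentSpec spec : definition.arguments()) {
            if (spec.required && !supplied.contains(spec.name)) {
                errors.add("missing required argument: " + spec.name);
            }
        }
        for (String suppliedName : supplied) {
            if (definition.argument(suppliedName) == null) {
                errors.add("unknown argument: " + suppliedName);
            }
        }
        return errors;
    }
}
```

With this shape, a later compilation phase could ask the definition for an argument's type or default value without going anywhere near the validation code.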

Even though TDD isn't the panacea I might like, at least I reap the benefits of its other strengths.

Friday, December 11, 2009

Featuritis: Cleaving to the Big Picture

A gray rubber mat sat in front of me as I figured out the rough dimensions needed so it would fit snugly in our car. Once I knew where the various slits and slots needed to be, I grabbed a pair of scissors and started to cut, being very attentive to cut in a straight line.

After cutting a few inches my eyes wandered up to the top of the gray mat. To my utter dismay and disgust, I found that I had indeed cut a straight line... that was off by about 20 degrees. I had been so focused on cutting a straight line that I hadn't confirmed that my line fit with the bigger picture.

Infected

Most programmers, myself included, seem to be infected with the featuritis virus, the one that eats at us, trying to convince us that we absolutely must build extraneous features or implement a design pattern that we think is necessary, only to have this new feature later sit stagnant and untouched. It's a vile little virus that wastes our time and diminishes our ability to meet the truly important goals.

Goals

The problem with featuritis is that it distracts us from our ultimate goal -- to develop, as quickly as possible, quality software that meets its users' needs and provides no unnecessary features. I'm going to concentrate on only one part of this -- meeting user needs.

If we build an unnecessary feature, the program will take longer to build than is necessary. If we must consistently retest some feature, we're wasting our time. If we must consistently work around a poor architecture, design, or implementation, we're wasting time.

Defeating Featuritis

So, how do you defeat the despicable virus? Acceptance test driven development, a form of top-down design. In brief:

Write an acceptance test first. Don't lose sight of the end-user need.
Write only the code necessary to satisfy that requirement. This will often entail writing many different unit tests, each of which exercises part of the user's required functionality. Be diligent; be strong. It's hard, but with the end goal, or big picture, in mind, progress is evident.
Refactor to follow the single responsibility principle (SRP). When you keep your code clean, it's much easier to spot when you start getting off on tangents.
Repeat the above steps as necessary.

The above is also known as acceptance test driven development (ATDD). A much richer explanation is available in Lasse Koskela's Test Driven. He has a sample chapter available that explains ATDD.
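To make the outside-in order concrete, here is a small sketch built around an invented "word count" feature. The feature, the class names, and the plain-Java check helper (used so the sketch needs no test framework) are all assumptions for illustration; it is not taken from Koskela's book:

```java
// Hypothetical example: a "word count" feature, driven from the outside in.

/** Production code, written only as far as the checks below require. */
final class WordCounter {
    /** Unit-level piece: split text into words on runs of whitespace. */
    static String[] words(String text) {
        String trimmed = text.trim();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    /** Entry point that the user-facing feature calls. */
    static String report(String text) {
        int count = words(text).length;
        return count + (count == 1 ? " word" : " words");
    }
}

/** Plain-Java checks standing in for acceptance and unit tests. */
final class WordCounterChecks {
    static void check(boolean condition, String message) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    // Step 1: the acceptance check states the end-user need and is written first.
    static void acceptance_userSeesWordCountForTheirText() {
        check(WordCounter.report("red green refactor").equals("3 words"),
              "user should see the word count");
    }

    // Step 2: unit checks drive out only the pieces needed to satisfy it.
    static void unit_blankTextHasNoWords() {
        check(WordCounter.words("   ").length == 0, "blank text has no words");
    }

    static void unit_wordsAreSplitOnAnyWhitespace() {
        check(WordCounter.words("a\tb  c").length == 3, "tabs and repeated spaces separate words");
    }

    static void runAll() {
        acceptance_userSeesWordCountForTheirText();
        unit_blankTextHasNoWords();
        unit_wordsAreSplitOnAnyWhitespace();
    }
}
```

The acceptance check stays red until the unit-level pieces exist, which is what keeps the end-user need in view while the smaller tests are being written.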

Remember the big picture, the ultimate goal. Whenever I forget, I ultimately regret my lack of foresight. When I get distracted by scenery, I don't end up where I want to go.

Tuesday, December 8, 2009

You Want to be Good, Eh?

Hindu Prince Gautama Siddhartha, founder of Buddhism, once said:

The mind is everything. What you think you become.

Another author whose name I no longer remember defined our individual character by the quality of our thoughts during our idle moments. Our thoughts form the connecting thread that binds these two statements together.

When idle, what do you spend most of your time thinking about? At work? At home? If we use our thoughts as a scale against which we measure ourselves, how do we fare?

Devotion Indicator

If I spend all my time doing "big design up front" (BDUF), thinking about its ramifications, and dealing with its subtleties, it's fair to expect that it's what I'm going to be good at. If I devote all my time to working on UML diagrams and designing architectures, it will become one of my areas of expertise. If I dedicate my concentration and focus to writing lots of code, my ability to pump out volumes of code will increase. But is this really what I want?

Quality Focus

Although some things aren't worth dedicating our time and energies to, quality deserves our most tenacious focus. Whether we're focusing on quality code, quality user experiences, or performance as quality, maintaining a focus on quality leads to skill improvement in that area -- "What you think you become."

So, you want to be good, eh? Then focus on quality. Study it. Learn it. Absorb it. You won't build a remarkable product nor a purple cow without it. With it, however, maybe you'll build the next Amazon.

Wednesday, December 2, 2009

Bad Layered Architecture

I recently discovered something I should have realized a long time ago -- Layers are less about modularity and more about isolation.

I've been working on a compiler that I'll describe as four different layers. It could be depicted graphically as follows:

Here's a rough and simplified description of each layer:

Lexer and Parser - The lexer converts standard human-readable text into tokens, which are then processed by the parser. The parser examines the tokens, makes sure they adhere to the specified context-free grammar, and builds an abstract syntax tree (AST) used in later phases.
Semantic Analysis - Examines the AST and confirms that variables are not used without being declared and that the operators and functions used exist for the specified types.
Type Checker - Verifies that types are used appropriately and as declared, e.g., that strings are not compared to integers.
Code Generator - Processes the AST and generates the appropriate machine code.

And there you have it -- a nice layered architecture, or so I thought. What I had really created was a layered mountain, one that required that I scale each prior layer in order to work with the next. Each layer was its own module, but it was mildly coupled to the next, or worse, to the next and the previous layers.

Layering Is Not Enough

TDD alone didn't solve the problem. In this case, it was quite easy to write tests, but the tests were basically acceptance tests rather than unit tests.

Yeah, I thought layers were simple. I made sure each layer was its own module. My code was even easy to test:

@Test
void generatedCodeContainsExpectedPattern()
{
   String input = /* ... */;
   MachineCode code = compiler.toCode(input);
   assertThat(code, behavesAsExpected());
}

But I had failed. As time progressed, it became harder and harder to identify exactly where an error occurred. For a while I justified myself; after all, it wasn't my fault I needed a compiler generator. I eventually had to admit my failure.

Create Isolated Layers

I needed isolated layers, layers that permitted me to work independently of the others, layers that protected me from change, layers that could be examined individually. The result didn't look much different, but was considerably easier to work with:

Between each layer that previously existed now stands an isolation layer with a single purpose -- to isolate behavior and functionality.

For example, the foundation of my first code generator was a set of helper functions that were used to generate the correct code. These helper functions have now been simplified and augmented with a builder class. The builder allows me to test just the specific pieces of the code generation, without regard for my input. The helper functions now truly have a single responsibility. As long as the code generator calls the builder in the same way as my unit tests do, I can be confident that the generated machine code will be correct.
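The shape of such an isolation layer might look something like the sketch below. It is not the builder from the compiler described here: the interface, the made-up PUSH/ADD instructions, and the tiny SumGenerator are all hypothetical, chosen only to show how each side can be tested without the other.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: the instruction set and class names are assumptions.

/** Isolation layer: the only way the code generator emits instructions. */
interface CodeBuilder {
    void loadConstant(int value);
    void add();
}

/** Real builder: turns each call into (made-up) instruction text. */
final class ListingCodeBuilder implements CodeBuilder {
    private final List<String> instructions = new ArrayList<>();

    public void loadConstant(int value) {
        instructions.add("PUSH " + value);
    }

    public void add() {
        instructions.add("ADD");
    }

    List<String> instructions() {
        return instructions;
    }
}

/** Test double: records which builder calls were made and nothing else. */
final class RecordingCodeBuilder implements CodeBuilder {
    final List<String> calls = new ArrayList<>();

    public void loadConstant(int value) {
        calls.add("loadConstant(" + value + ")");
    }

    public void add() {
        calls.add("add()");
    }
}

/** The generator only walks its input and talks to the builder. */
final class SumGenerator {
    /** Emits a left-to-right sum of the operands. */
    static void generate(int[] operands, CodeBuilder builder) {
        for (int i = 0; i < operands.length; i++) {
            builder.loadConstant(operands[i]);
            if (i > 0) {
                builder.add();
            }
        }
    }
}
```

The builder can be unit tested by calling loadConstant and add directly and inspecting the listing, while the generator can be tested against the recording double, so a failure points at one layer rather than at the whole pipeline.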

My newly inserted isolation code was dead simple and almost trivial to understand, yet I reaped huge paybacks. Although I failed at first, in the end I learned something -- isolate my layers -- and that's a win.

Wednesday, November 11, 2009

Overwhelming Complexity

Joel Spolsky recently wrote about how he "was reminded of why student projects, while laudatory, frequently fail to deliver anything useful," which made me reflect on some of my experiences as a student of computer science at Eastern Washington University.

I was lucky enough to take a number of independent study classes wherein I was able to "study" at my own pace while meeting regularly with my instructor. In one such class I was studying compiler design. Because of my love for programming, I essentially made it into a project class and agreed with my instructor to work through a number of chapters through the year.

Introducing Complexity

Although the first quarter went well, my progress quickly slowed in the second quarter. I was diligent in my use of time, so that wasn't the problem. I was reading through the material, and understanding the material, so that wasn't the problem. The problem was the complexity.

By the time I had all the basics of the compiler in place -- the lexer, parser, semantic analysis, and tree building processes all set up -- the complexity started to kill me. I no longer knew exactly where I should add the various pieces of functionality. My naming was sufficiently poor that I'd get confused about which piece did what; I'd sit there trying to figure out what line of code I could tweak, what class I could create, what design pattern I could use. I was attempting to take time to design it correctly, but didn't refactor to a reasonable design, because I didn't know how. The complexity became overwhelming.

Overcoming Complexity

Many projects fall into the trap of complexity. But in all but the worst of cases, there's a way out -- one line of code at a time. Michael Feathers' Working Effectively with Legacy Code details steps that can be taken to alleviate the complexity that builds up over time, but for greenfield projects TDD is a boon and plays a huge role in reducing complexity.

How complex can a project get if you can test and isolate all the pieces and each piece handles a single responsibility? Yes, software projects can still become complicated, but they should still be manageable and understandable. TDD helps keep the software organized, layered, managed, and testable. I highly recommend it.

Saturday, November 7, 2009

Inexperienced Quality

When I first started programming I concentrated on one thing: making my program work. Not only was that the only thing that I concentrated on, but it was the only thing I was taught. My studies in computer science didn't prepare me to program; rather, they taught theory with an occasional programming project.

Knowing nothing about proper design or clean code, I slowly added more and more functionality to my program; I'd insert a few lines here and a few lines there until my functions became long and tangled. Classes quickly bloated, becoming god classes with an overabundance of dependencies.

Over time, I learned more about design and started striving to always have clean code and follow the single responsibility principle. I eventually had an epiphany -- a stronger developer refactors the current design into an architecture that supports the newly required features and improves the design's quality attributes; the weaker developer stays with the current design no matter how messy the eventual solution might become.

Quality code doesn't come overnight, grow with a degree, or sprout with certifications; it comes with experience.

Image from: Chris Dalrymple's Moblog

Friday, October 23, 2009

Don't Architect, Refactor

The test-driven development mantra is "red, green, refactor," but we far too often let other things creep into the process. One of these things is domain knowledge.

Back in February, Gojko Adzic described his experience with Keith Braithwaite's "TDD as if you meant it" exercise. He was to use TDD to determine whether a stone in the game of Go could be taken or not. In the game of Go, each stone is placed on a grid and can be taken if it has only one liberty (i.e. free space) left in any of the four cardinal directions.

His first test was to identify stones with two corners covered as having two liberties. Yet, despite that, and armed with additional domain knowledge, his test started out like this:

GoGrid grid=new GoGrid(3,3);

See it? Yeah, the first and only line. It's a domain leak. And it's not just Gojko -- I do it all the time, inadvertently of course. Rather than starting off with the simplest thing that could possibly work, we have a tendency to inject domain knowledge into our code. Suddenly we find that we must create a few classes rather than a few short lines of code. We've made our job harder and started forcing a design that didn't exist nor evolve from the code. Our leaky domain knowledge introduces untested design, BDUF, but on a smaller, slightly less turgid scale.
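For contrast, a starting point closer to "the simplest thing that could possibly work" might look like the sketch below. It is not Gojko's code or the exercise's solution; the class and method names are hypothetical, and it deliberately assumes an interior point with four neighbours, ignoring board edges and connected groups.

```java
// Hypothetical sketch: no grid and no stone class yet, only the rule under test.
// Assumes an interior point with four neighbours; board edges and groups are ignored.
final class Liberties {
    /** Each flag says whether the neighbouring point in that direction is occupied. */
    static int count(boolean north, boolean east, boolean south, boolean west) {
        int occupied = 0;
        if (north) occupied++;
        if (east) occupied++;
        if (south) occupied++;
        if (west) occupied++;
        return 4 - occupied;
    }

    /** Follows the description above: a stone down to one liberty can be taken. */
    static boolean canBeTaken(boolean north, boolean east, boolean south, boolean west) {
        return count(north, east, south, west) == 1;
    }
}
```

A stone with two occupied neighbours, for instance Liberties.count(true, true, false, false), has two liberties. A grid class can be extracted later, during refactoring, if the tests ever demand one.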

With TDD, the design is supposed to evolve, yet I often find myself saying, "this is easy to code, but...." And the "but" kills me. It's usually something describing a design or architecture problem that I don't know how I'm going to solve, so I sit there thinking, and thinking, and thinking.

"Stop! Don't architect, refactor," I must tell myself. That is, I need to implement what I know, worrying very little about the design. Writing the initial code isn't about having the best, most-pure design the first time. The refactoring step is about cleaning up the design. It's also been said this way, "Make it work. Make it right. Make it fast."

Tuesday, October 20, 2009

Test-Driven Development Justified

I stood in shock, in utter awe and amazement. I was talking with one of the lead programmers of an open source project, with whom I'd been in discussions countless times over the last few years. He was a programmer who commanded my respect, and the respect of everybody else in our IRC community. He wrote software that was virtually bug free and did exactly what it was supposed to. He used the idioms of the language, appropriate patterns familiar to capable programmers, and consistently demonstrated a sound understanding of theory vetted with practicality. And yet, despite all of this, he didn't know what TDD was. To be fair to him, I bet he did, but he didn't recognize the acronym and asked me what it meant.

The TDD Tax

Misko Hevery recently mentioned a tax associated with testing. I personally read that as the tax associated with TDD, which may or may not be exactly what was intended. What exactly is that tax and how can we separate the necessary part of the tax from the unnecessary?

Let me frame the details of the tax in light of a commercial project that has essentially taken 1.5 years, full time.

Component	Lines	Chars	Chars/Line

Unit Tests
Backend	2,116	52,755	24.93
Model	5,972	168,279	28.18
Main	1,004	23,302	23.21
Utils	4,178	102,358	24.50
Widgets	8,747	251,869	28.79
Core	702	16,256	23.16
Support	157	3,272	20.84
Unit Test Total	22,876	618,091	27.02

Project Total (All Source)	75,478	1,939,605	25.70

The statistics above describe a desktop application written with Qt. Unit tests were written with Qt's QTestLib, which allows almost all interactions with widgets to be tested programmatically. Qt supports a model-view architecture, which also helped test the GUI components. The application consists of about 75 thousand lines of source code and 23 thousand lines of tests. The project actually should have more tests, but we weren't diligent enough at the time (and we regret it now).

Typing Tax

The first part of the TDD tax is the required extra typing. I'd like to believe that everybody knows that being the fastest typist in the world does not make them an uber-efficient, super effective programmer. Rather, typing speed has very little impact on the quality or effectiveness of a programmer.

How much is the typing tax? I type about 100 words per minute (WPM) for standard text, so let's assume that I can only type 60 when programming, because of all the extra special characters. Although WPM has different definitions, I'm going to use five characters as a word. Thus, 618,091 chars / 5 chars/word gives me 123,618.2 words. And 123,618.2 words / 60 WPM = 2,060.3 minutes. And 2,060.3 minutes / 60 minutes/hour gives 34.34 hours.

Yes, only 34.34 hours. I could type the entirety of my unit tests in only 34 hours... in less than one forty-hour work week. In table form:

Words @ 5 chars/word		123618.2
Minutes @ 60 WPM		2060.3
Total Hours:		34.34
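The same arithmetic can be written as a few lines of code, which makes it easy to try other typing speeds. The class and method names are made up for this sketch; the inputs (618,091 characters, five characters per word, 60 WPM) are the ones used above.

```java
/** Hypothetical helper reproducing the typing-tax arithmetic. */
final class TypingTax {
    /** Characters converted to "words" at a fixed characters-per-word ratio. */
    static double words(long chars, double charsPerWord) {
        return chars / charsPerWord;
    }

    /** Minutes needed to type that many words at the given speed. */
    static double minutes(double words, double wordsPerMinute) {
        return words / wordsPerMinute;
    }

    /** Total typing time in hours for the given character count. */
    static double typingHours(long chars, double charsPerWord, double wordsPerMinute) {
        return minutes(words(chars, charsPerWord), wordsPerMinute) / 60.0;
    }
}
```

Calling TypingTax.typingHours(618091, 5, 60) gives roughly 34.34 hours, matching the table.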

Unfortunately, I didn't keep track of time writing the tests, so I don't know what percentage of the time I spent writing tests and what percentage of the time I was developing production code. But, for the sake of this blog post, I'm going to assume 10% of my total development time was dedicated to writing tests.

That means that I spent about (1.5 * 52) * 40 / 10 = 312 hours writing tests. Of that time, about 34 hours, or roughly 11%, was spent typing. The other 277 hours or so, roughly 89%, went to other testing-related work.

Breaking it Down Further

Where else were those 277 hours spent? Some of that time was inevitably spent trying to figure out how I could test a certain part of the application. Should that be part of the TDD tax? Perhaps, but if I'm striving for quality, I am still going to test it somehow, aren't I?

In some cases it will be easy to test manually and difficult to test programmatically, such as with unit tests. In these cases, it could be said that I'm paying a tax.

I'm not sure how else I might be spending that time. I imagine some of that is spent deriving and discovering yet unknown requirements. Time is likely also spent working on some design related activities -- trying to discover how different pieces of the application can best fit together. Another small portion of that time will be spent running the unit tests time and time again.

ROI

What I'm really doing, throughout this process, however, is making a deposit against the need to run the tests multiple times. The more times I run the test, the more payment I get back; it compounds, like interest.

A bunch of different numbers wouldn't mean much here. For a one-off project, or a project whose requirements are truly known up front, perhaps the tax associated with TDD isn't worth it. On most real-world projects, however, requirements will change... a lot. And each time a requirement changes or part of the application is re-written, the automated tests pay interest. I can change, replace, or re-write large chunks of functionality and know with far more confidence that I haven't changed any important behavior.

If you haven't tried paying the TDD tax, you should; you might find you'll later need to collect some of its interest. When you've diligently attempted TDD, spread the word, for many still don't know about it. Even if it doesn't work for your situation, it might work for theirs.

Wednesday, October 7, 2009

Beans, Ping-Pong Balls, and Priorities

Jar filled with ping-pong balls and kidney beans

Not too long ago I found my wife placing ping-pong balls followed by beans into a canning jar. After briefly trying to reason about what she might be doing, I asked her. She explained that she was preparing a lesson about priorities for the children in her church primary class, and then elaborated.

The canning jar represents our day. Each day certain important things need to be completed. These are represented by ping-pong balls with a label summarizing the task. The beans represent other things that occupy our time throughout the day. A bean could represent something inconsequential, such as spending a few minutes watching the news, or something necessary, such as eating a healthful meal.

Daily Organization

How should the day best be organized to accomplish all necessary tasks?

Possibility 1 - tackle the small tasks first.

Each small task is accomplished and each bean is placed in the jar until none remain. Next, the larger tasks are attempted. But, before long, there isn't any more room and it becomes impossible to accomplish all of the larger and more important tasks.

Possibility 2 - tackle the priority tasks first.

As each task is accomplished, the representative ping-pong ball is placed in the jar. Next, the remaining smaller tasks are completed and the beans placed into the jar. These beans flow easily around the ping-pong balls, and the jar is quickly filled.

Priorities & Analogy

Which of our daily tasks are ping-pong balls, and which are beans? The following are on my priority (ping-pong ball) list:

Test-Driven Development (TDD)
Pair Programming
Refactoring
Testing
User Interface Design / User Interaction Design
Self-Improvement

Bean Accrual Patterns

Of course, it’s likely quite obvious that if we fill our day with little inconsequential things, we’ll eventually run out of time to accomplish important, priority tasks. So, why the comparison?

We accrue beans in two different ways. The first is when we choose to work on tasks that aren’t truly important or become distracted with tasks not pertinent to our goals and priority tasks. Good time management skills like those taught in the Pomodoro Technique can aid much in this regard.

The second accrual pattern is not a time management issue; rather, it’s a result of less-than-ideal software or code quality, which I’ll define as not as good as it could have been.

When developing using TDD, tests come first. Of course, I’m not going to purposefully create an API for myself that’s difficult to test. But, when I don’t do TDD, it will inevitably happen. I’ll end up doing something that makes it harder to test and not as good as it could have been. My bean has now been created.

But in reality, it’s likely worse than that. I didn’t create just one bean; rather, I created many beans:

A turgid or fragile design
A hard to test design
A manual testing requirement
Increased cognitive load
Decreased understanding of requirements
(Likely) increased coupling

Without TDD, as soon as I go to test that piece of software, it’s going to take longer because I will have missed something, making it harder to test. As soon as a requirement changes, I’ll need to manually re-test the necessary pieces. When a change request comes through a couple of months down the road, I won’t have good, concise examples of how to use that code, and it will take me longer than it could have.

Yes, there is a tax, of sorts, associated with TDD, but what about the ROI? Unless you’re absolutely sure it’s negative, I recommend it. Oh, and don’t forget, you can’t measure ROI one minute after investing; it’s a long-term investment.
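A small, invented example of the kind of bean described above: two versions of the same 30-day trial check. The classes are hypothetical and not from any project mentioned here; the only library calls used are from java.time. The first version reads the system clock directly, so a test cannot control "now"; the second takes the clock as a parameter, which is the shape a test written first tends to push toward.

```java
import java.time.Clock;
import java.time.Duration;
import java.time.Instant;

// Hypothetical example; neither class comes from the projects discussed in this post.

/** Harder to test: the result depends on when the code happens to run. */
final class HardWiredTrial {
    private final Instant start;

    HardWiredTrial(Instant start) {
        this.start = start;
    }

    boolean expired() {
        return Instant.now().isAfter(start.plus(Duration.ofDays(30)));
    }
}

/** Easier to test: the clock is supplied, so a test can pin the current time. */
final class Trial {
    private final Instant start;
    private final Clock clock;

    Trial(Instant start, Clock clock) {
        this.start = start;
        this.clock = clock;
    }

    boolean expired() {
        return clock.instant().isAfter(start.plus(Duration.ofDays(30)));
    }
}
```

A test can pass Clock.fixed(...) set to day 10 or day 31 of the trial and check both outcomes immediately, while production code passes Clock.systemUTC().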

Other Beans

Throughout the years that I've been programming, I've seen days and even weeks fill with beans:

When I don't do TDD, my code is not as good as it could have been.
When I don't do pair programming, I’ll miss something important or do something dumb.
When I don't adequately refactor, my design degrades.
When I don't sufficiently test, either before, during, or after coding, I'll end up rewriting some aspect of my design.

A few beans don't matter. Nobody will notice... and it's easy to justify just a few beans. The problem comes when you've amassed so many you now have a collection. This is technical debt.

Bean Removal

Technical debt must be paid down before the interest accrues too rapidly. Each day, beans, our daily impediments, must be removed, and we must remove more than we create each day; we must shovel our way out. Unfortunately we can't usually start with a big shovel; we usually need to start with a hand trowel, slowly removing the encrusted dirt and debris. Eventually, we'll be able to fit in one more ping-pong sized, important task. And then two... and then three. Given enough persistence, you can turn that brown field green.

Monday, September 28, 2009

About Kaleb Pederson

Hello. I’m Kaleb Pederson, a professional programmer and software craftsman, though firstly a husband, father, and Christian. This blog, however, is about programming, entrepreneurship, and software craftsmanship with a touch of things pertaining to general life.

My first computer was a used Radio Shack TRS-80. After receiving it as a birthday present, I started getting books from the library and introduced myself to BASIC. My first attempts programming led me to participate in numerous contests in high school that had programming tidbits and questions.

Although my first attempts at learning to program were with BASIC and Pascal, my first real introduction to software development started in 1995 with a relational database programming project using an Alpha 5 database, which included a BASIC variant called X-Basic (IIRC).

In my formal education I started studying engineering and mathematics but diverted to Computer Science, in which I now have a Master of Science degree.

I currently work at Soph-Ware Associates where I’m fortunate enough to have the opportunity to improve my skills as a software craftsman.
