Friday, April 30, 2010

The leak that wasn't a bug (i.e. one reason programming is hard)

Bugs are a pain. They eat time, destroy quality, and almost always seem to strike at inopportune times. Even worse is a bug in a third-party library or somebody else's code.

My application needed to monitor a directory for file changes. Unfortunately, Java 6 doesn't yet support directory monitoring, so I started looking around for libraries. I found two good candidates, JNotify and JPoller, and ultimately chose JPoller because it was a pure Java implementation and fairly well documented.

It didn't take long to integrate JPoller into my application. Although it wasn't designed with testability in mind, I expected the tests to go fairly quickly, and they did. Everything looked good. After some refactoring I started to implement my next feature. Something was wrong. My file wasn't picked up. The first and second times, I just assumed I had placed the files in the wrong directory, but I hadn't.

I started looking through the JPoller source. I learned a few things, made some changes, and continued the bug hunt, but to no avail. Although the logic looked perfectly sound, it didn't work right all the time. It was a race condition.

As luck would have it, I had a startling insight. What if the timestamps were wrong?

After quite a bit of investigation, I discovered the insight was correct. Although the file timestamps were reported in milliseconds, they were only accurate to one second. A file created 850 milliseconds into a second could therefore carry a timestamp 850 milliseconds earlier than its actual creation time.
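To make the arithmetic concrete, here is a minimal sketch of how a truncated timestamp can hide a new file from a poller. It is purely illustrative: the class, the method names, and the "newer than the last poll" check are my own assumptions and are not taken from the JPoller source.

```java
// Hypothetical illustration; this is NOT JPoller's actual code.
final class TimestampTruncation {

    // Simulates a file system that stores modification times
    // with one-second granularity.
    static long truncateToSecond(long millis) {
        return (millis / 1000L) * 1000L;
    }

    // A naive "is this file new?" check that trusts the
    // millisecond value it is given.
    static boolean looksNew(long reportedLastModified, long lastPollTime) {
        return reportedLastModified > lastPollTime;
    }

    public static void main(String[] args) {
        long lastPollTime = 10500L;   // the poller last ran at 10.500 s
        long actualCreation = 10850L; // the file was really written at 10.850 s
        long reported = truncateToSecond(actualCreation);

        System.out.println("reported timestamp: " + reported);                 // 10000
        System.out.println("picked up: " + looksNew(reported, lastPollTime));  // false
    }
}
```

The file was written after the last poll, yet its reported timestamp falls before it, so a check like this one skips the file.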

The Law of Leaky Abstractions

I had just been struck by the Law of Leaky Abstractions. On a Windows NTFS file system, timestamps were accurate to 100 nanoseconds, but on Linux they were accurate only to one second. And Windows FAT file systems were accurate only to two seconds (for modification time).
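One rough way to see what a given file system keeps is to write a timestamp with a non-zero millisecond part and read it back. The sketch below does that with `java.io.File`. The class and method names are hypothetical, and the probe only tells you about the file system that holds the temporary directory, which may not be the one you are monitoring.

```java
import java.io.File;
import java.io.IOException;

// Hypothetical helper for illustration only.
final class TimestampResolutionProbe {

    // Milliseconds lost between the timestamp we asked for
    // and the one the file system handed back.
    static long lostMillis(long requested, long readBack) {
        return Math.abs(requested - readBack);
    }

    // Sets a timestamp ending in .123 s on a temp file and reads it back.
    // Returns -1 if the timestamp could not be set.
    static long probe() throws IOException {
        File f = File.createTempFile("ts-probe", ".tmp");
        try {
            long requested = 1272585600123L; // arbitrary instant ending in .123 s
            if (!f.setLastModified(requested)) {
                return -1L;
            }
            return lostMillis(requested, f.lastModified());
        } finally {
            f.delete();
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("milliseconds lost: " + probe());
    }
}
```

A result of 0 suggests the millisecond part survived the round trip. A non-zero result suggests the file system, or the Java runtime on top of it, rounded the value.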

Despite the cross-platform nature of Java, I had to know and understand file system specifics in order to hunt down and kill this bug that wasn't a bug. Instead, it was a leak, and one that had to be carefully worked around to keep the application cross-platform.
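One possible work-around, sketched here under my own assumptions rather than as the fix that actually went into the application, is to widen the comparison by the coarsest granularity you expect to meet. The constant and the names below are hypothetical.

```java
// Hypothetical sketch of a granularity-tolerant comparison.
final class TolerantComparison {

    // Assumed worst-case timestamp granularity in milliseconds.
    // 2000 covers the two-second FAT case.
    static final long GRANULARITY_MILLIS = 2000L;

    // Treats a file as possibly new if its reported timestamp is no more
    // than one granularity step older than the last poll.
    static boolean mayBeNew(long reportedLastModified, long lastPollTime) {
        return reportedLastModified >= lastPollTime - GRANULARITY_MILLIS;
    }

    public static void main(String[] args) {
        long lastPollTime = 10500L;
        long reported = 10000L; // written at 10.850 s, truncated to 10.000 s

        System.out.println("candidate: " + mayBeNew(reported, lastPollTime)); // true
    }
}
```

The trade-off is that a file touched shortly before the last poll can show up as a candidate again, so the caller needs some way to remember what it has already processed.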

Programming is hard. It's never enough to know a single language or a single platform. Sooner or later a leaky abstraction forces you into other realms.

1 comment:

  1. Just wondering if you had any problem getting JPoller to work with Java 6. I found that the "org.sadun.util" class library, needed by JPoller, has a class org.sadun.util.DatabaseResourceBundle with a method clearCache that conflicts with ResourceBundle.clearCache, which was added in Java 6. This resulted in a VerifyError at runtime.
