Tuesday, February 9, 2010

Limitations of Refactoring IDEs

Refactoring Book

Coupled with every strength is a weakness. Within an IDE, the ability to leverage the utility of a mouse is a strength. When trying to automate the selection of the next four characters, no matter where you may be in a file, requiring the same mouse-driven approach becomes a weakness. And so it is with refactoring tools today. The current mouse-driven approach to identifying parameters to refactorings has a number of weaknesses:

  • Other than through GUI automation tools, such as APIs and macros supported by IDEs, starting the refactoring process cannot be automated.
  • It is impossible to apply the same refactoring to many elements at one time. Rather, each element must be selected one at a time and the refactoring applied.
  • It is difficult or impossible to script refactoring operations.
  • Elements being refactored are often identified by line and column position within a file, which usually changes over time.
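To make the last point concrete, here is a minimal sketch in Python of why a line-and-column locator goes stale while a name-based one does not. The helper names `find_by_position` and `find_by_name` are hypothetical and do not come from any IDE. The name lookup is a plain whole-word text search, not a real symbol resolver.

```python
import re

def find_by_position(source, line, column, length):
    """Return the text at a 1-based line/column, the way a cursor selection would."""
    lines = source.splitlines()
    start = column - 1
    return lines[line - 1][start:start + length]

def find_by_name(source, name):
    """Return 1-based (line, column) pairs for whole-word occurrences of name.

    Assumption: a textual match stands in for real symbol resolution.
    """
    word = re.compile(r"\b" + re.escape(name) + r"\b")
    return [
        (line_no, match.start() + 1)
        for line_no, text in enumerate(source.splitlines(), start=1)
        for match in word.finditer(text)
    ]

original = "int cnt = 0;\ncnt += 1;\n"
edited = "// header\n" + original  # someone adds one line at the top

# The saved position (line 1, column 5) pointed at "cnt" in the original file,
# but in the edited file the same position lands inside the new comment.
stale = find_by_position(edited, 1, 5, 3)

# Looking the element up by name still finds both occurrences.
still_found = find_by_name(edited, "cnt")
```

A stored position is only valid for the exact revision of the file it was recorded against, whereas a name can be looked up again after every edit.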

Although perhaps unseen and unrecognized, the weaknesses stated above are prevalent. The mainstay of element identification within existing refactoring tools is the mouse location and current cursor position. Although these serve to identify the elements to which refactorings should be applied, they are neither appropriate nor reasonable for every situation. The use cases detailed below demonstrate an unmet need in the arena of refactoring tools.

Use Case 1 - Vendor Branches

Background

John is working with a vendor's source code and is maintaining a vendor branch. Unfortunately, the vendor's code has some unpatched bugs and often uses poorly named identifiers, which make it hard to work with.

Problem

John has a set of patches which he applies using the standard vendor branch approach to patching. He has two sets of patches. The first set of patches applies the Rename refactoring to many different elements in order to make the code easier to work with. The second set of patches fixes currently unpatched bugs.

Following the vendor branch patching process, after each vendor update John loads the vendor's source code into the source control system and then applies the first and second set of patches. Unfortunately, the first set of patches often fails to compile because any new references to the variables being renamed are not included in the patch. Each new reference must be manually renamed and the patch updated. Once the first patch has been successfully applied, the second bug-fixing patch usually applies successfully.

Although this approach accomplishes what it needs to, it is less than ideal. With each new vendor release the Rename refactoring must be re-applied to the source code for every newly introduced reference to the variables being renamed.
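What John wants is a rename he can replay by name rather than a patch tied to specific lines. As a rough illustration only, the Python sketch below keeps the renames in a table and reapplies them to whatever source the vendor ships, so new references are picked up too. The table contents and the function `apply_renames` are hypothetical, and the matching is purely textual: unlike a real Rename refactoring, it knows nothing about scope, strings, or comments.

```python
import re

# Hypothetical rename table: the vendor's identifier -> John's preferred name.
RENAMES = {
    "tmp1": "retry_count",
    "buf": "input_buffer",
}

def apply_renames(source, renames):
    """Replace whole-word occurrences of each old identifier with its new name.

    Assumption: a whole-word textual match is good enough for illustration.
    A real refactoring tool would resolve each identifier to its declaration.
    """
    if not renames:
        return source
    # Try longer names first so one name that is a prefix of another cannot win.
    alternatives = sorted(renames, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(n) for n in alternatives) + r")\b")
    return pattern.sub(lambda match: renames[match.group(1)], source)

# A new vendor release adds a second reference to tmp1; no patch update is needed.
new_release = "int tmp1 = 0;\ntmp1 += read(buf);\nint tmp10 = tmp1;\n"
renamed = apply_renames(new_release, RENAMES)
```

Because the table names elements instead of positions, it can be rerun unchanged after every vendor drop, which is exactly what a line-based patch cannot do.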

Use Case 2 - Universal Language

Background

Beth is involved in an Open Source project whose primary developers are French. The source code is written and commented entirely in French, making it hard for Beth, who does not speak French. Although the developers have decided to change to English as their universal language, they have not yet made the change.

Beth worked with the developers to understand the meaning of each variable but needs to work with the French code for quite a while before English becomes the universal language.

Problem

Although Beth could maintain a separate English branch of the source code, that would require that all code be modified or committed twice, once for English and once for French. She could convert the source code over once within a separate branch, but each new reference to an existing variable would need to be changed. No good solution exists.

Use Case 3 - In Concrete

I would like to perform the following refactorings, but doing so through a GUI will be painful:

  • Rename all my *ElementName classes in a given namespace to *Node where the asterisk represents some wildcard.
  • Rename all get*Instance member functions to instance for every class within the DataNode namespace.

Perhaps the above seem meaningless, but I've had to do almost exactly those things. After a while, I discovered that some of the class names and methods that I was using weren't descriptive enough. I had to rename over 50 classes and even more files.
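To show what I mean by a wildcard rename, here is a small Python sketch that turns a pattern such as *ElementName into a plan of old-name to new-name pairs. Everything here is hypothetical: `wildcard_rename`, `plan_renames`, and the sample class names are made up for illustration, and the sketch only computes the new names. Actually applying them safely across a code base is the job of a real refactoring engine.

```python
import re

def wildcard_rename(name, pattern, replacement):
    """Return the new name if name matches pattern, otherwise None.

    pattern must contain exactly one '*', which captures any run of characters.
    Any '*' in replacement is filled with the captured text.
    """
    if pattern.count("*") != 1:
        raise ValueError("pattern must contain exactly one '*'")
    prefix, _, suffix = pattern.partition("*")
    regex = re.compile(re.escape(prefix) + "(.*)" + re.escape(suffix))
    match = regex.fullmatch(name)
    if match is None:
        return None
    return replacement.replace("*", match.group(1))

def plan_renames(names, pattern, replacement):
    """Map every matching name to its new name, skipping names that do not match."""
    plan = {}
    for name in names:
        new_name = wildcard_rename(name, pattern, replacement)
        if new_name is not None:
            plan[name] = new_name
    return plan

# Hypothetical class and method names, standing in for the two bullets above.
classes = ["TableElementName", "RowElementName", "Parser"]
class_plan = plan_renames(classes, "*ElementName", "*Node")

methods = ["getTableInstance", "getRowInstance", "parse"]
method_plan = plan_renames(methods, "get*Instance", "instance")
```

With the sample names, `class_plan` maps TableElementName to TableNode and RowElementName to RowNode, and `method_plan` maps both get*Instance methods to instance. Producing the plan separately from applying it also gives me a chance to review 50 renames before anything is touched.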

Conclusion

Current refactoring tools are wonderful... for what they were intended for. They make the process of writing code, and cleaning up that code while you're working on it, far easier and more likely to happen than if the process were manual. But the GUI itself is also a weakness, one for which we need an alternative. We need a way to handle these and other use cases. We need a way to back-port refactorings, where possible, without forcing the programmer to take manual steps. We need wildcards and sed-like functionality in our refactoring tools.

Much of the above content comes directly from my thesis. I believe there's a solution and future posts will discuss pieces of that solution. I don't believe my solution is complete or perfect, but I hope to further the work in this area.

Image courtesy seizethedave on Flickr
