Photo by Michał Parzuchowski on Unsplash
Picture this:
It’s a dark, stormy night. Bright lightning rips the sky every few minutes. In the distance, you see a large pile of code written years ago, most of it forgotten by those who wrote it, and even more of it with no authors in sight. You approach it gingerly, not knowing where to begin. You decide to prod one end, with dread in your heart, not knowing what calamity will befall the team as a result of your boldness.
If that’s not exactly your cup of tea, then picture this instead:
A Jenga tower - each layer precariously balanced on top of another. One hasty move, and the whole tower comes crashing down.
This is exactly what dealing with legacy code can feel like. Let me start by describing what I mean by “legacy” code. I’m referring specifically to
source code inherited from someone else and[/or] source code inherited from an older version of the software. (Source)
A common scenario for such an inheritance is when you (as a company or a developer) are roped in to work on code written by another company or developer and are expected to extend and maintain that codebase. I’d like to discuss two aspects of working with legacy code in this post:
Common problems with legacy codebases
Overcoming these problems in an efficient way that balances delivery with code quality
Code coverage
One common problem I have encountered when working with legacy systems is the lack of tests. There will be few unit tests, if any, and maybe some integration or functional tests, most of them written as an afterthought rather than as real safeguards around the code. Most or all of these tests will only cover happy-path scenarios and will leave out edge cases within the system.
This in itself may not be such a drastic problem. It becomes a problem as the system grows and the developers working on it rotate over time. It gets harder and harder to keep track of how changes affect the system. This also leads to the creation of silos and a low bus factor, with an extremely high dependency on the few people who “know the system well”.
Needless to say, not every problem can be solved by increasing code coverage and adding more tests, but it does help eliminate some of the risks involved. Broader test coverage helps ensure that changes to the system do not break existing functionality. Having more unit tests also means that issues with logic are caught at a lower level, making it easier to identify the offending piece of code.
In an ideal world, any system will conform to the test pyramid: a large number of unit tests, some service tests and fewer UI/functional tests.
Simplified Test Pyramid (Source)
However, with most legacy codebases you might encounter, the test pyramid might look something like this
(Source)
When you first start working with a legacy codebase that looks like the image above, one common pitfall is trying to jump right in and write unit tests for anything and everything. While the intent is extremely noble, it also means that you will be delivering no business value in that time. That is hard to justify to a client who sees no functional value being added to their system.
A more effective approach is to start by writing tests for any piece of code you touch or the new code that you add. This will help lead you to a middle ground called the Cupcake pattern.
Note: The cupcake pattern is seen as an anti-pattern because the same behaviour ends up being tested at multiple levels. However, that applies to greenfield codebases more than to legacy ones. It should definitely be avoided if you’re starting a project from the ground up. In a legacy codebase, it is the much-needed but non-ideal middle ground that helps pave the way to the ideal state.
(Source)
Over time as you get more familiar with the system, you can continue adding tests at all levels and achieve an acceptable test pyramid for your project.
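As a rough sketch of what “writing tests for any piece of code you touch” can look like, the example below pins down the current behaviour of a function before it is changed. Everything in it is an assumption made up for illustration: `legacy_shipping_cost` is a hypothetical stand-in for an inherited function, and the expected values are simply what that stand-in returns today, including one edge case beyond the happy path.

```python
import unittest


def legacy_shipping_cost(weight_kg, express=False):
    """Hypothetical stand-in for an inherited function we are about to touch."""
    cost = 5.0 + 1.5 * weight_kg
    if express:
        cost = cost * 2
    if weight_kg > 20:
        cost += 10  # surcharge found only by reading the code
    return round(cost, 2)


class LegacyShippingCostTest(unittest.TestCase):
    """Records what the code does today, before any change is made."""

    def test_standard_parcel(self):
        # Happy path: base fee plus per-kilogram rate
        self.assertEqual(legacy_shipping_cost(2), 8.0)

    def test_express_doubles_cost(self):
        self.assertEqual(legacy_shipping_cost(2, express=True), 16.0)

    def test_heavy_parcel_surcharge_boundary(self):
        # Edge case: exactly 20 kg has no surcharge, anything above does
        self.assertEqual(legacy_shipping_cost(20), 35.0)
        self.assertEqual(legacy_shipping_cost(20.5), 45.75)
```

Saved as a test module, this can be run with `python -m unittest`. The point is not the arithmetic but the habit: once the existing behaviour is captured, any change you make to the function for your feature either keeps these tests green or fails loudly.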
Outdated Libraries/Technology
I have come across situations where developers are extremely reluctant to upgrade to a newer version of a library because of the breaking changes it introduces, or where a project continues to be written using outdated tools and technologies for fear of breaking the system.
These fears are entirely valid and definitely a huge consideration. One must, however, keep in mind that the costs of using outdated tools and libraries add up and can bite you when you least expect it. Older tools are often no longer supported, and it can be much harder to find answers to questions about them. Your needs also change over time, and at some point outdated tools will no longer fulfil them.
In order to avoid these pitfalls, ensure that you are always at the latest versions of the libraries and tools in your project. Updating libraries frequently also means you probably won’t run into massive upgrade-related changes in one go. Most libraries will not make drastic changes between releases (I’m looking at you, Angular) and an update should be fairly straightforward. Even if you have to make some changes, it’s a better use of your time than ensuring version compatibility across your project because of that one dependency that cannot be upgraded.
The flip side of using outdated tools is using extremely new ones. Tools and libraries that are still in beta or have not had at least one major release run the risk of not being supported on all platforms. Unless your project has a very specific niche requirement, I would recommend staying away from such libraries.
Refactoring
As a developer, I’m often tempted to dive right into a codebase and start rewriting pieces that I think can be written better. When dealing with legacy code, the first step is to read the code and understand it. When understanding some part of the code is extremely painful, you want to save your team members that same pain by refactoring. While your team members will appreciate it, refactoring might be detrimental to the overall state of your project since it does not add any functional or business value. As I said earlier, that is hard to justify to a client who has brought you in to deliver business value. It’s also quite easy to get lost down rabbit holes when refactoring such code. Even if you promised yourself that you would time-box the refactoring, you can overshoot because you don’t want the effort to go to waste.
Despair not, because there is an alternative to having to work your way around code you don’t quite understand. Whenever you get the urge to refactor a certain piece of code, ask yourself these two questions:
Is this code a part of the functional feature that I am working on right now?
Is this code working as expected in its current form?
If the code is not part of the feature you are working on and it works as expected, avoid refactoring it then and there. Just as with code coverage, only refactor code that you’re touching as part of your implementation. Everything else can be added to your project’s Tech Debt wall. Normally, the said wall would look something like this:
(Source)
The wall is a way of documenting the issues in your code, or code that you have inherited. I think it goes without saying that the Tech Debt wall is not a dumping ground for bad design decisions. It should only be used to track existing issues and the team should make a conscious effort to bring down technical debt over the course of the project. One practice that was followed on a few of my projects was to prioritise a few tech tasks every iteration, with approval from business stakeholders/product owners to balance out the functional vs technical value delivered. If you only focus on technical, your stakeholders will be unhappy whereas if you only focus on functional, you keep racking up technical debt and your code keeps getting harder to maintain.
Conclusion
Working on an inherited codebase isn’t always fun or easy, and frankly can be discouraging at times. It could be due to different ideologies of how code should be written, the competence of the people who originally wrote it, or any number of other factors. However, it’s something that most software developers have dealt with or will have to deal with at some point in their careers.
This was my attempt to document practices that have come in handy in my experience of working with inherited code.