As developers, we often have to work on legacy applications that are poorly written or architected, or that have simply aged without proper maintenance.
Many of these applications are critical to the business, and sometimes the entire business of the organisation depends on that one application. In either case we need to move forward and cannot let things fester. Not updating the application can have significant impacts, such as:
- Security risks, as the programming language and framework versions in use may no longer be supported.
- Performance issues, as the application can no longer handle the increased traffic load and the growth in database size.
- Data consistency issues, as the application and database architecture no longer capture the business requirements exactly, because the goalposts have moved.
- Application errors and frequent bugs, because the accumulated layers of code make the application harder to understand and manage.
- Database timeouts, deadlocks, race conditions.
- Large memory footprint, CPU spikes, etc.
- Integrating new packages or libraries which require a newer version of the language and/or framework can be challenging.
- Decreased developer productivity, as developers cannot use the improved paradigms of the language and/or framework.
- Harder onboarding of new developers, because complexity and/or archaic code make the application too hard to understand.
Don’t blame yourself if you find yourself in this situation. Many top, well-maintained open source projects have been here and had to reinvent themselves. Mozilla has had to do this with Firefox and other projects it manages, OpenSSL is being reinvented as LibreSSL, GNOME has done it a few times, and there are others. The bottom line is “change is the only constant”.
In this article we will talk about the common ailments affecting such applications, and the short-term as well as long-term strategies that can be used to modernize them. So, let us enumerate the common pitfalls that make legacy codebases hard to work with:
- Use of old syntax, or now deprecated language features.
- Use of old libraries which are no longer being maintained.
- Lack of test coverage, or tests that exist but are not comprehensive or do not cover the entire stack, especially the frontend.
- Database de-normalizations and poor database hygiene.
- Layers upon layers of code superimposed on each other, with numerous control-flow (`if`) statements.
- Repeated code patterns where the copies have hard-to-spot differences or edge cases.
- Gigantic application size in terms of lines of code.
- Messy class hierarchies, huge classes, and methods that are hundreds of lines long.
- Code smells and anti-patterns.
Most businesses operate under tight deadlines and fiscal constraints, and the product team gives more importance to feature development and functionality improvements than to refactoring or cleaning up technical debt. So it is very important for the engineering leadership of the company to underscore the importance of continuous refactoring and of taking care of technical debt. Just as a product and its functionality need to be continuously tinkered with and improved, the application code needs continuous overhauling to stay in good shape. So, how can one go about fixing this?
One simple strategy is to completely rewrite the application. You are starting afresh, so you can pick a modern stack and follow newer and better practices, and your work doesn’t impact the existing application. The users are simply expected to switch from the old application to the new one. This strategy works if your application is small or only moderately complex, but for any reasonably large or complex application it can be a mammoth undertaking. A modern stack doesn’t mean that the developers are fully familiar with it: they may have to learn and unlearn many things, and mistakes can be made. Replicating all the functionality of the existing application, with its nuances, requires a strong and complete understanding of the old application, and that knowledge may be hard to acquire. The opportunity cost may also be large for the product and business teams, or two engineering teams may be required: one that continues to maintain the old application and another that builds the new one. This strategy can be high risk and requires a detailed cost-benefit analysis.
An alternative strategy is progressive, gradual refactoring. This is usually better suited to large and complex projects: it provides incremental benefits and hedges the risk. We are going to focus on this next and discuss various approaches to it.
- Selective rewriting – There may be parts of the application that are good candidates to be moved to an external process or a microservice. For instance, image processing can be moved to an AWS Lambda or Google Cloud Functions microservice using the Serverless framework. The same goes for HTML to docx and PDF conversion, Elasticsearch / Solr indexing, report generation, etc. This is application specific, but large applications usually have such resource-intensive components, which can be turned into external modern services and hence reduce the footprint of the legacy application.
- Integrate Code Syntax Checkers and Static Analysis tools – Integrate and configure the application to make use of tools like RuboCop, ESLint, Reek, etc. They should be well integrated with your code review process and CI pipeline, and every Pull Request (PR) should be checked to make sure it passes the code quality checks. It is tempting to run these tools and try to fix all the issues discovered in the legacy application in one go, but the number of changes required can be very large, making the PR unwieldy, hard to review and risky to merge. Fixing everything at once can also take a lot of time. It is better to create separate tasks for each of the issues discovered, and fix them gradually over a period of time.
- Clean Up the Database Schema – Ensure that columns which should be not null are flagged as such. Remove old and unused columns and tables. Try to remove `*_cache` columns, which can be hard to maintain but provide little benefit. Usually, with some clever programming, the cache columns can be removed with no performance drawbacks, and with benefits such as avoiding multi-table locks and writes. Similarly, try to remove other unnecessary database de-normalizations. It is not for nothing that premature optimization is looked down upon. Removing database de-normalizations and other unnecessary convolutions can simplify the application’s database architecture and enable a lot of code cleanup.
- Remove Dead Code – Remove commented-out or unused code. If it hasn’t been used for years, it is unlikely that anyone will use it in the future. In any case it can always be retrieved from version control.
- Track Everything in Refactor Stories – Every change that needs to be made should be recorded as a story in the issue tracker. This helps the team prioritize and estimate the work, and decide what should be done and when. It also gives the product and business teams clear visibility into the process, and helps the engineering team negotiate sprint story points for technical cleanup work.
- Periodic Independent Review – It is always nice to have a fresh pair of eyes going over your application’s code and architecture. Teams tend to indulge in groupthink, so having an outsider’s opinion and review is always beneficial, and good for long term health.
- Slow and Steady Improvements – The team should continuously strive to improve the quality of the codebase and application in every change they make. Every PR should increase test coverage and reduce code complexity. A few story points can be dedicated in every sprint to technical debt issues.
- Refactor to DRY code patterns – An analysis should be done to find different parts of the application which have similar structures. Then, once a thorough understanding of those areas has been developed, the common patterns should be extracted into a shared module so that they can be easily reused and extended. These duplications can be in either backend or frontend code. Both should be reduced with equal zeal, with the objective of reducing the number of lines of code and simplifying the application.
- Upgrading the Stack – A precursor to this is having a good test suite, or building sufficient QA muscle by writing test cases and having team members who are well experienced with the various functionalities of the application. This can then be followed by removing all the deprecation warnings, and then stepping through the intermediate releases to upgrade the app while ensuring compatibility at each step.
- Patterns to Refactor – There are Rails/web application design patterns like Service Classes, POROs (plain old Ruby objects), Form Objects, Decorators/Presenters/Serializers, etc. which can be used to simplify the application architecture.
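As a rough illustration of the “Refactor to DRY code patterns” point, the sketch below assumes two hypothetical report classes that each used to carry their own near-identical copy of an export routine. All the names here (`TabularExport`, `InvoiceReport`, `CustomerReport`) are invented for the example. The shared behaviour moves into one module, and only the genuine differences stay in each class.

```ruby
# Hypothetical shared module, extracted from two report classes that each
# had their own copy of this formatting logic.
module TabularExport
  # Renders the including object's `headers` and `rows` as tab-separated text.
  def to_tsv
    ([headers] + rows).map { |cells| cells.join("\t") }.join("\n")
  end
end

class InvoiceReport
  include TabularExport

  def initialize(invoices)
    @invoices = invoices
  end

  def headers
    %w[number total]
  end

  # The one real difference between the old copies stays in the class:
  # invoice totals are formatted with two decimal places.
  def rows
    @invoices.map { |inv| [inv[:number], format("%.2f", inv[:total])] }
  end
end

class CustomerReport
  include TabularExport

  def initialize(customers)
    @customers = customers
  end

  def headers
    %w[name email]
  end

  def rows
    @customers.map { |c| [c[:name], c[:email]] }
  end
end

puts InvoiceReport.new([{ number: "INV-1", total: 120 }]).to_tsv
```

Keeping the edge cases (such as the decimal formatting) visible in the individual classes is deliberate: the hard-to-spot differences between copies are exactly what tends to get lost when duplicated code is merged too eagerly.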
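To make the “Patterns to Refactor” point more concrete, here is a minimal sketch of a service class written as a PORO. It assumes a hypothetical sign-up flow whose validation, persistence and welcome email used to live inline in a controller action. `RegisterUser`, `InMemoryUsers` and `RecordingMailer` are illustrative names, not part of Rails or any library, and in a real application the repository and mailer would be your model and mailer classes.

```ruby
# Hypothetical service class (a PORO): it has no framework dependency and
# receives its collaborators, so it can be tested in isolation.
class RegisterUser
  Result = Struct.new(:success, :user, :errors) do
    def success?
      success
    end
  end

  def initialize(repository:, mailer:)
    @repository = repository
    @mailer = mailer
  end

  # Validates the input, persists the user, and sends the welcome email.
  # Returns a Result instead of raising, so the caller decides what to render.
  def call(email:, name:)
    errors = []
    errors << "email is invalid" unless email.to_s.include?("@")
    errors << "name is blank" if name.to_s.strip.empty?
    return Result.new(false, nil, errors) unless errors.empty?

    user = @repository.create(email: email, name: name)
    @mailer.welcome(user)
    Result.new(true, user, [])
  end
end

# In-memory stand-ins for the model and the mailer, for illustration only.
class InMemoryUsers
  attr_reader :all

  def initialize
    @all = []
  end

  def create(attrs)
    @all << attrs
    attrs
  end
end

class RecordingMailer
  attr_reader :sent

  def initialize
    @sent = []
  end

  def welcome(user)
    @sent << user[:email]
  end
end

users = InMemoryUsers.new
mailer = RecordingMailer.new
service = RegisterUser.new(repository: users, mailer: mailer)
result = service.call(email: "dev@example.com", name: "Dev")
puts result.success?
```

With this shape, the controller action shrinks to building the service, calling it, and branching on `result.success?`, while the business rule lives in one small, testable object.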
Refactoring is an important part of the application development life cycle, and I have tried to capture here some of the important points to keep in mind while refactoring a legacy application. Please feel free to share what you think in the comments below.
Surendra Singhi is a technical Consultant cum Coach who helps organizations improve their Software Development practices to deliver great products and services by refining their processes, workflows and technical strategies. He works for Kreeti Technologies, and has mentored several development teams to better realize their potentials and deliver increased value to the stakeholders.