Let’s start with the obvious question: why is this a hard problem?
You have some old code; you want to move that logic to some new code.
Sounds easy, right?
Well, it’s hard. Very very hard.
In general, if you can fix up the existing code and make it work, do that.
A complete re-write is usually the wrong choice.
If by this point I’ve talked you out of migrating your codebase, great. If not, read on.
In this post, I will talk about two types of migration:
There’s no real course for this type of project, so getting good at migrating code is mostly a matter of learning from experience: yours and others’.
There are different opinions as to how to define “Legacy Code”, but my favorite is:
Legacy code is any code in production that’s not covered by good tests
Being covered by well-written tests allows you to have confidence that the code is doing something it should be doing and with purpose.
Without these tests, you can tell what the code is doing, but not whether that is the correct thing to do. It may simply be behaviour that was functionally wrong when written and has since come to be treated as “right”.
Migrating a codebase can be a complex process. Especially if your legacy code is bad.
Before you start any migration, it’s key to know what the expected outcomes and timelines are.
Big Bang code releases don’t work. That’s worth repeating. Big bang code releases “DON’T WORK”!
Plan small and incremental releases with your new code running side by side with the old code.
Speak with the product owner. Find out the way in which the product “should” work, rather than how it “does” work. There will be differences there, I promise. Decide if you are fixing these differences during this migration project, or if your migration project is a like-for-like replacement of what is already there.
You, by yourself, sitting in front of your laptop scouring for hours through code won’t be able to make these decisions. The Product department is there for a good reason.
Please don’t switch off at this point. 🙂
Automated Testing is vital to the success of any migration project.
To validate that your new code does what the old code did, and to gain confidence in it, you need to be able to quickly and repeatedly compare the output of the existing code with the output of the new code.
What do I mean by that?
Imagine your service is simply taking a User record from a BUS and saving it to a database.
You want to be able to run the legacy code, find the database record it creates, then run the migrated code and compare the database record it creates. This is the only way to validate your code.
This is especially important on a large codebase.
Generally, code migrations require automated integration testing, not just service-level tests.
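To make the User record example concrete, here is a minimal sketch of such a comparison test. It assumes both code paths write to a `users` table, and it uses an in-memory SQLite database in place of your real one. The functions `legacy_save_user` and `migrated_save_user` are hypothetical stand-ins for your own legacy and migrated entry points.

```python
import sqlite3


def make_db():
    """Create a fresh in-memory database with a users table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
    )
    return conn


def legacy_save_user(conn, message):
    # Hypothetical stand-in for the legacy code path.
    conn.execute(
        "INSERT INTO users (id, name, email) VALUES (?, ?, ?)",
        (message["id"], message["name"].strip(), message["email"].lower()),
    )
    conn.commit()


def migrated_save_user(conn, message):
    # Hypothetical stand-in for the migrated code path.
    row = (message["id"], message["name"].strip(), message["email"].lower())
    conn.execute("INSERT INTO users (id, name, email) VALUES (?, ?, ?)", row)
    conn.commit()


def fetch_user(conn, user_id):
    """Read back the stored record so the two stacks can be compared."""
    return conn.execute(
        "SELECT id, name, email FROM users WHERE id = ?", (user_id,)
    ).fetchone()


def records_match(message):
    """Run the same message through both stacks and compare the records."""
    legacy_db, migrated_db = make_db(), make_db()
    legacy_save_user(legacy_db, message)
    migrated_save_user(migrated_db, message)
    user_id = message["id"]
    return fetch_user(legacy_db, user_id) == fetch_user(migrated_db, user_id)


message = {"id": 1, "name": " Ada ", "email": "Ada@Example.com"}
print(records_match(message))
```

In a real project the two functions would call the actual services, and you would feed in many recorded messages rather than one hand-written example.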
Some other things to consider:
Are you migrating your code so that it can run on a new operating system?
If you are porting like for like, do you need to carry across the full commit history? What is your plan for the legacy code repository once you are fully migrated? Will the commit history die with the code?
Do you need to be backward compatible with things like error messages?
Are you using the same repository or creating a new code repository for the new code base?
Are you trying to introduce any new features?
If you plan correctly and break the work down into small, deliverable, vertical chunks, delivering is still not easy, but it does not have to be stressful.
Why is it not stressful?
Well by now, you should have:
If you have both of those things, you should not be stressed about whether you have successfully delivered a like-for-like replacement.
I’m not saying it will be easy to get the output of the new code to match the output of the old code, but once you have it matching, your tests will tell you the job is done.
It’s now an iterative process of doing this same thing for every user story you have.
The great benefit of having the user stories and the integration tests is that new developers (as long as they are familiar with software development) can join the project and get going quite quickly.
Migrating a large codebase is a slow process to do well.
In order to judge progress and ensure that things are on track, you need strong project management skills.
The Project Manager needs to be able to set clear milestones (hopefully through the use of well-defined User Stories) and track the burndown of work already done, to predict how much work there is left.
You need to ensure that you are able to constantly deliver to production as you go, rather than wait until the entire code base is moved and released in one big bang release.
Without doing this, you are not getting feedback from your new code in production.
It is easier to release a small portion of code to production and fix a couple of bugs than to release two years’ worth of work in one go and be inundated with problems. That may seem pessimistic given the focus on testing that I’ve given, but it’s realistic. We are but human.
Do small iterations of work, get them signed off by QA and Prod, get them to production, and move on with the next small iteration of work.
Any code that has been around for long enough, and has been worked on by enough different software developers, will contain bad code.
It’s a fact of life.
The code will probably have tight coupling with middleware and databases.
Removing this coupling in the legacy code is the first step to migration. Ensure database code is split from back-end logic. Make sure that config files are separated into specific areas of the code. You get the idea.
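As a rough illustration of splitting database code from back-end logic, the sketch below puts storage behind a small interface so the business rule can be tested without a database. The names `User`, `UserRepository`, `InMemoryUserRepository` and `register_user`, and the email rule itself, are all hypothetical and only there to show the shape.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol


@dataclass(frozen=True)
class User:
    id: int
    email: str


class UserRepository(Protocol):
    """The only thing the business logic knows about storage."""

    def save(self, user: User) -> None: ...

    def find(self, user_id: int) -> Optional[User]: ...


class InMemoryUserRepository:
    """Test double; a real implementation would wrap your database."""

    def __init__(self) -> None:
        self._rows: Dict[int, User] = {}

    def save(self, user: User) -> None:
        self._rows[user.id] = user

    def find(self, user_id: int) -> Optional[User]:
        return self._rows.get(user_id)


def register_user(repo: UserRepository, user_id: int, email: str) -> User:
    """Business logic: normalise and validate, then hand off to storage."""
    normalised = email.strip().lower()
    if "@" not in normalised:
        raise ValueError(f"invalid email: {email!r}")
    user = User(id=user_id, email=normalised)
    repo.save(user)
    return user


repo = InMemoryUserRepository()
print(register_user(repo, 1, " Ada@Example.com "))
```

Once the legacy logic only talks to an interface like this, the same logic can be pointed at a new database, or moved to the new codebase, without dragging the SQL along with it.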
You will come across times when you see that the code does something that you think is wrong.
Customers will have been using this for years and what was introduced as a bug is now being used as a feature.
You need to talk with Product now.
Do you continue to support this “bug/feature” in the new codebase, or should you reach out to your clients and explain that things may be changing as you’re moving the codebase and now that bug is no longer going to be there for them to take advantage of?
Follow the data: look through your logs to see how often this flow is used, and base your decision on that.
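Counting how often a flow appears in the logs can be a few lines of script. The sketch below assumes each log line carries a `flow=<name>` token. That format, the flow names and the sample lines are invented for illustration, so adapt the pattern to whatever your logs actually contain.

```python
import re
from collections import Counter

# Assumed log format: each relevant line contains "flow=<name>".
FLOW_PATTERN = re.compile(r"\bflow=(\w+)")


def count_flows(log_lines):
    """Count how many times each flow name appears in the given lines."""
    counts = Counter()
    for line in log_lines:
        match = FLOW_PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Invented sample lines; in practice, iterate over your real log files.
sample_lines = [
    "2024-01-01T10:00:00 INFO flow=checkout client=acme",
    "2024-01-01T10:00:05 INFO flow=legacy_discount client=acme",
    "2024-01-01T10:01:00 INFO flow=checkout client=globex",
    "2024-01-01T10:02:00 WARN connection reset",
]

for flow, count in count_flows(sample_lines).most_common():
    print(flow, count)
```

A flow that shows up a handful of times a year is a very different conversation with Product than one that runs thousands of times a day.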
As you start to migrate the code, you will learn it in a whole lot more depth and uncover things you would never have dreamed possible when embarking upon this migration mission.
Scope can creep. Project timelines can grow.
I can guarantee, from a lot of personal experience, that your estimate for migrating a codebase of any decent size will be too low.
Estimate high and hope to deliver early.
When migrating code, you need to be able to turn clients on and off for your new stack.
You need to be able to fall back to the old code should an issue arise in your new code.
Beyond that though, you need to ensure that if something was processed through your new code, you can either revert that and push it to your legacy stack, or you can recover via your legacy stack somehow.
Let me give you an example.
Let’s say that you are migrating an e-commerce purchasing application.
Someone wants to buy an iPad. The sale goes through your new code. Your new code has a bug where it is creating two purchase records for every one sale. This means that each client will be charged double what they wanted to spend and get a second iPad. I doubt that’s an acceptable outcome.
In this situation, you need to be able to move back to running on your legacy code and clean up the bad records in the database.
Using feature switches to toggle between the new and old code is a good way to go.
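A per-client feature switch with a global fallback might look something like the sketch below. `MigrationSwitch`, `legacy_purchase` and `new_purchase` are hypothetical names. A real system would keep the switch state in shared configuration rather than in memory, so that every running instance sees the same setting.

```python
class MigrationSwitch:
    """Decides, per client, whether a request goes to the new stack."""

    def __init__(self):
        self._enabled_clients = set()
        # Global fallback: when True, everyone goes back to legacy.
        self.kill_switch = False

    def enable(self, client_id):
        self._enabled_clients.add(client_id)

    def disable(self, client_id):
        self._enabled_clients.discard(client_id)

    def use_new_stack(self, client_id):
        return not self.kill_switch and client_id in self._enabled_clients


def legacy_purchase(order):
    # Hypothetical stand-in for the legacy purchase flow.
    return {"handled_by": "legacy", "order": order}


def new_purchase(order):
    # Hypothetical stand-in for the migrated purchase flow.
    return {"handled_by": "new", "order": order}


def process_purchase(switch, client_id, order):
    """Route the purchase to the new or legacy code based on the switch."""
    if switch.use_new_stack(client_id):
        return new_purchase(order)
    return legacy_purchase(order)


switch = MigrationSwitch()
switch.enable("acme")
print(process_purchase(switch, "acme", {"item": "iPad"})["handled_by"])
print(process_purchase(switch, "globex", {"item": "iPad"})["handled_by"])

# A bug is found in the new code: send everyone back to legacy.
switch.kill_switch = True
print(process_purchase(switch, "acme", {"item": "iPad"})["handled_by"])
```

Note that the switch only controls routing. Cleaning up records that the buggy new code already wrote, such as the duplicate purchases in the iPad example, still needs its own plan.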
I’ll quickly go over some good practices (some that I’ve already covered above):
Software codebase migrations are hard. They are complex, moving targets that are littered with potential pitfalls.
You need clear goals and targets. You need to break down the work into user workflows and carefully track your progress.
To migrate successfully, you need to test, test, test. Projects succeed or fail based on their testing.
If you follow the guidance I’ve written here, you should be able to navigate the complex world of Software and migrate successfully.