Dealing with Legacy Code Part 1: Get started

What is this all about?

At the moment I have to deal with legacy code during my day-job. And I got the idea that, maybe, some other people are interested how I solved all the different issues as well. So I decided to start a serie of blogposts about the different kind of challenges I faced and how I dealt with them.

Oh my fucking god, what a mess …

Maybe you know the following situation as well: your boss told you to take care of this suspicious project no one in the company wants to touch. When people hear about your new task they look like you told them that you gonne die a painful dead at the weekend. So WFT!
You try to find the repo and there is no. There is only a folder containing some binaries and a zip-file containing the source. Or there is a repo with only one version checked in some weeks ago. Or there are different named folders with version names in it containing the source.  WFT, again!
You will try to open the project and you don’t have a clue how to do this. No obvious entry points how to start. WTF, WTF, WTF!
You want to build the project and you cannot find any makefiles at all. The you find the makefile but not documentation how to run the build. Or you are able to runf the build but the 3-party-libs are not there.
After a hard week you are going into your weekend. You think about quitting the job and try to get a shepherd in Australia. But stop! Next week we will start to get out of this mess.

Getting back control: try to build it

The first thing what I try to understand is to get an overview about the project. Often I saw often used peaces of code next to dead ends, commented code or just experiments, which are not in use any more. The makefile / build description shows you which parts are currently in use for the production code. In the companies I worked people are using one kind of build system in a bundle of projects. If there are still colleages left which dealted with this project you can ask them which how this worked. For instance if a GNU-make was in use you can grep ( see How to use grep on linux or How to use findstr on Windows ) to look for Makefile. Visual Studio project files are using the sln-extension. Grep for it and try to run the build.
Copy the build output and look for errors. Each build error provides information about the struture of your project:

  • Dependencies are missing:
    • The project is using the following dependencies
    • Try to find it and intregate is as well
  • The project is looking for different includes ( in c++ ), modules ( python ) or packackes ( Java / C# ) and the build cannot find them:
    • The project is using an environment variable to locate different packages
    • Identify the packages and install them
    • Modify your environment until it works
    • If this creates issues: try to run your build in verbose mode, this helps a lot to analyze what is going wrong
  • The source does not build, syntax errors all over the place:
    • Which version of the language was in use? Any extensions? For instance if you are using Cuda-Code in c++ you need to install a special compiler for that.
    • Try different compilers, some code which build fine with VS-2010 does not build when using g++ / clangs
    • Are the syntax errors caused by OS-specific defines? Identify them
    • Try to find them,  Stack-Overflow can help you
  • Linking / assembling fails:
    • Try to answer:
      • Are you sure that you have run all makefiles?
      • Do you need to copy stuff manually?

Still does not work…

.. but you learnd some stuff about your project. Hopefully you have some better understanding.

Getting starting with a Legacy-Code-Project

Day zero

Imagine the following situation: you are starting a new job, you are looking forward to your bright future. Of course you are planning to use the newest technologies and frameworks. And then you are allowed to take a first look into the source you have to work with. No tests, no spec, which fit to the source, of course no doc but a lot of  ( angry ) customers, which are strongly coupled to this mess. And now you are allowed to work with this kind of … whatever.

We call this Legacy-Code and I guess this situation is a common one, every developer or most of them will face a situation like this during his/her career. So what can we do to get out of this? I want to show you some base-techniques which will help you.

Accept the fact: this is the code to work on and it currently solves real problems!

No developer is planning to create legacy code. There is always a reason like: we needed to get in buisness or we had failed. Or the old developers had not enough resources to solve all upcoming issues or develop automatic tests. 10 years ago I faced this situation again and again: nobody want to write automatic tests because it costs a lot of time and you need some experience how to design your architecture in a way that its testable. And there were not so much tools out in those days.

The code is there for a reason and you need to accept this: this working legacy code ensures that you got the job. So even when its hard try to be polite when reading the code. Someone invested a lot of lifetime to keeps it up and running. And hopefully this guy is still in the company and you can ask him some questions.

You can kill him later ;-).

Check, if there is any source-control-management

The first thing you should check is the existence of an Source-Control-Management-Tool like Subversion, Git or Perforce. If not: get one, learn how to use it and put all your legacy code into source control! Do it now, do not discuss. If any of the other developers are concerned about using one install a SCM-tool on you own developer-pc and use it there. I promise: it will save your life some day. One college accidentally killed  his project-files after 6 weeks of work. He forgot the right name of his backup-folder and removed the false on, the one containing the current source. He tried to save disk-space, even in those old day disk-space was much cheaper than manpower.

To avoid errors like this: use a SCM-tool.

Check-in all your files!

Now you have a working SCM-tool check if all source-files, scripts and Makefiles are checked-in. If not: start doing this. The target of this task is just to get a reproducible build for you. Work on this until you are able to build from scratch after checking out your product. And when this works write a small KickStarter-Doc how to build everything from scratch after a clean checkout. Of course this will not work in the beginning. Of course you will face a lot of issues like a broken build, wrong paths or a different environment. But this is also a sign of legacy-code: not reproducible builds. Normally not all related files like special Makefiles are checked in. Or sometimes the environment differs between the different developer-PCs. And this causes a lot of hard to reproduce issues.

Do you know the phrase: “It worked on my machine?” after facing a new bug. Sometimes the developer was right. The issue was caused by different environments between the developer machines ( for instance different compiler version, different IDE, different kernel, different whatever … ).

When you have checkin all you files try to ensure that everyone is using the same tools: same compiler version, same libs, same IDE and document this in your KickStarter-Doc. Let’s try other guy’s to work with this and fix all upcoming issues.

This can slow down the ongoing development tasks. To avoid this you can learn how to work with branches with your SCM-tool ( for instance this doc shows how to do branches in git: ).