How to Get Rid of Technical Debt or What We Would Have Done Differently 2 Years ago
We talked about cleaning legacy code with Rector 2 months ago on 40th meetup of PHP friends in Prague.
Who is we? Me and CTO of the company I worked for, a great leader and technical expert who taught me a lot, Milan Mimra.
The talk was not full of shallow tips, nor about framework migration. Instead, we talked about small decisions that were made 2 years. Decisions, which took 3 months to get rid of.
For all the rest of you who speak English better or don't want to copy-paste code instead of watching the video, I'm writing this post. So you can learn from our pain and make your brand new mistakes :)
Does Doesn't Matter
This talk was about my work as cleaning lady in a project, that wanted to improve their PHP codebase across the whole application. Any PHP projects I recently consulted would benefit from these changes, so keep reading even if you already run on the latest version of your favorite framework.
Instant upgrades effectivity doesn't depend on project size, but people still want size numbers, so there you go:
- Symfony 4.* backend for mobile application
- 253 endpoints = routes
- 105 000 lines of code (bigger than Symfony with ~90 000 lines)
From all the work we've done, we've picked 10 most important changes that brought code quality to another level.
1. Advance Your Coding Standard
When people say "we use coding standard", it usually means "we use only PSR-2 but nothing more".
That's like saying "we use Symfony" when you're using only controllers.
Why You Should use It?
- the best code improvements in exchange for massively improved code - for 10 minutes, life-time code check
- it's very easy to start using it if you don't use it yet
- further refactoring with Rector or analysis with PHPStorm or PHPStan is more robust
It's better to start slow, so the first things we did was:
- moving from string
- coding standard enforced line-ending - not just sniff, but a fixer
- finalized (almost) everything
- accidental fatal error check of an empty array
How to Apply?
Just run ECS with following set:
# ecs.yaml services: # use ::class Symplify\CodingStandard\Fixer\Php\ClassStringToClassConstantFixer: ~ # keep line-lenght constant Symplify\CodingStandard\Fixer\LineLength\LineLengthFixer: lineLength: 120 # default value for array property Symplify\CodingStandard\Fixer\Property\ArrayPropertyDefaultValueFixer: ~ # every non-Entity non-abstract class must be final - you have to skip those that are used + have children, like this ↓ PhpCsFixer\Fixer\ClassNotation\FinalClassFixer: ~
And the result?
2. Single Class = Single Place
Have you heard about PSR-4? Well, so did many people before you, but it's very to forget it thanks to:
- PHPUnit magic autoload - you can place your tests anywhere, PHPUnit will find them
- Nette\RobotLoader - load anything from anywhere, for free
- composer "classmap" - just load the directory, who cares
- DDD - Domain Driven Design - apparently, one of the most favorite principles is to put all files in one directory, templates, configs, services, objects... just put it there
This all helps a lot to messy applications. As basic as this rule seems, I must sadly say that lack of PSR-4 is one of the biggest problems of legacy applications.
Forget Service Autodiscovery, Use Ze Handz!
To add more salt into the wound, now imagine you want to use some sort of services autoregistration (and you should, unless you want to kill your project slowly). That means you tell your dependency injection container "load these services from this directory".
One of such implementaions is PSR-4 autodiscovery in Symfony:
Such a simple yet powerful idea... which all the Symfony applications I've seen so far struggle. There Symfony 4 and 5 projects, where programmers still have to write each service manually.
I wonder, isn't there something better to do? Like writing unique PHP code that cannot be automated?
So that's what we did:
How to Apply?
This one requires lof ot manual configuration tweaking of Rector rules, but you can run basic migration with following set:
# rector.yaml services: Rector\Autodiscovery\Rector\FileSystem\Rector\Autodiscovery\Rector\FileSystem: ~ Rector\Autodiscovery\Rector\FileSystem\MoveServicesBySuffixToDirectoryRector: ~ # configure `composer.json` first Rector\PSR4\Rector\Namespace_\NormalizeNamespaceByPSR4ComposerAutoloadRector: ~
3. Make Your CI Fast or Die Trying
This is rather a general tip than a specific migration.
In the beginning we had:
- slow database test
- Doctrine fixtures that were added on every integration test
Not only CI but also local testing was slow. That leads to frustration due to loooooooooong feedback loop. In reality, it means check test = having a coffee or go smoking. And you don't want to hurt your team intentionally like that, do you?
So after a lot of work we had super fast CI that:
- loaded database just once
- reference fixtures via IDs in constants instead of in-Database references
- checked coding standards in standalone job
- checked static analysis in standalone job
- checked Rector in standalone job
4. Remove What You Don't Use
I see you now thinking "we don't have any dead code, or PHPStorm would tell us". No, it wouldn't, at least no in level the static analysis can.
How do we know? Well, we were you in the start... "no, we don't have any dead code":
The moment you realize:
- the code is broken
- you have to fix it
- you have to maintain it
- oh, it's used in tests
- it's used only in the test but nowhere else!
God, don't do this manually! Automate ↓
# ecs.yaml parameters: sets: - "dead-code"
# rector.yaml sevices: Rector\DeadCode\Rector\Class_\RemoveUnusedDoctrineEntityMethodAndPropertyRector: ~ # + some extra private rules :)
Soon, you'll be 10-20 % slimmer and you'll fit into your favorite bathing suite :)
Also, you'll avoid skipping code-reviews:
5. Make Your Ground Base
Rock Diamond SOLID
Have you ever seen code like this?
I'd never expect
string turn into
null, but it happened anyway. Silently. And not just from this function.
We wanted this code to throw an exception and tell us what's wrong. Right in the place where it happened, not just "string expect, null given" later.
There is a Safe package that can partially protect you, but it's rather general and with a generic exception.
But there is even better one - Nette\Utils:
How to get it into your code? Easy:
# rector.yaml parameters: sets: - "nette-utils-code-quality"
6. Config SPAM
We already wrote that config suffers from manual services registration spam. But there is more...
This is famous Symfony tag spam.
- not needed
- overly promoted in documentation
- thus at the end used by everybody
- making application stupidly complex and hard to maintain and extend in the end
Another typical issues is manual parameter spam.
Add Spam Filter
We don't like spam, do you? So we dropped all this crap and used 2 compiler passes to make it KISS:
7. Remove All the Copy-Paste Legacy
We all know how the story goes:
- "They did it"
- "I just maintain it"
- "It's faster to copy-paste it..."
- "I don't have time to fix it now..."
- 3 years later...
- "Oh my, what's this? How do we get out of it? Rewrite?"
In our case, it was json, written as a string:
It was hard to maintain, even read or change a value in that mess. We had many bugs just because of invalid quote concat, missed closing quote or new-line issues. We needed to use a normal array like most people do, but it was spread in over 70 use cases, some of them nesting array into 5 levels.
Easy, we made a Rector rule for that:
# rector.yaml services: Rector\CodingStyle\Rector\String_\ManualJsonStringToJsonEncodeArrayRector: ~
8. Know ALL Your Properties
I already wrote about this here: How we Completed Thousands of Missing @var Annotations in a Day
This is super important to make instant refactoring reliable!
9. Enter Anywhere?
I already wrote about it... twice:
Clean, simple, PHP without config magic!
10. Switch ID to UUID?
This was the biggest problem in the whole project. First, it used classic
int ids. Then uuid was about to be used (for whatever good reasons), but for the lack of refactoring time only, new entities used it. So part of the old approach, part of the new approach. Sometimes one class uses one of those and another of those.
Total mess, maintanece of 2 huge layers and negative effective on input/output layer, because it had to be consistent. There cannot be a situation where user sees urls like:
And not only the user but also external API, mobile applications, invoice systems, etc.
This was a real challenge for the whole team - it included database, Doctrine refactoring rules, external service unification refactoring and also changing PHP types all over the application.
id-to-uuid set was born. Use at own risk, but it might be worth it, since it can save you months of work while your team can keep delivering features.
And that's all folks, all we made happen in 3 months work of... 1 person. We wanted to show you, that all big changes don't have to be "whole-team-2-years-refactoring-super-expensive-no-features".
It can be smooth, step by step, closed incremental iterations, done by one full-time person.
Now the code is:
- 200 % more readable
- instead of 3 locations, the class now only has 1 location
- the database has 1 way to use ids instead of 2
- configs are 95 % smaller
- is 40 % safer thanks to exceptions of
- most of the time the developer doesn't visit configs at all and can focus on PHP code of the feature instead
- it has 80 % fewer anti-patterns, so any new developer will produce high-quality code by default, just by reading the already written code
This is Inspiration and Proof, it Can Be Done. Now Yours is Choice to Make:
What Are You Going to Clean Tomorrow in Your Code Base?