Refactor - You Keep Using That Word…
I stumbled upon a thread recently where the question was posed, “What are some common mistakes when refactoring code?”
The answers were interesting. The more I read, the more I realized that folks weren’t talking about the same thing. They were all saying “refactor”, but many were describing scenarios that sounded more like a rewrite than a refactor. This is not uncommon. I’ve encountered this on multiple forums, in Slack discussions, in blog posts, and in actual human-to-human conversation (it happens).
Words matter in communication. Really, they do. You don’t have to be a lexicographer. But you might want to make sure you know the meaning of the words you do use, and know their meaning in the specific context in which you are using them. Do a little research. Assume you don’t know and seek to understand.
Words convey concepts. When we misuse a word or lack a shared definition, we diminish the ability to maintain a shared understanding; we diminish our ability to communicate. It’s hard enough when we all agree on the meaning of the words we’re using.
You see, much of the advice given was helpful from the perspective of a rewrite: “Make sure you fully understand what the old code does,” “Make sure you have sufficient time,” and even “Make sure your manager approves.” These are all reasonable things to consider when rewriting a piece of code or an entire application. They are not necessary, or perhaps even appropriate, when it comes to refactoring.
What does Refactor mean?
From Wikipedia -
This article is about a behaviour-preserving change. It is not to be confused with Rewrite (programming).
Code refactoring is the process of restructuring existing computer code—changing the factoring—without changing its external behavior. Refactoring is intended to improve nonfunctional attributes of the software. Advantages include improved code readability and reduced complexity; these can improve source-code maintainability and create a more expressive internal architecture or object model to improve extensibility.
Typically, refactoring applies a series of standardised basic micro-refactorings, each of which is (usually) a tiny change in a computer program's source code that either preserves the behaviour of the software, or at least does not modify its conformance to functional requirements. Many development environments provide automated support for performing the mechanical aspects of these basic refactorings. If done well, code refactoring may help software developers discover and fix hidden or dormant bugs or vulnerabilities in the system by simplifying the underlying logic and eliminating unnecessary levels of complexity. If done poorly it may fail the requirement that external functionality not be changed, introduce new bugs, or both.
From Agile Alliance -
Refactoring consists of improving the internal structure of an existing program’s source code, while preserving its external behavior.
The noun “refactoring” refers to one particular behavior-preserving transformation, such as “Extract Method” or “Introduce Parameter.”
Common Pitfalls
Refactoring does “not” mean:
rewriting code
fixing bugs
improve observable aspects of software such as its interface
From Refactoring.com -
Refactoring is a disciplined technique for restructuring an existing body of code, altering its internal structure without changing its external behavior.
Its heart is a series of small behavior preserving transformations. Each transformation (called a "refactoring") does little, but a sequence of these transformations can produce a significant restructuring. Since each refactoring is small, it's less likely to go wrong. The system is kept fully working after each refactoring, reducing the chances that a system can get seriously broken during the restructuring.
We all seem to agree
Refactor is not a word in the common lexicon. As far as I know, it is specific to software development. While factoring is a thing in mathematics, the notion of refactoring doesn’t exist there. Or, more accurately, the word refactoring is not a part of the mathematical lexicon.
As we look across these definitions, there are some common elements:
Restructuring existing code
Preserving behavior
Taking small, safe steps
Not the same as rewriting
As an industry, we all seem to agree that refactoring is restructuring existing code in small safe steps while preserving the behavior of the system.
We also agree that it is not rewriting code, but a definition should describe what a thing is.
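To make "small, safe steps while preserving behavior" concrete, here is a minimal sketch of an Extract Method style refactoring in Python. It is not taken from any of the sources quoted above; the invoice scenario and every name in it (invoice_total_before, subtotal_of, apply_bulk_discount, add_tax, invoice_total_after) are hypothetical and exist only for illustration. Amounts are integer cents so that the comparison between the two versions is exact.

```python
# Hypothetical example: all names and business rules here are illustrative assumptions.

def invoice_total_before(items, tax_percent):
    """Original version: one function computes subtotal, discount, and tax."""
    subtotal = 0
    for price_cents, quantity in items:
        subtotal += price_cents * quantity
    if subtotal > 10_000:  # bulk discount above 100.00
        subtotal = subtotal * 90 // 100
    return subtotal + subtotal * tax_percent // 100


# Refactored version: each step is extracted into its own small function.
# The arithmetic is unchanged, so the external behavior is unchanged.

def subtotal_of(items):
    """Sum of price times quantity, in cents."""
    return sum(price_cents * quantity for price_cents, quantity in items)


def apply_bulk_discount(subtotal):
    """Take 10% off subtotals above 100.00."""
    return subtotal * 90 // 100 if subtotal > 10_000 else subtotal


def add_tax(amount, tax_percent):
    """Add tax, rounding the tax down to a whole cent."""
    return amount + amount * tax_percent // 100


def invoice_total_after(items, tax_percent):
    return add_tax(apply_bulk_discount(subtotal_of(items)), tax_percent)


# A quick behavior check: both versions must agree on every sample input.
samples = [
    ([], 8),
    ([(2500, 2)], 8),
    ([(2500, 2), (1999, 3)], 8),
    ([(10_000, 1)], 0),
]
for items, tax_percent in samples:
    assert invoice_total_before(items, tax_percent) == invoice_total_after(items, tax_percent)
```

The check at the end is the important part. If the two versions ever disagree, the change was not a refactoring, whatever we choose to call it.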
What does Rewrite mean?
From Wikipedia -
This article is about code rewrites, where it is expected that the behavior will change. It is not to be confused with Code refactoring.
A rewrite in computer programming is the act or result of re-implementing a large portion of existing functionality without re-use of its source code or writing inscription. When the rewrite is not using existing code at all, it is common to speak of a rewrite from scratch.
From Merriam-Webster -
: to write (something) again especially in a different way in order to improve it or to include new information.
: to program anew especially : to revise or write a new program for
: to rewrite or revise a program especially of a computer
We all seem to agree
Rewrite is a word in the common lexicon. It means different things in differing contexts. I’ve tried to find definitions specific to software.
As we look across these definitions, there are some common elements:
Programming anew
Changing behavior, or at least not strictly validating that existing behavior is preserved
Large steps
As an industry, we seem to agree that rewriting is a re-implementation (writing anew) of a large portion of an application.
So what?
This wouldn’t be the first time I was accused of being pedantic or worrying about unnecessary semantics. I’ve been railing on about Technical Debt for well over a decade.
But it does matter.
Refactoring is a specific discipline that reduces risk rather than introducing it. Refactoring is beneficial. Refactoring is a skill that all developers should learn and apply regularly. To misrepresent it as something to be avoided, as cavalier behavior, or as risk-inducing is a detriment to the industry.