What is a three way merge?
Short functions are key to a simple and readable code base. I'm sure you've stumbled across several-hundred-line monstrosities before that are hell to test or reason about.
Here I'll present a short example of a lesser-considered issue surrounding functions: three-way merges. The motivation is a colleague needing to merge my changes into a file she was already working in. Specifically, the changes were in the middle of a Very Long Function whose body she was also modifying. The examples here are in C++ and JavaScript, but the core point is the same for all programming languages.
I'll share an example of a horrible merge, then an example of the same code, refactored, in a much easier merge. There are no special tricks involved: the code is simply pulled apart into smaller, more readable chunks.
Jump straight to the end for some suggestions on how to keep code mergeable.
A three-way merge happens when one file has been updated by two competing sources at the same time, and one set of changes must be merged with the other. You'll typically only come across this when using source control tools like Perforce or git.
Here's a simple example of such a merge, presented in the tool KDiff3:
Here we have three files in KDiff3, each written by a different programmer: Ali (`source.txt`), Betsy (`left.txt`), and Carter (`right.txt`). Ali is on the left, Betsy in the middle, and Carter on the right. In Ali's file, you can see a poor implementation of a random number generator, as inspired by XKCD.
Here, `source.txt` refers to the latest version of a given file that is checked into our source control system, without further changes or additions.
Betsy and Carter decided this wasn't random enough, and each chose to improve the algorithm on their own computer. Betsy thought that 7 was more random, whereas Carter thought 8 was even more random than that.
Betsy decided to check in her change as soon as she wrote it, unaware of Carter's current herculean efforts to also overhaul the random number generator.
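The KDiff3 screenshot is not reproduced here, so as a rough sketch, the three versions might look something like this in JavaScript. The `Ali`, `Betsy` and `Carter` suffixes are hypothetical and exist only so all three versions can sit in one snippet; in reality each is the same function in a different copy of the file.

```javascript
// source.txt (Ali): the original, as inspired by XKCD.
function getRandomNumberAli() {
  return 4; // chosen by fair dice roll, guaranteed to be random
}

// left.txt (Betsy): 7 is more random.
function getRandomNumberBetsy() {
  return 7;
}

// right.txt (Carter): 8 is even more random than that.
function getRandomNumberCarter() {
  return 8;
}
```

All three differ on exactly the same line, which is what makes this a conflict rather than something the tool can resolve on its own.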
Now for the merge
When Carter tries to check in, they realise that the latest version in source control is no longer the `source.txt` they were working against; rather, it's now the equivalent of `left.txt`. Unable to automatically sort this out, the source control system offers Carter the ability to perform a manual merge of the three files. This allows Carter to decide what really is the true contents of the file.
At the bottom, we can see the output of the merge. We have chosen Carter's update as the real, canonical piece of code. We could also have chosen Betsy's, or simply kept Ali's original code as is.
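Why couldn't the source control system sort this out by itself? A minimal sketch of the usual three-way rule, applied to a single line, makes it clearer. The `mergeLine` helper is hypothetical, and it assumes the three versions have already been lined up; real tools such as KDiff3 and git work on whole changed regions rather than isolated lines.

```javascript
// Hypothetical helper: decide the merged content of one line, given the
// common starting point (base) and the two edited versions (left, right).
function mergeLine(base, left, right) {
  if (left === right) {
    return { merged: left, conflict: false }; // both sides agree
  }
  if (left === base) {
    return { merged: right, conflict: false }; // only right changed it
  }
  if (right === base) {
    return { merged: left, conflict: false }; // only left changed it
  }
  // Both sides changed the same line in different ways: a human must choose.
  return { merged: null, conflict: true };
}

// Ali's line, Betsy's edit, Carter's edit.
const outcome = mergeLine("return 4;", "return 7;", "return 8;");
console.log(outcome.conflict);
```

If only Betsy or only Carter had touched the line, the rule would have picked their change automatically. Because both did, the tool hands the decision to Carter.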
Merges are not easy. This was about as trivial a three-way merge as possible, and yet it's still possible to make a mistake, or misunderstand intent.
Let's look at a more complex merge, on a larger function. The provided code is deliberately contrived and obtuse to make a point: the merge and general structure of the code are far easier to comprehend when you reduce complexity. However, this is based on, and representative of, the "worst merge" referred to above. The contents of the code are not important: the complexity of the shape and the merge is what you should focus on. Don't waste time trying to follow what the code is really doing.
A quick look at the files
The first file, `long_base.js`, is our base file, which two programmers have decided to update and work on.
The second file, `long_left.js`, drops an `else` statement on line 7, and tweaks what is happening on line 13 (now line 15).
Our last file, `long_right.js`, adds extra logging, puts in an extra while loop, changes the default return from 7 to 8, and more. It's a non-trivial change, and touches multiple areas of the function. In this case, the programmer even decided to use a different brace style, opting for the next line as opposed to the current one.
Let's take a look at this in KDiff3. It won't be pretty.
That merge just ain't right 😵
Even a small subsection of this merge is awkward and uncomfortable to work with. Not only is it difficult to encode the intent of both programmers, it's actually very easy to merge incorrectly. This could lead to, in the best case, code that doesn't compile. In the worst case, subtle bugs and ordering effects could creep in.
But it doesn't have to be this way.
Here are refactored examples of the above code. The following changes have been made:
- We've pulled the `return 7` into an early return, saving a layer of indentation throughout most of the function.
- We inlined `value1` into the function call, though sometimes it may be more readable to keep it as a well-named variable.
- All of lines 8-16 were pulled into a new function, `transformer5`. We no longer care about its implementation, and changes in indentation level around the function can no longer confuse our merge here. The programmers may have changes to `transformer5` in another file which requires its own merge, but here we don't know, and we don't care.

⚠️ Cheater! ⚠️ You didn't do anything new, you just moved the code elsewhere!
Exactly. Unnecessary details have been abstracted away. Even with meaningless variable names and functions, you can now easily follow the entire function with very little mental effort.
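The refactored files are not reproduced here, so here is a small sketch of the three changes listed above. Everything in it is a hypothetical stand-in for the real code: `transformer1`, `isUsable`, `processLong` and `processShort` are invented names, and the body given to `transformer5` is made up. As before, only the shape matters.

```javascript
// Hypothetical stand-ins; what they do is irrelevant.
const transformer1 = (x) => x + 1;
const isUsable = (x) => x > 0;

// Before: one nested function, with the default return at the bottom.
function processLong(input) {
  if (isUsable(input)) {
    const value1 = transformer1(input);
    let total = 0;
    for (let i = 0; i < value1; i++) {
      if (i % 2 === 0) {
        total += i;
      } else {
        total -= 1;
      }
    }
    return total;
  } else {
    return 7;
  }
}

// After: the loop lives in its own function...
function transformer5(limit) {
  let total = 0;
  for (let i = 0; i < limit; i++) {
    total += i % 2 === 0 ? i : -1;
  }
  return total;
}

// ...and the caller is flat: early return, value1 inlined.
function processShort(input) {
  if (!isUsable(input)) return 7;
  return transformer5(transformer1(input));
}
```

A change to the loop and a change to the default return now land in different functions, so they are far less likely to collide in a merge.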
Let's now look at the merges.
This merge is beautiful. A junior developer, or one with little to no experience in the area, should be able to perform this merge with ease. Even the indentation change has not caused an issue.
As such, we should strive to keep our code as simple and flat as is reasonable. By doing so, we can save a lot of the time lost to bugs and frustration when merging in work.
This point on frustration is important: by thinking about how others will interact with our code, we can improve our workflow and working situation not just for ourselves, but for our entire team. Reduce the stress on others, and they'll be able to do the same for you.
Some good ways to keep a file easily mergeable:
- Short functions are gold: aim for under 20 lines. Anything over 50 and you're possibly doing too much, and a merge will bite you at some point.
- Abstract out functionality that isn't strictly necessary. If you could add a comment explaining what some code is for, first see if you could just extract a function.
- Decide on either tabs or spaces, and a bracing style. I don't care what you use, but nothing is worse than trying to diff two files, only to have the intent obscured by people fighting over their own bracing or whitespace style. Pick one, and stick with it.
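As a small illustration of the "comment or function" suggestion, here is a hedged sketch. The names `averageBefore`, `averageAfter` and `dropInvalidReadings` are hypothetical and not taken from any real code base.

```javascript
// Before: a comment explains what the next line is for.
function averageBefore(readings) {
  // drop readings that are missing or not finite numbers
  const valid = readings.filter((r) => Number.isFinite(r));
  if (valid.length === 0) return 0;
  return valid.reduce((sum, r) => sum + r, 0) / valid.length;
}

// After: the comment has become a function name.
function dropInvalidReadings(readings) {
  return readings.filter((r) => Number.isFinite(r));
}

function averageAfter(readings) {
  const valid = dropInvalidReadings(readings);
  if (valid.length === 0) return 0;
  return valid.reduce((sum, r) => sum + r, 0) / valid.length;
}
```

If someone later changes how invalid readings are detected, that edit stays inside `dropInvalidReadings` and is less likely to collide with a colleague's change to the averaging logic.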
I hope you've taken something positive away from this, and let me know if you have any thoughts, agreements or differing views!