Short functions are key to a simple and readable code base. I'm sure you've stumbled across several-hundred-line monstrosities before that are hell to test or reason about.
Here I'll present a short example of a less commonly considered issue surrounding functions: three-way merges. The motivation for this is a colleague needing to merge my changes into a file she was already working in. Specifically, the changes were in the middle of a Very Long Function whose body she was also modifying. The examples here are in C++ and JavaScript, but the core point is the same for all programming languages.
I'll share an example of a horrible merge, then an example of the same code, refactored, in a much easier merge. There are no special tricks involved: the code is simply pulled apart into smaller, more readable chunks.
Jump straight to the end for some suggestions on how to keep code mergeable.
A three-way merge is where one file has been updated by two competing sources simultaneously, and one must merge with the other. You'll typically only come across this when using source control tools like Perforce or Git.
Here's a simple example of such a merge, presented in the tool KDiff3:
Here we have three files, written by several programmers: Ali (source.txt), Betsy (left.txt), and Carter (right.txt) in KDiff3. Ali is on the left, Betsy in the middle, and Carter on the right. In Ali's file, you can see a poor implementation of a random number generator, as inspired by XKCD.
Here, source.txt refers to the latest version of a given file that is checked into our source control system, without further changes or additions.
Betsy and Carter have decided this wasn't random enough, and chose to improve the algorithm, each on their own computer. Betsy thought that 7 was more random, whereas Carter thought 8 was even more random than that.
Betsy decided to check in her change as soon as she wrote it, unaware of Carter's current herculean efforts to also overhaul the random number generator.
Now for the merge
When Carter tries to check in, they realise that the latest version in source control is no longer the source.txt they were working against; rather, it's now the equivalent of left.txt. Unable to automatically sort this out, the source control system offers Carter the ability to perform a manual merge of the three files. This allows Carter to decide what really is the true contents of the file.
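To make the "unable to automatically sort this out" step concrete, here is a minimal sketch of the decision a three-way merge makes for a single line. This is not how Perforce, Git or KDiff3 are actually implemented; `mergeLine` is a hypothetical helper, and Ali's original `return 4;` is my assumption based on the XKCD comic.

```javascript
// Hypothetical helper: decide the outcome for one line of a three-way merge.
// `base` is the common ancestor; `left` and `right` are the two edited versions.
function mergeLine(base, left, right) {
  if (left === right) return { resolved: left };  // both sides agree
  if (left === base) return { resolved: right };  // only the right side changed
  if (right === base) return { resolved: left };  // only the left side changed
  return { conflict: { base, left, right } };     // both changed, differently
}

// Assumed file contents, for illustration only.
const ali = "return 4;";    // source.txt
const betsy = "return 7;";  // left.txt
const carter = "return 8;"; // right.txt

// If only Betsy had changed the line, her edit would win automatically.
console.log(mergeLine(ali, betsy, ali));

// Both changed the same line in different ways, so a human has to choose.
console.log(mergeLine(ali, betsy, carter));
```

The last case is the one Carter hits: neither edit matches the base, so the tool can only hand all three versions to a person.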
At the bottom, we can see the output of the merge. We have chosen Carter's update as the real, canonical piece of code. We could have also chosen Betsy's, or simply kept Ali's original code as is.
Merges are not easy. This was about as trivial a three-way merge as you can get, and yet it's still possible to make a mistake, or misunderstand intent.
Let's look at a more complex merge, on a larger function. The provided code is deliberately contrived and obtuse to make a point: the merge and general structure of the code are far easier to comprehend when you reduce complexity. However, this is based on, and representative of, the "worst merge" referred to above. The contents of the code here are not important: the complexity of the shape and the merge is what you should focus on. Don't waste time trying to follow what the code is really doing.
A quick look at the files
The first file, long_base.js, is our base file, which two programmers have decided to update and work on.
The second file, long_left.js, drops an else statement on line 7, and tweaks what is happening on line 13 (now line 15).
Our last file, long_right.js, adds extra logging, puts in an extra while loop, changes the default return from 7 to 8, and more. It's a non-trivial change, and touches multiple areas of the function. In this case, the programmer even decided to use a different brace style, opting for the next line as opposed to the current line.
Let's take a look at this in KDiff3. It won't be pretty.
That merge just ain't right
Even a small subsection of this merge is awkward and uncomfortable to work with. Not only is it difficult to encode the intent of both programmers, it's actually very easy to merge incorrectly. This could lead to, in the best case, code that doesn't compile. In the worst case, subtle bugs and ordering effects could creep in.
But it doesn't have to be this way.
Here are refactored examples of the above code. The following changes have been made:
- We've pulled the return 7 into an early return, saving a layer of indentation throughout most of the function.
- We inlined value1 into the function call, though sometimes it may be more readable to keep it as a well-named variable.
- All of lines 8 to 16 were pulled into a new function, transformer5. We no longer care about its implementation, and changes in indentation level around the function can no longer confuse our merge here. The programmers may have changes to transformer5 in another file which requires its own merge, but here: we don't know, and we don't care.
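As a rough sketch of the shape of that refactoring (this is not the actual long_base.js): only `value1`, `transformer5` and the default return of 7 come from the example above. `transformer1`, `transformer2`, `longVersion`, `shortVersion` and the loop body are hypothetical stand-ins I've made up for illustration.

```javascript
// Hypothetical stand-ins for helpers the real code would already have.
const transformer1 = (x) => x + 1;
const transformer2 = (x) => x * 2;

// Before: one nested body, with the default return hidden in an else.
function longVersion(input) {
  if (input !== null) {
    const value1 = transformer1(input);
    let result = transformer2(value1);
    for (let i = 0; i < 3; i++) {
      if (result % 2 === 0) {
        result += i;
      } else {
        result -= i;
      }
    }
    return result;
  } else {
    return 7;
  }
}

// After: the loop moves, unchanged, into its own function...
function transformer5(value) {
  let result = value;
  for (let i = 0; i < 3; i++) {
    if (result % 2 === 0) {
      result += i;
    } else {
      result -= i;
    }
  }
  return result;
}

// ...and the caller is flat: early return first, value1 inlined.
function shortVersion(input) {
  if (input === null) return 7;
  return transformer5(transformer2(transformer1(input)));
}
```

An edit to the loop now only touches transformer5, and an edit to the early return only touches shortVersion, so two people changing those things no longer collide on the same lines.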
⚠️ Cheater! ⚠️ You didn't do anything new, you just moved the code elsewhere!
Exactly. Unnecessary details have been abstracted away. Even with meaningless variable names and functions, you can now easily follow the entire function with very little mental effort.
Let's now look at the merges.
This merge is beautiful. A junior developer, or one with little to no experience in the area, should be able to perform this merge with ease. Even the indentation change has not caused an issue.
As such, we should strive to keep our code as simple and flat as makes reasonable sense. By doing so, we can save a lot of time otherwise lost to bugs and frustration when merging in work.
This point on frustration is important: by thinking about how others will interact with our code, we can improve our workflow and working situation not just for ourselves, but for our entire team. Reduce the stress on others, and they'll be able to do the same for you.
Some good ways to keep a file easily mergeable:
- Keep functions short: aim for under 20 lines. Anything over 50 and you're possibly doing too much, and a merge will bite you at some point.
- Abstract out functionality that isn't strictly necessary. If you could add a comment explaining what some code is for, first see if you could just extract a function.
- Decide on either tabs or spaces, and a bracing style. I don't care what you use, but nothing is worse than trying to diff two files, only to have the intent obscured by people fighting over adding their own bracing or white-spacing style. Pick one, and stick with it.
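For the second suggestion, here is a small sketch of swapping an explanatory comment for an extracted function. All of the names here (`reportBefore`, `reportAfter`, `dropInvalidReadings`) are invented for illustration and are not from the code discussed earlier.

```javascript
// Before: a comment explains what the next few lines are for.
function reportBefore(readings) {
  // drop readings that are not finite numbers
  const clean = [];
  for (const r of readings) {
    if (Number.isFinite(r)) clean.push(r);
  }
  return clean.length;
}

// After: the function name says what the comment used to say.
function dropInvalidReadings(readings) {
  return readings.filter((r) => Number.isFinite(r));
}

function reportAfter(readings) {
  return dropInvalidReadings(readings).length;
}
```

The calling function shrinks to one line, and any future change to the filtering rule happens inside dropInvalidReadings, away from whatever else is being edited in the caller.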
I hope you've taken something positive away from this, and let me know if you have any thoughts, agreements or differing views!