Wait, what's scope?
Like all good blog posts, this one first came to me while I was banging my head against the keyboard, cursing and screaming out for a maternal figure to hold me while I incoherently babbled about a crash caused by seemingly-nonsensical code.
"This shouldn't compile! This CAN'T compile!"
And yet, despite the odds, C++ had decided yet again to show me a new side of itself. Please take a moment to peruse the following example code, based on the original that caused me such pain, and see if you can spot the bug that causes a null pointer dereference.
Of course, to compile this properly we need some further definitions. Check out https://repl.it/@Winwardo/IfElse-Chain-With-Bug for the full example.
To save you the trouble, the error is on line 12, where I attempt to access `asDerivedA->predicate`.
That's right: despite declaring `asDerivedA` in a separate if statement above, I'm able to access it in the else-if statements below. No, this is not a compiler bug; this is clearly and unambiguously defined in the C++ specification. Before we get to that, let's take a look at some code.
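As a rough sketch of the shape of the problem (not the original code): the types `Base`, `DerivedA` and `DerivedB` and the function `readPredicate` are names invented here for illustration. The dangerous line is left as a comment so the sketch is safe to run, and an `assert` stands in for it to show that the stale name really is visible, and null, in the else-if branch.

```cpp
#include <cassert>

// Hypothetical types, invented for this sketch.
struct Base {
    virtual ~Base() = default;
};
struct DerivedA : Base {
    bool predicate = true;
};
struct DerivedB : Base {
    bool predicate = false;
};

bool readPredicate(Base* base) {
    if (auto* asDerivedA = dynamic_cast<DerivedA*>(base)) {
        return asDerivedA->predicate;
    } else if (auto* asDerivedB = dynamic_cast<DerivedB*>(base)) {
        // asDerivedA is still in scope here. We only reach this branch when
        // the first cast failed, so it is guaranteed to be null.
        // The copy-paste bug is the equivalent of writing:
        //     return asDerivedA->predicate;   // compiles, then dereferences null
        assert(asDerivedA == nullptr);
        return asDerivedB->predicate;
    }
    return false;
}
```

Calling `readPredicate` with a `DerivedB` takes the second branch, which is exactly where the wrong variable name would have crashed.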
If you're new to the syntax that's used in the code sample, `if (int i = 5) {` is a perfectly valid way of declaring and defining a variable, then using it inside the given if statement. It allows us to write terser, clearer code, while also limiting the scope of the variable. It's good practice to keep the scope of a variable as small as possible, both to avoid accidental re-use and to lower the mental overhead on future programmers.
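Here is a small sketch of that syntax in use. The function `charsAfterColon` is a made-up example, not something from the original project; it relies only on `std::strchr` and `std::strlen` from the standard library.

```cpp
#include <cstring>

// Returns the number of characters after the first ':' in text, or -1 if
// there is no ':' at all.
int charsAfterColon(const char* text) {
    // colon is declared, initialised, and tested for non-null in one place.
    if (const char* colon = std::strchr(text, ':')) {
        return static_cast<int>(std::strlen(colon + 1));
    }
    // colon no longer exists here; mentioning it would be a compile error.
    return -1;
}
```

The pointer never leaks into the rest of the function, so there is nothing to accidentally re-use further down.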
If you've not come across the term before, the scope of a variable is the region of code in which it's accessible. Variables defined within a function, if statement, or other control block are scoped to that block. This is important to stop us accidentally re-using variables that were declared elsewhere in code.
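A tiny sketch of block scoping, using an invented function `scopeDemo`:

```cpp
int scopeDemo(bool flag) {
    int total = 1;                  // scoped to the whole function body
    if (flag) {
        int bonus = 10;             // scoped to this if block only
        total += bonus;
    }
    // bonus is out of scope here; `total += bonus;` would not compile.
    for (int i = 0; i < 3; ++i) {   // i is scoped to the loop
        total += i;
    }
    return total;
}
```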
Scope is a concept in most major languages: C++, JavaScript, C#, Rust, and even Haskell all have one. Note that the specific definitions and rules vary from language to language.
You might have also heard of global scope: this is where a variable is accessible from anywhere, inside or outside a function or block. Relying on it is considered poor practice in many cases.
Scope in C++ isn't just for functions: any time you open a new pair of curly braces, or block, you create a new scope. You should also know that if we declare (and define) a variable inside the if statement, the condition becomes the value of that variable, converted to a boolean. For converting a number to boolean, we ask the question "Is this number not 0?" If the number is 0, the boolean is false; otherwise it's true.
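To make the conversion rule concrete, here is a sketch with an invented function `signOf`. The if branch is taken for any non-zero value, negative numbers included.

```cpp
// Returns 1 for positive numbers, -1 for negative numbers, and 0 for zero.
int signOf(int number) {
    if (int value = number) {        // condition is `value` converted to bool
        return value > 0 ? 1 : -1;   // reached for any non-zero value
    }
    return 0;                        // reached only when value == 0
}
```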
Knowing this, we can transform a simple if declaration as such:
Both `first` and `second` are semantically the same, and produce identical assembler output from the compiler. You can prove this to yourself here: https://godbolt.org/g/zRC8BT
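A pair of functions in this spirit might look like the sketch below. The bodies are assumptions made for illustration (only the names `first` and `second` come from the text above), and the Compiler Explorer link refers to the original listing rather than to this sketch.

```cpp
// Declaration inside the if condition.
int first(int input) {
    if (int i = input) {
        return i * 2;
    } else {
        return i - 1;   // i is in scope here too, and is always 0
    }
}

// The same thing spelled out: an enclosing block, a declaration, then the if.
int second(int input) {
    {
        int i = input;
        if (i) {
            return i * 2;
        } else {
            return i - 1;
        }
    }
}
```

The extra pair of braces in `second` is the important part: the variable lives in a scope that wraps the whole if/else, not just the first branch.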
If you think this is a fluke or want further proof that this is the case, here you can see the C++ specification Decreeing That It Must Be So, in section [stmt.select]. Don't worry if this section of this blog post is confusing!
Because of this choice, we can see that any variables declared inside the init-statement must be accessible to the else statement. This alone may not seem enough to cause the bug above. However, because of how C++ resolves the dangling else problem, an else-if is simply an else whose body is another if statement, which means variables declared in the init-statement must also be visible in any else-if statements.
If this is not clear, note that the following two functions are identical:
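As a sketch of such a pair (the names `chained` and `nested`, and the arithmetic in the bodies, are invented here for illustration):

```cpp
// Written as an if / else-if / else chain.
int chained(int a, int b) {
    if (int x = a) {
        return x;
    } else if (int y = b) {
        return x + y;       // x is visible here, and is always 0
    } else {
        return x + y - 1;   // both are visible here, and both are 0
    }
}

// The same function with the else-if written out as an else containing an if.
int nested(int a, int b) {
    if (int x = a) {
        return x;
    } else {
        if (int y = b) {
            return x + y;
        } else {
            return x + y - 1;
        }
    }
}
```

Once the chain is written in its nested form, it is much easier to see why `x` has to be visible all the way down.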
NO. STOP RIGHT THERE. Don't. YOU. DARE.
Having discussed this with several colleagues, we've been unable to come up with a single legitimate use case for this behaviour. Every example we've considered can be rewritten in a much clearer and less surprising fashion.
To entertain the idea however, here is the sort of code you may have the misfortune of stumbling across.
Note that we've encoded some state that can be accessed via `operator bool()`, which will be called when we convert the data to a boolean type.
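A sketch of what that kind of code can look like. Everything here (`ParseResult`, `parseDigit`, `firstDigit`) is hypothetical and invented for illustration; it is not the code from the original project.

```cpp
#include <string>

// Hypothetical type: carries a value plus an error message, and converts to
// true only when there is no error.
struct ParseResult {
    int value;
    std::string error;
    explicit operator bool() const { return error.empty(); }
};

ParseResult parseDigit(char c) {
    if (c >= '0' && c <= '9') {
        return {c - '0', ""};
    }
    return {0, std::string("not a digit: ") + c};
}

// Returns the first of the two characters that parses as a digit, or the
// combined error messages if neither does.
std::string firstDigit(char a, char b) {
    if (ParseResult first = parseDigit(a)) {
        return std::to_string(first.value);
    } else if (ParseResult second = parseDigit(b)) {
        return std::to_string(second.value);
    } else {
        // Both results are still in scope here, and both converted to false.
        return first.error + "; " + second.error;
    }
}
```

For example, `firstDigit('x', 'y')` reaches the final else and reads `first` two branches after it was declared.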
This is not immediately clear code, and it's certainly astonishing. Intent has been obscured behind layers of abstraction that don't provide us any real benefit.
That's a fair question. I was working on a data-exporter, which had to legitimately cast a pointer to one of several potential types in order to extract information for serialization. (No, polymorphism would not have been the solution here.)
It was an if-else chain, and I had copy-pasted one part, forgetting to update the variable name in the process. As shown by PVS-Studio's brilliant article titled The Last Line Effect, copying and pasting similar lines downwards regularly results in a bug.
C++ is a big language, and despite my having worked in it for 5 years, it still reveals new parts of itself to me every day. It's a nice reminder that, even if you think you're writing code well, without static analysis tools and effective automated testing, it's very easy to miswrite some code and introduce a crash to a system.
Have you come across this curiosity anywhere? Can you think of an example where it would be useful? Let us know on Twitter.
Thanks to Jessica Baker, Chantelle Porritt, Andy Bastable and Stuart Milne for proof-reading and discussions.