<p>The "goto" statement, by Manank, published 2023-12-06 (https://www.manank.in/blog/goto-statement/)</p>
<p>The goto statement has been controversial almost since the advent of high-level programming languages. In this blog post, we will dive deep into the history and origin of the anti-goto sentiment (yes, it is Dijkstra!), the opposing views, and the rationale behind both.</p>
<span id="continue-reading"></span><h3 id="introduction">Introduction</h3>
<p>A "goto" statement is like a magical teleporter in computer programming. It lets the computer jump from one part of a program to another without any conditions or questions. Imagine it's like a super-fast shortcut that programmers can use to change the usual order of instructions. It's beautiful. In assembly language, it is a single jump instruction.</p>
<center>
<img src="/images/goto.png" width=500 alt="goto"/>
</center>
<h3 id="motivation">Motivation</h3>
<p>So we had a course called Computer Networks, and as a part of that, we had a few labs and assignments. One day, I was writing code (in C), and my friend
came into my room, stared at my screen, and then said, "Bruh, you are using goto?! Don't use it; it's not good." I asked him why, and he gave various good
reasons, which we will discuss further, but I was still not convinced, because where I had used goto it made a lot of sense, and I had seen
a lot of gotos in the Linux kernel code. As far as I remember, I had used it to break out of a triple loop and to clean up resources in case of an error. So, I
decided to dig deeper and find out more about this debate.</p>
<h3 id="dijkstra-s-thoughts">Dijkstra's Thoughts</h3>
<p>Dijkstra made a strongly worded case against the use of goto in <a href="https://www.manank.in/blog/goto-statement/dij.pdf">Go To Statement Considered Harmful</a>, published in Communications of the ACM 11, 3 (March 1968).</p>
<center>
<img src="/images/Edsger_Dijkstra.jpg" width=200 alt="goto"/>
</center>
<blockquote>
<p>Edsger Dijkstra was a Dutch computer scientist who made significant contributions to the fields of algorithms, programming, and software engineering. His work on Dijkstra's algorithm, structured programming, and formal methods has had a lasting impact on the field.</p>
</blockquote>
<blockquote>
<p>Fun fact about the title: the original title of the article was <a href="https://www.cs.utexas.edu/users/EWD/ewd02xx/EWD215.PDF">EWD 215: A Case against the GO TO Statement</a>. It was published as a letter to the editor, and the editor, Niklaus Wirth, gave it the current title. "X considered harmful" has since become a common title template.</p>
</blockquote>
<p>Most of the opposition to the goto construct among students, professors, and other programmers traces back to this paper, which is fair as long as you
have read the entire article. Most people today will agree that gotos should be avoided at all costs, but few can explain
why. There is also a famous line at the beginning of the article:</p>
<blockquote>
<p>The quality of programmers is a decreasing function of the density of goto statements in the programs they produce.</p>
</blockquote>
<p>A powerful statement to make; he goes on to say that he is convinced that the goto statement should be abolished from all "higher level" programming
languages. Keep in mind that at the time of publication, C did not even exist.</p>
<p>One of the points that Dijkstra focused on was the structure of the program. He argued that extensive use of gotos makes a program harder
for the reader to understand and maintain. Imagine a function spanning 1000 lines with a goto from, say, line 200 down to line 800, and many other
similar jumps; it is very easy to get confused and lose track of all the branching they create.</p>
<p>Dijkstra's main motive seems to have been to encourage the structured programming paradigm.</p>
<h4 id="structured-programming">Structured Programming</h4>
<p>In his <a href="https://www.manank.in/blog/goto-statement/notes-on-structured.pdf">Notes on Structured Programming</a>, Dijkstra describes structured programming as a method of writing programs that uses a limited set of control structures. He argues that this approach makes programs easier to understand, maintain, and modify, and identifies three basic control structures: sequence, selection, and iteration.</p>
<p>Specifically, Dijkstra argues that structured programming can help to avoid the "spaghetti code" problem, in which programs become difficult to
understand and maintain due to their complex and tangled structure. He suggests that by using a limited set of control structures, programmers can make
their programs more modular and easier to reason about.</p>
<p>Dijkstra also discusses the importance of using recursion and data abstraction in structured programming. Recursion is a technique in which a function
calls itself. This can be a powerful tool for writing concise and elegant programs. Data abstraction is the process of hiding the implementation details
of a data structure or algorithm. This can make programs easier to understand and maintain, as programmers can focus on the functionality of the program
without having to worry about the underlying details.</p>
<p>So basically, structured programming aims to make the structure of the program easier to read and less confusing by following a single entry, single
exit principle, and thus discourages the usage of goto, which would break the principle.</p>
<h3 id="responses-to-the-article">Responses to the Article</h3>
<p>There have been a few responses to the article in the form of publications, along with many debates and arguments in various online forums.
The best-known response is by none other than <a href="https://en.wikipedia.org/wiki/Donald_Knuth">Donald Knuth</a>.</p>
<center>
<img src="/images/knuth.jpg" width=300 alt="goto"/>
</center>
<blockquote>
<p>Donald Knuth, a towering figure in computer science, is renowned for his seminal work on algorithms, his revolutionary TeX typesetting system, and his influential multi-volume series "The Art of Computer Programming." A Turing Award winner, Knuth's meticulousness and passion for clarity have shaped modern computing, making him a legend in the field.</p>
</blockquote>
<h4 id="structured-programming-with-goto-statements-by-donald-e-knuth">Structured Programming with GOTO statements by Donald E. Knuth</h4>
<p>Knuth defended the judicious use of goto in <a href="https://www.manank.in/blog/goto-statement/knuth-GOTO.pdf">Structured Programming with go to Statements</a> (1974). I suggest you give it a read. He argued that
there are cases where a goto makes code more expressive, and gave
examples where it leads to clearer and more efficient code, particularly around error exits.
The same reasoning is behind the error handling and resource cleanup pattern that is common in kernel
code.</p>
<h4 id="examples">Examples</h4>
<p>Here are a few examples from a <a href="https://softwareengineering.stackexchange.com/questions/154974/is-this-a-decent-use-case-for-goto-in-c/154980#154980">Software Engineering Stack Exchange thread</a>.
The OP posted the following code, which shows what happens when we try to avoid gotos: with several
functions that could fail, it becomes a mess with 5-6 levels of indentation.</p>
<pre data-lang="c" style="background-color:#2b303b;color:#c0c5ce;" class="language-c "><code class="language-c" data-lang="c"><span>error = </span><span style="color:#bf616a;">function_that_could_fail_1</span><span>();
</span><span style="color:#b48ead;">if </span><span>(!error) {
</span><span> error = </span><span style="color:#bf616a;">function_that_could_fail_2</span><span>();
</span><span> </span><span style="color:#b48ead;">if </span><span>(!error) {
</span><span> error = </span><span style="color:#bf616a;">function_that_could_fail_3</span><span>();
</span><span> </span><span style="color:#b48ead;">if</span><span>(!error) {
</span><span> error = </span><span style="color:#bf616a;">function_that_could_fail_4</span><span>();
</span><span> </span><span style="color:#b48ead;">if</span><span>(!error){
</span><span> ...to the n-th tab level!
</span><span> } </span><span style="color:#b48ead;">else </span><span>{
</span><span> </span><span style="color:#65737e;">// deal with error, clean up, and return error code
</span><span> }
</span><span> } </span><span style="color:#b48ead;">else </span><span>{
</span><span> </span><span style="color:#65737e;">// deal with error, clean up, and return error code
</span><span> }
</span><span> } </span><span style="color:#b48ead;">else </span><span>{
</span><span> </span><span style="color:#65737e;">// deal with error, clean up, and return error code
</span><span> }
</span><span>} </span><span style="color:#b48ead;">else </span><span>{
</span><span> </span><span style="color:#65737e;">// deal with error, clean up, and return error code
</span><span>}
</span></code></pre>
<p>But if we decide to use goto, the solution is much simpler, more elegant, and more readable:</p>
<pre data-lang="c" style="background-color:#2b303b;color:#c0c5ce;" class="language-c "><code class="language-c" data-lang="c"><span>error = </span><span style="color:#bf616a;">function_that_could_fail_1</span><span>();
</span><span style="color:#b48ead;">if</span><span>(error) {
</span><span> </span><span style="color:#b48ead;">goto</span><span> cleanup;
</span><span>}
</span><span>error = </span><span style="color:#bf616a;">function_that_could_fail_2</span><span>();
</span><span style="color:#b48ead;">if</span><span>(error) {
</span><span> </span><span style="color:#b48ead;">goto</span><span> cleanup;
</span><span>}
</span><span>error = </span><span style="color:#bf616a;">function_that_could_fail_3</span><span>();
</span><span style="color:#b48ead;">if</span><span>(error) {
</span><span> </span><span style="color:#b48ead;">goto</span><span> cleanup;
</span><span>}
</span><span>...
</span><span>cleanup:
</span><span style="color:#65737e;">// deal with error if it exists, clean up
</span><span style="color:#65737e;">// return error code
</span></code></pre>
<p>Here is another, similar snippet:</p>
<pre data-lang="c" style="background-color:#2b303b;color:#c0c5ce;" class="language-c "><code class="language-c" data-lang="c"><span style="color:#b48ead;">int </span><span style="color:#8fa1b3;">frobnicateTheThings</span><span>() {
</span><span> </span><span style="color:#b48ead;">char </span><span>*workingBuffer = </span><span style="color:#96b5b4;">malloc</span><span>(...);
</span><span> </span><span style="color:#b48ead;">int</span><span> i;
</span><span>
</span><span> </span><span style="color:#b48ead;">for </span><span>(i=</span><span style="color:#d08770;">0 </span><span>; i<numberOfThings ; i++) {
</span><span> </span><span style="color:#b48ead;">if </span><span>(</span><span style="color:#bf616a;">giveMeThing</span><span>(i, workingBuffer) != OK)
</span><span> </span><span style="color:#b48ead;">goto</span><span> error;
</span><span> </span><span style="color:#b48ead;">if </span><span>(</span><span style="color:#bf616a;">processing</span><span>(workingBuffer) != OK)
</span><span> </span><span style="color:#b48ead;">goto</span><span> error;
</span><span> </span><span style="color:#b48ead;">if </span><span>(</span><span style="color:#bf616a;">dispatching</span><span>(i, workingBuffer) != OK)
</span><span> </span><span style="color:#b48ead;">goto</span><span> error;
</span><span> }
</span><span>
</span><span> </span><span style="color:#96b5b4;">free</span><span>(workingBuffer);
</span><span> </span><span style="color:#b48ead;">return</span><span> OK;
</span><span>
</span><span> error:
</span><span> </span><span style="color:#96b5b4;">free</span><span>(workingBuffer);
</span><span> </span><span style="color:#b48ead;">return</span><span> OOPS;
</span><span>}
</span></code></pre>
<p>This is one of the examples where using goto makes the code more readable. <code>goto</code>s become confusing when the
jumps go in both directions, but one-way forward jumps to a single cleanup label, as here, can improve the overall readability
and keep the cleanup code in one place. The same code without goto would look like this:</p>
<pre data-lang="c" style="background-color:#2b303b;color:#c0c5ce;" class="language-c "><code class="language-c" data-lang="c"><span style="color:#b48ead;">int </span><span style="color:#8fa1b3;">frobnicateTheThings</span><span>() {
</span><span> </span><span style="color:#b48ead;">char </span><span>*workingBuffer = </span><span style="color:#96b5b4;">malloc</span><span>(...);
</span><span> </span><span style="color:#b48ead;">int</span><span> i;
</span><span>
</span><span> </span><span style="color:#b48ead;">for </span><span>(i=</span><span style="color:#d08770;">0 </span><span>; i<numberOfThings ; i++) {
</span><span> </span><span style="color:#b48ead;">if </span><span>(</span><span style="color:#bf616a;">giveMeThing</span><span>(i, workingBuffer) != OK){
</span><span> </span><span style="color:#96b5b4;">free</span><span>(workingBuffer);
</span><span> </span><span style="color:#b48ead;">return</span><span> OOPS;
</span><span> }
</span><span> </span><span style="color:#b48ead;">if </span><span>(</span><span style="color:#bf616a;">processing</span><span>(workingBuffer) != OK){
</span><span> </span><span style="color:#96b5b4;">free</span><span>(workingBuffer);
</span><span> </span><span style="color:#b48ead;">return</span><span> OOPS;
</span><span> }
</span><span> </span><span style="color:#b48ead;">if </span><span>(</span><span style="color:#bf616a;">dispatching</span><span>(i, workingBuffer) != OK){
</span><span> </span><span style="color:#96b5b4;">free</span><span>(workingBuffer);
</span><span> </span><span style="color:#b48ead;">return</span><span> OOPS;
</span><span> }
</span><span> }
</span><span>
</span><span> </span><span style="color:#96b5b4;">free</span><span>(workingBuffer);
</span><span> </span><span style="color:#b48ead;">return</span><span> OK;
</span><span>}
</span></code></pre>
<p>Another argument for the use of goto statements might be performance, as explained in this <a href="https://lkml.org/lkml/2003/1/12/203">2003 thread on the Linux kernel mailing list</a>:</p>
<blockquote>
<p>Subject: Re: any chance of 2.6.0-test*? <br/>
From: Robert Love <br/>
Date: 12 Jan 2003 17:58:06 -0500</p>
<p>On Sun, 2003-01-12 at 17:22, Rob Wilkens wrote:</p>
<blockquote>
<p>I say "please don't use goto" and instead have a "cleanup_lock" function
and add that before all the return statements.. It should not be a
burden. Yes, it's asking the developer to work a little harder, but the
end result is better code.</p>
</blockquote>
<p>No, it is gross and it bloats the kernel. It inlines a bunch of junk
for N error paths, as opposed to having the exit code once at the end.
Cache footprint is key, and you just killed it.</p>
<p>Nor is it easier to read.</p>
<p>As a final argument, it does not let us cleanly do the usual stack-esque
wind and unwind, i.e.</p>
<pre data-lang="c" style="background-color:#2b303b;color:#c0c5ce;" class="language-c "><code class="language-c" data-lang="c"><span style="color:#b48ead;">do</span><span> A
</span><span style="color:#b48ead;">if </span><span>(error)
</span><span> </span><span style="color:#b48ead;">goto</span><span> out_a;
</span><span style="color:#b48ead;">do</span><span> B
</span><span style="color:#b48ead;">if </span><span>(error)
</span><span> </span><span style="color:#b48ead;">goto</span><span> out_b;
</span><span style="color:#b48ead;">do</span><span> C
</span><span style="color:#b48ead;">if </span><span>(error)
</span><span> </span><span style="color:#b48ead;">goto</span><span> out_c;
</span><span style="color:#b48ead;">goto</span><span> out;
</span><span>out_c:
</span><span>undo C
</span><span>out_b:
</span><span>undo B:
</span><span>out_a:
</span><span>undo A
</span><span>out:
</span><span style="color:#b48ead;">return</span><span> ret;
</span></code></pre>
<p>Now stop this.</p>
<p>Robert Love</p>
</blockquote>
<p>That being said, it is very tempting to reach for goto where it is not needed and end up with spaghetti code, so programmers
should really know what they are doing when they decide to use it.</p>
<p>Often, <code>break</code> and <code>continue</code> in C can be confusing too, because the flow is hard to predict in
nested loops with multiple conditions. Their reach is also limited: a <code>break</code> leaves only the innermost loop, so
getting out of several nested loops at once needs either <code>goto</code> or a workaround such as a flag variable or moving the loops into a function and returning.
With <code>goto</code> you also choose exactly how far to jump, for example out of the two inner loops of a triple loop
while staying in the outermost one.</p>
<h3 id="conclusion">Conclusion</h3>
<p>So then, should you use <code>goto</code> or not? Well, it depends. If you are using a higher-level programming language than C, chances
are you will never need to think about it, because the language offers better structured constructs such as exceptions, labeled breaks, or destructors.
Even so, some high-level languages keep it: Common Lisp, for example, has <code>go</code> inside <code>tagbody</code>, and its looping macros are commonly built on top of them.
If you are using C, it is a judgment call: use <code>goto</code> where it improves the overall
quality and readability of the code, such as forward jumps for error handling and cleanup, and stop before it turns the code into spaghetti.</p>
<blockquote>
<p>“as a full professor with tenure, I don't have to worry about being fired when I use goto statements.” — Donald Knuth :-) </p>
</blockquote>