JavaScript Closures for C and Pascal Programmers

Martin ISDN

Jul 24, 2019

GPL3

Demystify the inner workings of JavaScript closures

Programming for the Masses, not the Classes

"In programming languages, a closure, also lexical closure or function closure, is a technique for implementing lexically scoped name binding in a language with first-class functions." - Wikipedia

"A closure is the combination of a function and the lexical environment within which that function was declared." - MDN web docs

If you can understand what a JavaScript closure is from the above definitions, don't waste your time on this article.

For the rest of us, a comforting quote taken from a Stack Overflow Q&A: "Closures are not hard to understand once the core concept is grokked. However, they are impossible to understand by reading any theoretical or academically oriented explanations!"

Throughout history, people have gained knowledge mostly empirically, so let's learn from an example.
I'll use Dheeraj Kumar Kesri's code example from his article here on Code Project: JavaScript Closures

This is a Closure

function getMultiplier(multiplyBy) {
        return function(num) {
                return multiplyBy * num;
        }
}

and this is its usage:

var multiplyByTwo = getMultiplier(2);
var multiplyByTen = getMultiplier(10);

var twoIntoFive = multiplyByTwo(5);
var tenIntoSix = multiplyByTen(6);

console.log(twoIntoFive);   // prints 10
console.log(tenIntoSix);    // prints 60

Never mind that this closure example does not do anything useful. You will have to get comfortable with closures, because you will see them a lot if you program in JavaScript.

Even if you are just beginning with JavaScript, and even though C lacks some of its features, your intuition tells you that:

when dealing with nested functions, the inner function has access to the enclosing function's parameters, local variables and functions;
during the execution of the getMultiplier function, the unnamed inner function is returned and assigned to the multiplyByTwo variable, which is later used to call it.

The real problem for us C programmers is: how can the inner function still access getMultiplier's parameter after the enclosing function has executed and returned? Don't all of getMultiplier's locals get destroyed?

The short answer is: the Garbage Collector allows that.

The long answer is of course the same, but let's go through it step by step.

Functions in JavaScript

"A function is a parametric block of code defined one time and called any number of times later."

There are a few ways in which you can declare a function in JS, but for this article, we'll mention only two.

function double(num) {
    return 2*num;
}

And let's say...

var triple = function(num) {
    return 3*num;
};

This second way is called a function expression. The unnamed function on the right hand side of the assignment is called an anonymous function. Its reference is assigned to the triple variable.

From now on, the symbolic names double and triple both represent functions: each holds a reference to a function.

Even though we in C are accustomed to the first way of declaring a function, the book "Getting MEAN with Mongo, Express, Angular and Node", in its free chapter "appendix D Reintroducing JavaScript", says of the second declaration: "this is how JavaScript sees it anyway".

function double(num) {}
// JavaScript interprets this as
var double = function(num) {};

This implies that even if you declare the function the first way, you can reassign something else to the double identifier at will.

function foo(num) {}
foo = 5;
// this works

If you try something like this in C, you will be greeted with an error, if for no other reason than that you would be changing the type of the identifier foo on the fly. Even redeclaring foo as a float or a char is not allowed, provided that you redeclare it at global scope; inside main you would merely be shadowing it.
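One caveat to the equivalence quoted from the book, offered as a sketch rather than as a description of any particular engine: the two forms differ in when the function becomes usable. A function declaration is hoisted together with its body, while with a function expression only the variable is hoisted, and it holds undefined until the assignment runs. The names early and tripleBefore are made up for this illustration.

```javascript
// Declaration: the function is usable before the line that declares it.
var early = double(4);
function double(num) {
    return 2 * num;
}

// Expression: the variable already exists, but holds undefined until assigned.
var tripleBefore = typeof triple;
var triple = function(num) {
    return 3 * num;
};

console.log(early);           // prints 8
console.log(tripleBefore);    // prints undefined
console.log(typeof triple);   // prints function
```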

Programmers use the word reference a lot. For simplicity, I would like to strip that term of any meaning other than the address. No matter what baggage the word reference carries around in all the different programming languages of the known universe, the crucial information a reference holds is the location of an object in the computer's memory, i.e., its address.

In the above example, the variable triple is assigned the reference to the anonymous function. You can pass that reference to other identifiers and copy it around; the most crucial information it holds, the value of the address, will always be the same. It will always tell the memory location of that anonymous function in its executable form, after it has been translated by the JavaScript interpreter.
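A small sketch of what copying a reference means in practice. The name alsoTriple is made up for this illustration; strict equality on functions compares references, so it tells whether two names point at the same function.

```javascript
var triple = function(num) {
    return 3 * num;
};

// Copy the reference; no new function is created.
var alsoTriple = triple;

console.log(alsoTriple === triple);   // prints true
console.log(alsoTriple(4));           // prints 12

// Reassigning one name does not touch the other copy of the reference.
triple = null;
console.log(alsoTriple(5));           // prints 15
```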

So what goes on in that function expression example with triple also takes place in the closure example at the beginning. The function getMultiplier, as a result of its execution, returns another function, as they say in higher level languages. For us from the C world, it returns a reference to the inner function's executable representation as produced by the JavaScript interpreter. That reference is immediately assigned to the variable multiplyByTwo.

Remember the Garbage Collector? If it weren't for the multiplyByTwo identifier on the left side of the assignment holding the returned reference to the anonymous function, the GC would happily do its job and get rid of the function instance. It would free the memory occupied by it, since there would be no way for you or anybody else to ever refer to it and invoke it.

The Nested Functions

Now, for Pascal programmers, nested functions are no big deal. We C programmers also have to get acquainted with them, if not for what they are useful for (and believe me, they are), then at least a bit for how they work.

Before a function is called, its stack frame is set up: parameters, local variables, return value, everything that goes with it. But languages like Pascal and D, which support nested functions, push one more piece of information onto the inner function's stack frame: a pointer to the stack frame of the enclosing function.

When the inner function uses one of its own local variables, fine. But when it uses one of its enclosing function's local variables, it has to go through a process of delegation to the outer function's stack frame.

You will find the concept of delegation everywhere as you program in JavaScript, the most notable being the delegation through the prototype chain.

Previously, I said that the JavaScript interpreter, when it encounters a function, translates it into an executable form, i.e., machine code. That might not in fact be so, but it served the explanation of a reference well. I really don't know what the immediate result of source code translation by JavaScript is. Maybe it breaks the source into reusable tokens that are themselves turned into, or already exist as, machine code. Maybe it translates "source code into some efficient intermediate representation and immediately executes it".

You will have to understand that I'm using a model to describe what is going on, and that model may not match reality exactly, but I hope it will serve well in understanding the subject of closures, and JS programming in general, better.

That being said, there is something else going on with that inner function, something very similar to the way constructor functions work in JS.

function Person(name) {
    this.name = name;
    this.getThyName = function() {
        return this.name;
    };
}

var peter = new Person("Peter");
var shmeter = new Person("Shmeter");

You would intuitively think that peter and shmeter share the same function, but that is not the case. They share the exact same functionality, but the Person constructor just automates the task of writing the getThyName method for you from your blueprint. You will not only get two objects with name members, you will also get two getThyName methods. It is as if you had written each object by hand:

var john = {
    name: "John",
    getThyName: function() {
        return this.name;
    }
};

var shmon = {
    name: "Shmon",
    getThyName: function() {
        return this.name;
    }
};

Now you intuitively know that these two methods reside in different memory locations and have different reference values, i.e., addresses.
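The same holds for the objects built by the constructor, and it can be checked directly, since strict equality on functions compares references. A minimal sketch reusing the Person constructor from above:

```javascript
function Person(name) {
    this.name = name;
    // A new function object is created on every call to Person.
    this.getThyName = function() {
        return this.name;
    };
}

var peter = new Person("Peter");
var shmeter = new Person("Shmeter");

console.log(peter.getThyName === shmeter.getThyName);   // prints false
console.log(peter.getThyName());                        // prints Peter
console.log(shmeter.getThyName());                      // prints Shmeter
```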

To have only one method reused by peter and shmeter in the former example, you will have to assign the function getThyName to the prototype of the Person constructor. This way peter and shmeter will call getThyName by the process of delegation through the prototype chain.
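A sketch of that prototype variant, assuming nothing beyond the Person constructor already shown. Here the method exists once, and neither object owns a copy of it:

```javascript
function Person(name) {
    this.name = name;
}

// One shared method, reached by delegation through the prototype chain.
Person.prototype.getThyName = function() {
    return this.name;
};

var peter = new Person("Peter");
var shmeter = new Person("Shmeter");

console.log(peter.getThyName === shmeter.getThyName);    // prints true
console.log(peter.hasOwnProperty("getThyName"));         // prints false
console.log(peter.getThyName(), shmeter.getThyName());   // prints Peter Shmeter
```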

In Pascal, you would expect our nested function example to compile once you have given the inner function a name, and be done with it. You would have one instance of getMultiplier and one of the inner function, which you would name, say, multiply. After this, the only thing that changes would be the respective stack frame created and destroyed on each call to getMultiplier or multiply.

In JavaScript, the interpreter will find the body of getMultiplier in the global scope, reserve the memory needed for its instantiation, translate it "into some efficient intermediate representation", assign its reference to the identifier getMultiplier, and be free to pass that reference around as many times as you need a copy of it. The intermediate representation is created only once, in only one location.

On the other hand, when you call, say, getMultiplier(2), JS will get to the point of the inner function and its job will be to make an instance of it. This is where things get similar to our constructor example: for every call to getMultiplier, the interpreter constructs a new instance of the inner anonymous function. When the instance is created, the JS interpreter has its reference and will pass it back as the return value of getMultiplier. For every call to the enclosing function, you get a different reference to the enclosed function.

If there is nothing to hold that reference value upon return, good job. The Garbage Collector can dispose of the memory occupied by that instance of the anonymous function.

Luckily, we have the variable multiplyByTwo waiting on the left side of the assignment, an lvalue to the rescue.
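That every call to getMultiplier yields a distinct function instance can be observed by comparing the returned references. A minimal sketch; the names firstTwo and secondTwo are made up for this illustration:

```javascript
function getMultiplier(multiplyBy) {
    return function(num) {
        return multiplyBy * num;
    };
}

var firstTwo = getMultiplier(2);
var secondTwo = getMultiplier(2);

// Same source text, same argument, yet two distinct function instances.
console.log(firstTwo === secondTwo);      // prints false
console.log(firstTwo(5), secondTwo(5));   // prints 10 10
```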

The Scope Chain

"Scope is the context environment (also known as lexical environment) created when a function is written. This context defines what other data it has access to."

Let's try to find a JavaScript analogy for what Pascal is doing when it inserts a pointer to the outer function's stack frame into the inner function's stack frame. How is the delegation through the scope chain achieved? What is the scope chain?

For every JavaScript program you run, an execution context stack is created. This stack is built out of different execution contexts. Its first element is the global execution context. Simply, that would be everything that is outside a function body. Whenever a function is called, a new execution context is created.

Let's say we just ran a script, so on this stack, we have only the global execution context and as the script goes, the first of our function calls gets executed. The global execution context will be paused and then a new execution context for our function will be created and it will be pushed on top of the execution context stack over the global execution context.

If our function calls another function, the execution context of our function is paused and the newly created execution context for the called function is pushed on top of it. When the called function exits, its execution context is popped from the stack and our function's execution context becomes active again until our function exits. When it exits, its execution context gets popped and the global execution context becomes active again... Writing "execution context" this many times is so boring that I ought to find a way to factor it out of the last two paragraphs.

Every function call creates a new execution context. getMultiplier(2) and getMultiplier(10) will execute in two different execution contexts.
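Because each call gets its own execution context, closures created by separate calls do not share their locals. A sketch with a hypothetical makeCounter function, which is not part of the original example:

```javascript
function makeCounter() {
    var count = 0;               // belongs to this particular call
    return function() {
        count = count + 1;
        return count;
    };
}

var counterA = makeCounter();    // first execution context
var counterB = makeCounter();    // second, independent one

console.log(counterA());   // prints 1
console.log(counterA());   // prints 2
console.log(counterB());   // prints 1
```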

Every execution context has two phases: the creation and the execution phase.

In the creation phase, the activation object is created. The local variables and function parameters, basically all new declarations of this scope, are attached to this object. Then the scope chain is created. Last, the context is set. The context is directly related to the value of this, and it is not the same thing as the execution context, of which it is a part.

For some reason, the activation object of the global execution context is called the variable object.

The scope chain of an execution context is a list of activation objects that, when evaluated, always starts with the current innermost activation object and ends with the variable object. The innermost activation object is the one belonging to the execution context of the function that is currently executing.

So the delegation along the scope chain of an executing function always goes from its local scope to the scope of the enclosing function (if any) where it was defined, and so on, and ends in the global scope, as you would expect.
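The lookup order can be sketched with three levels of scope. All names here (fromGlobal, fromOuter, label, outer, inner) are made up for this illustration:

```javascript
var fromGlobal = "global";        // found last, in the global scope

function outer() {
    var fromOuter = "outer";      // found in outer's scope
    var label = "outer";

    function inner() {
        var label = "inner";      // the innermost declaration wins
        return [label, fromOuter, fromGlobal].join(" ");
    }

    return inner();
}

console.log(outer());   // prints inner outer global
```

The name label is declared at two levels, and the lookup stops at the first scope that has it.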

You may have noticed that when dealing with nested functions in their execution contexts, the invoked inner function has a reference to the activation object of its enclosing function via the scope chain. Hence the Garbage Collector cannot dispose of the enclosing function's activation object to free up some memory even after the outer function has finished its job and returned.
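What survives is the enclosing function's environment itself, not a snapshot of its values. A sketch with a hypothetical makeBox function, not taken from the original article: two inner functions created by the same call see the same variable, long after makeBox has returned.

```javascript
function makeBox(initial) {
    var value = initial;
    return {
        get: function() { return value; },
        set: function(next) { value = next; }
    };
}

var box = makeBox(1);      // makeBox has returned, yet value lives on
console.log(box.get());    // prints 1
box.set(42);
console.log(box.get());    // prints 42
```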

This was the long answer and I hope you enjoyed it. :)

One implication of this:

function getMultiplier(multiplyBy) {
    var t = "something";
    return function(num) {
        return multiplyBy * num;
    }
}
var multiplyByTwo = getMultiplier(2);

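In this model, the values of the variable t and the parameter multiplyBy are both preserved, since they are part of the same activation object; the inner function does not have to refer to t explicitly. Real engines may optimize away variables that no inner function can reach, so treat this as a property of the model rather than a guarantee about memory use.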

Any suggestions to improve the validity of the answer given here will be greatly appreciated and credited.

History

25th July, 2019: Initial version