What is a "Closure"?

 

Occasionally, one bumps into the term "closure", or "block closure", in computer science writings. Frequently, someone will be saying that a language feature under discussion isn't a "full closure" or a "proper closure". You may have encountered such opinions when researching "inners" (inner classes), "Runnables" or "Callables" in Java, or when looking at "functors" (function objects) in C++.

Meaning

A closure is a block of code that isn't executed there and then. However, a closure is more than just a fancy term for a function, routine or method. A closure can be passed to another block of code, for the receiving block to execute when it sees fit. That's why the term starts creeping closer when discussing function pointers in C and functors in C++: they come close to enabling closures. But, as well as being a code block amenable to being passed around and to deferred execution, a closure also carries the context in which it was defined, and thus can reference variables and other names from that context. So if you're a Java programmer, maybe you think of inners; but they're not full closures, because of those pesky final restrictions.
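The three properties just described (deferred execution, being handed to a receiver, and access to the defining context) can be sketched with a Java anonymous inner class. This is only an illustration: the names DeferredBlockDemo, runTwice and demo are hypothetical, and the captured locals have to be final, which is exactly the restriction mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

final class DeferredBlockDemo {

    // Hypothetical receiver: it decides when, and how often, to run the block it is handed.
    static void runTwice(Runnable block) {
        block.run();
        block.run();
    }

    static List<String> demo() {
        // Locals of the defining context; declared final so the inner class may use them.
        final List<String> log = new ArrayList<String>();
        final String greeting = "hello";

        // Defining the block does not execute it.
        Runnable block = new Runnable() {
            @Override
            public void run() {
                log.add(greeting); // refers to locals of the enclosing method
            }
        };

        log.add("defined");  // recorded before the block has ever run
        runTwice(block);     // the receiver executes the block, twice
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [defined, hello, hello]
    }
}
```

Note that "defined" is logged first: creating the block and running it are separate events, and the receiver, not the definer, chooses when the second one happens.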

Imagine you wanted to give a Runnable object to a Java thread (this is how a thread is told what it is to start doing in Java), and that you wanted to implement this Runnable object with an anonymous local inner class. All well and good; you've been able to do that since Java 1.1. Furthermore, imagine that the code block in which all this is going on has an important local variable that you would like the anonymous local class code to refer to. Your initial thought might be that it would be impossible for an object in object memory (the "heap") to reference a local variable that is (or was) on the run-time stack. Well, it can be done. But in order to get it working, Java insists that the locals involved (and any arguments of the enclosing method that the inner object uses) must be final (constant); then it can safely copy the value in question into the inner object.
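A minimal sketch of that scenario follows. The class and variable names (ThreadClosureDemo, message, result) are invented for illustration. The one-element array is a widely used workaround rather than anything the language prescribes: the reference is final and gets copied, but the slot it points to lives on the heap and can still be written.

```java
final class ThreadClosureDemo {

    static String demo() {
        // Must be final to be used inside the anonymous inner class
        // (from Java 8 it is enough that the local is never reassigned).
        final String message = "started by the worker";

        // Workaround for passing a result back out: a final reference to a heap array.
        final String[] result = new String[1];

        Thread worker = new Thread(new Runnable() {
            @Override
            public void run() {
                // The inner object holds copies of 'message' and of the 'result' reference.
                result[0] = message;
            }
        });
        worker.start();
        try {
            worker.join(); // wait for the worker; join() also makes its write visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for worker", e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints started by the worker
    }
}
```

Removing either final modifier makes a compiler of that era reject the inner class, which is the restriction in action.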

Origin

"Closure" is an obscure term, albeit one that began, long ago, with a precise definition. It is also a dusty old term, and a fairly meaningless choice of word as far as programming is concerned. Allen Wirfs-Brock, who was involved in the design of several Smalltalks and was close to the original leakage of the term into the programming community, mentions that he thinks it was a mistake to start the Smalltalk community using the term, thereby causing countless delvers into the deeper depths of object-oriented programming to encounter it suddenly and rightly say "Eh?".

It was probably only because I had done some Lisp and Smalltalk during the 1980s that the term rang distant bells when I first bumped into it.

The term "closure" originates with the last aspect of closures described under "Meaning" above (the aspect that is most difficult to simulate in languages like Java and C++): that the code block be associated with, and have access to, the scope in which it was created.

Long ago, in the 1970s, Smalltalk was sorting out how methods and activation records would be implemented. A method is probably something you are already familiar with: an object-oriented function that is messaged for rather than called, and that is run by an object instance using its own state as well as any argument values and local variables. An activation record is the record that is created just for the use of one method invocation. Each individual method invocation needs its own working area; otherwise one can't have methods that message for themselves (recursion).

Most block-structured languages push method activation records onto a run-time stack. It's very efficient. Scope, and nested scope handling, automatically tumble straight out of classic stack behaviour. Some Smalltalks, however, used heap-based activation records to make it easier to achieve proper closure. They were trying to get round the kinds of problems illustrated by the discussion of Java local objects in the last paragraph of the previous section.
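The difficulty with a purely stack-based scheme shows up as soon as a block outlives the method invocation that created it. The following sketch is an illustration in Java, not a description of how any Smalltalk implements it; the IntBlock interface and makeCounter method are hypothetical names. The captured state is placed in a heap-allocated array precisely because the method's own stack frame is gone by the time the block runs.

```java
import java.util.Arrays;

final class EscapingBlockDemo {

    // Hypothetical block type: deferred code that yields an int when run.
    interface IntBlock {
        int value();
    }

    static IntBlock makeCounter() {
        // The array lives on the heap; only the final reference is copied into the block.
        // Each call to makeCounter creates a fresh array, so counters are independent.
        final int[] count = {0};
        return new IntBlock() {
            @Override
            public int value() {
                count[0] = count[0] + 1;
                return count[0];
            }
        };
    }

    static int[] demo() {
        // Both invocations of makeCounter have returned before any block is run.
        IntBlock first = makeCounter();
        IntBlock second = makeCounter();
        first.value();
        first.value();
        // Third call on 'first', first call on 'second'.
        return new int[] { first.value(), second.value() };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo())); // prints [3, 1]
    }
}
```

Heap-based activation records give this behaviour for ordinary locals directly, without the programmer having to box the variable by hand.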

It was in Lisp, and possibly in the lambda calculus that influenced the design of many functional languages, where the term "closure" arose. (If you haven't heard of the lambda calculus then perhaps you've heard of the Turing machine. Back in the 1930s, Alan Turing and Alonzo Church (with S. C. Kleene) were both working on the nature of computability. Today we still refer to the fruits of their independent labours as the Church-Turing thesis. Turing used his eponymous machine and Church used the lambda calculus.)

Lisp says that when a "block" is associated with a particular environment the block is "closed" (think verb rather than adjective) with respect to that environment, and that any "open" (or "free") variables in the block are bound to their corresponding entries in the environment. Hence the term "closure". (You may not be familiar with "free variables". That's maybe because they are part of why you can't easily do full closures in languages like C++ and Java. Think of the variable identifiers you encounter in the code block of a Java or C++ method (or member function). These identifiers will bind either to argument values, local variables or instance variables (or data members). A free variable, however, would not be defined in the code block's scope, but in the environment in which the (code block) closure was defined.)
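To make "free variable" concrete, here is a small Java sketch; the names FreeVariableDemo, Adder and makeAdder are hypothetical. Inside apply, the identifier x is bound as an argument, whereas offset is free in that code block and gets its binding from the environment of the makeAdder invocation that created the object. Java only approximates closing over the environment, by copying the final value.

```java
final class FreeVariableDemo {

    // Hypothetical one-argument block type.
    interface Adder {
        int apply(int x);
    }

    static Adder makeAdder(final int offset) {
        return new Adder() {
            @Override
            public int apply(int x) {
                // x: bound, an argument of this code block.
                // offset: free in this code block, bound in makeAdder's environment.
                return x + offset;
            }
        };
    }

    public static void main(String[] args) {
        // The same code block, closed with respect to two different environments.
        Adder addTwo = makeAdder(2);
        Adder addTen = makeAdder(10);
        System.out.println(addTwo.apply(5)); // prints 7
        System.out.println(addTen.apply(5)); // prints 15
    }
}
```

The two adders share one piece of code but differ in the environment they were closed over, which is why they give different answers for the same argument.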

Reason

You might be interested in why Smalltalk was going to all this trouble. It's because just about everything in Smalltalk is an object. That's why the syntax of Smalltalk doesn't require hundreds of pages to describe; only half a page. Smalltalk is object-oriented with a vengeance. (It doesn't mean that Smalltalk is an order of magnitude easier to learn however—just a factor or two easier to learn. A novice Smalltalk programmer must know a significant proportion of the library, whereas a novice Java programmer could get by with knowing only a tiny bit about just one small library package.)

In particular, methods are implemented as objects in Smalltalk; and selection (if-then-else) and iteration (loops) are implemented not as syntax, but as methods of library objects. For example, an "if-then-else" is done by passing two block objects to the true or false object that results from some boolean message. These two block objects are sent to this boolean object with the message "if you're the 'true' object then execute this block object for me; on the other hand, if you're the 'false' object then execute this other block for me". This way of doing things takes a bit of getting used to, but it's elegant and consistent, and it works fine. You would expect the 'true' block and the 'false' block to be able to refer to variables from their context. Their context, in block-structured languages, is the enclosing scope(s). If they are block objects being passed around, however, you can see that they have to undergo closure with respect to their context.
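The message-passing style of "if-then-else" can be imitated in Java, as a rough sketch only. Bool, Block, ifTrueIfFalse and lessThan are hypothetical names chosen to echo the Smalltalk idea; they are not a real library, and the primitive comparison inside lessThan still leans on Java's own conditional operator.

```java
final class BooleanObjectDemo {

    // Hypothetical block type: deferred code that yields a String when run.
    interface Block {
        String value();
    }

    // Hypothetical stand-in for Smalltalk's boolean objects.
    interface Bool {
        String ifTrueIfFalse(Block trueBlock, Block falseBlock);
    }

    // The 'true' object runs only the first block it is given.
    static final Bool TRUE = new Bool() {
        @Override
        public String ifTrueIfFalse(Block trueBlock, Block falseBlock) {
            return trueBlock.value();
        }
    };

    // The 'false' object runs only the second block it is given.
    static final Bool FALSE = new Bool() {
        @Override
        public String ifTrueIfFalse(Block trueBlock, Block falseBlock) {
            return falseBlock.value();
        }
    };

    // Stands in for a boolean message such as "a < b"; the answer is an object.
    static Bool lessThan(int a, int b) {
        return a < b ? TRUE : FALSE;
    }

    static String compare(final int a, final int b) {
        // Both blocks refer to a and b from their defining context.
        return lessThan(a, b).ifTrueIfFalse(
            new Block() {
                @Override
                public String value() { return a + " is smaller than " + b; }
            },
            new Block() {
                @Override
                public String value() { return a + " is not smaller than " + b; }
            });
    }

    public static void main(String[] args) {
        System.out.println(compare(1, 2)); // prints 1 is smaller than 2
        System.out.println(compare(3, 2)); // prints 3 is not smaller than 2
    }
}
```

Neither block would be of any use if it could not see a and b, which is the point of the paragraph above: blocks that are passed around as objects must carry their context with them.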

Copyright © 2013 John Deacon. All rights reserved.