C data finalization – in JavaScript

May 2, 2012

A few iterations ago, the Mozilla Platform introduced js-ctypes, a very nice Foreign Function Interface (FFI) for JavaScript. Like its inspiration, Python’s ctypes, js-ctypes lets (privileged) JavaScript code open native libraries, import their functions and call these functions almost as if they were regular JavaScript functions.

Here is an example using the Unix libc:

// Open the C library
let libcCandidates = [
  'libSystem.dylib',// MacOS X
  'libc.so.6',      // Linux
  'libc.so'         // Android, B2G
];
let libc;
for (let candidate of libcCandidates) {
  try {
    // ctypes.open throws if the library cannot be loaded
    libc = ctypes.open(candidate);
    break;
  } catch (ex) {
    // Not available on this platform, try the next candidate
  }
}
if (!libc) {
  throw new Error("Could not open the C library");
}

// Import some functions from libc
let open = libc.declare("open", ctypes.default_abi,
  /*return int*/ ctypes.int,
  /*const char* path*/ctypes.char.ptr,
  /*int oflag*/ ctypes.int,
  /*int mode*/ ctypes.int);
let read = libc.declare("read", ctypes.default_abi,
  /*return ssize_t*/ ctypes.ssize_t,
  /*int fildes*/ ctypes.int,
  /*void *buf*/ ctypes.voidptr_t,
  /*size_t nbytes*/ ctypes.size_t);
let close = libc.declare("close", ctypes.default_abi,
  /*return int*/ ctypes.int,
  /*int fd*/ ctypes.int);

// Now use them
let myfile = open("/etc/passwd", 0, 0);
if (myfile == -1)
  throw new Error("Could not open file");
// ...

If you are familiar with XPConnect, the mechanism generally used in the Mozilla Platform for letting JavaScript and C++ interact, you can see that using js-ctypes to call native code directly is much nicer than adding a C++ XPCOM/XPConnect layer. From what I hear, it also seems to be much faster, as XPConnect needs to perform expensive magic to ensure that memory is properly passed between JavaScript and C++. In addition, this selfsame memory magic now prevents XPConnect from being executed from threads other than the main thread, which makes js-ctypes the only way to perform any system access from worker threads.

Now, js-ctypes nicely solves the issue of calling native code from JavaScript. However, JavaScript and C are very different languages, with very different paradigms, so getting them to coexist requires a little more than simply the ability to place calls or convert values. In particular, C has:

  • manual resource management (memory must be released, file descriptors must be closed, locks must be released, etc.);
  • no language-level mechanism for error management (a task smaller than a process cannot be killed because of an error).

By contrast, JavaScript has:

  • automated memory management, but no support for automatically managing resources other than memory (no user-level finalization or scoped resources mechanism);
  • several language/vm-level mechanisms that can kill a task in non-trivial manners (exceptions, “this script is busy”, etc.).

Unfortunately, putting all of this together makes it quite difficult to write JavaScript code that manipulates C resources without leaking. Such leaks can cause both performance issues (memory leaks, in particular, tend to slow down the whole system) and hard-to-track errors (leaking file descriptors can prevent the application from opening any new file, or, under Windows, can prevent the application from reopening some files that were improperly closed, while leaking locks can completely freeze an application).

Introducing C data finalization

For this reason, we have recently added a new feature to js-ctypes, designed to bring automated resource management to JavaScript: C data finalization.

Specifying a finalizer is simple:

function openfile(path, flags, mode) {
  let fd = open(path, flags, mode);
  if (fd == -1) {
    throw new Error("Could not open file " + path);
  }
  return ctypes.CDataFinalizer(fd, close);
}

What this code does is ensure that, whenever the file descriptor is garbage-collected, function close is called, releasing the C resources represented by that file descriptor. This value is C data with a finalizer, aka CDataFinalizer.

You can use it just as you would use the C data through js-ctypes:

let myfile = openfile("/etc/passwd", 0, 0);
let result = read(myfile, myarray, 4096); // Read some data
// Wherever required, |myfile| is automatically converted to
// the underlying integer value.
// Once |myfile| has no reference, it will (eventually) be
// closed.

It is, of course, possible (and strongly recommended) to close the file manually to ensure that resources are immediately available for the process and the rest of the system:

let myfile = openfile("/etc/passwd", 0, 0);
// ...
// ... do whatever you wish to do with that file
let result = myfile.dispose(); // This calls |close|.

// From this point, |myfile| cannot be converted to the underlying
// integer value anymore. Any attempt to do so will raise an
// exception.
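
Since an exception can interrupt a script between the call to openfile and the call to dispose, one way to make the manual release more robust is a try/finally wrapper. The following is only a minimal sketch, not part of js-ctypes: the helper name withFinalized is hypothetical, and the only thing it assumes about its argument is a dispose() method, as described above for values built with ctypes.CDataFinalizer.

```javascript
// Hypothetical helper, not part of js-ctypes.
// Runs |body| with |resource|, then calls |resource.dispose()| whether
// |body| returned normally or threw an exception.
function withFinalized(resource, body) {
  try {
    return body(resource);
  } finally {
    resource.dispose();
  }
}

// Intended use with the |openfile| function defined earlier:
//   let result = withFinalized(openfile("/etc/passwd", 0, 0),
//     function(myfile) {
//       return read(myfile, myarray, 4096);
//     });
```

If the script is interrupted in a way that never reaches dispose(), the finalizer attached by openfile remains as the fallback.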

Or, an equivalent but more verbose solution, using forget:

let myfile = openfile("/etc/passwd", 0, 0);
// ...
// ... do whatever you wish to do with that file
let fd = myfile.forget();
// From this point, |myfile| cannot be converted to the underlying
// integer value anymore. Any attempt to do so will raise an
// exception.
let result = close(fd);

This mechanism is, of course, not restricted to file descriptors. It has been used successfully with other data structures, including malloc-allocated strings.
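
As an illustration of the malloc-allocated string case, here is a rough sketch that pairs strdup with free. It is not taken from any existing code: the helper name makeOwnedCString is hypothetical, ctypes and libc are passed as parameters (libc being a library opened as in the first example), and the sketch assumes that the char* returned by strdup is accepted where free expects a void*, as it would be in an ordinary call.

```javascript
// Hypothetical helper, not part of js-ctypes.
// Duplicates |text| with strdup() and attaches free() as the finalizer.
// |ctypes| is the js-ctypes module, |libc| an already opened C library.
function makeOwnedCString(ctypes, libc, text) {
  let strdup = libc.declare("strdup", ctypes.default_abi,
    /*return char* */ ctypes.char.ptr,
    /*const char *s*/ ctypes.char.ptr);
  let free = libc.declare("free", ctypes.default_abi,
    /*return void*/ ctypes.void_t,
    /*void *ptr*/ ctypes.voidptr_t);
  let copy = strdup(text);
  if (copy.isNull()) {
    throw new Error("Could not duplicate string");
  }
  // Assumption: a char* value can be finalized by a function taking void*.
  return ctypes.CDataFinalizer(copy, free);
}
```

In real code, the two declarations would more likely be made once, next to the other imports, rather than on every call.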

Details and caveats

JavaScript does not feature finalization and might never do so. There are good reasons for this: finalization considerably complicates the garbage-collector and introduces the possibility of subtle bugs and leaks that the various JS implementors do not want to inflict on their users (if you are curious, two of the main problems are resurrection of dead references and finalization of cyclic data structures).

Consequently, C data finalizers are not full-featured finalizers. Indeed, the main limitation of ctypes.CDataFinalizer is that its first argument must be a C value and its second argument must be a pointer to a C function. For the above-mentioned reasons, letting users specify any JavaScript function as a finalizer would open a can of worms that nobody really wants to see crawling around.

Also, before using a finalizer, you should be aware that JavaScript garbage-collection is not necessarily deterministic – during the testing phase of CDataFinalizer, we have encountered memory errors caused by developers (ok, I will confess, that was me, sorry guys) making invalid assumptions about just when values would be garbage-collected. Let me emphasize this: any hypothesis you make about when a value is finalized is bound to be regularly false. In other words, C data finalizers should be used as a last line of defense, not as the default mechanism for recovering resources.

Still, C data finalizers are a powerful mechanism that makes manipulation of C values with JavaScript much more reliable. Indeed, this is one of the core mechanisms used pervasively by the OS.File library.

Edit: As per Steve Fink’s suggestion, I have emphasized that users should not rely on the behavior of garbage-collection/finalization, and clarified the can of worms.
