Asynchronous file I/O for the Mozilla Platform

October 3, 2012 § 18 Comments

The Mozilla platform has recently been extended with a new JavaScript library for asynchronous, efficient, file I/O. With this library, developers of Firefox, Firefox OS and add-ons can easily write code that behave nicely with respect to the process and the operating system. Please use it, report bugs and contribute.

Off-main thread file I/O

Almost one year ago, Mozilla started Project Snappy. The objective of Project Snappy is to improve, wherever possible, the responsiveness of Firefox, the Mozilla Platform, and now, Firefox OS, based on performance data collected from volunteer users. Thanks to this real-world performance data, we have been able to identify a number of bottlenecks at all levels of Firefox. As it turns out, one of the main bottlenecks is main thread file I/O, i.e. reading from a file or writing to a file from the thread that also runs most of the code of Firefox and its add-ons.

« Read the rest of this entry »

OS.File: Iterating through a directory

July 17, 2012 § 2 Comments

Since its first landing, OS.File has been steadily gaining new features. Today, let me show you OS.File.DirectoryIterator. As its name implies, this class serves to iterate through the contents of a directory.

How to

Iterating through a directory is quite simple:

let iterator = new OS.File.DirectoryIterator("/tmp");
try {
  for (let entry in iterator) {
    // Do something with the entry.
  }
} finally {
  iterator.close(); // Release system resources as soon as possible
}

As usual with OS.File, calling iterator.close() is not strictly necessary, but is a good habit to take, as it releases critical resources immediately.

Of course, should you need all the entries for future consumption, you can place them all in an array as follows:

let array = [entry for (entry in iterator)]; // Array comprehensions in JS, at last!

Each entry contains the available information about one file:

for (let entry in iterator) {
  // Checking the type of the entry
  if (entry.isDir) {
    console.log(entry.name, "is a directory");
  } else if (entry.isLink) {
    console.log(entry.name, "is a link”);
  } else {
    console.log(entry.name, "is a regular file");
  }

  // Getting the full path to the entry
  console.log("Full path", entry.path);
}

One of the main design goals of OS.File is to be I/O efficient. Here, this means that during the iteration, the library will perform exactly one I/O call per entry, with one additional call for opening the iterator and one for closing. With respect to I/O, getting the type, name and path of the file is free.

Under Windows, a few additional informations are available, also for free. As usual with OS.File, OS-specific features are prefixed: winCreationTime, winLastWriteTime and winLastAccessTime.

For instance, to list the creation times of entries under Windows:

for (let entry in iterator) {
  if ("winCreationTime" in entry) {
    console.log("The file was created at", entry.winCreationTime);
  }
}

Or, to sort a list of entries by creation time:

let entries = [entry for (entry in iterator)];
if (entries.length > 0 && "winCreationTime" in OS.File.DirectoryIterator.Entry.prototype) {
  entries = entries.sort(function(a, b) {
    return a.winCreationTime - b.winCreationTime;
  })
}

If you wonder why we introduced fields winCreationTime et al. and not a cross-platform field creationTime, recall that, for the sake of I/O efficiency, each entry only contains the information returned by one single I/O call. As the Windows call returns more information than the Unix version, an entry under Windows offers more information than under Unix.

Finally, the Windows back-end offers an additional feature: iterating through only the subset of the entries of the directory matching some regular expression. As usual, since the feature is Windows only, it is prefixed by win.

let iterator = new OS.File.DirectoryIterator("C:\\System\\TEMP",
    /*platform-specific options*/ { winPattern: "*.tmp" } );
// ... do something with that iterator

FAQ

What’s this I/O efficiency?

The two main goals with OS.File are:

  • provide off-main-thread I/O;
  • be I/O efficient.

I/O efficiency is all about minimizing the number of actual I/O calls. This is critical because some platforms have extremely slow storage (e.g. smartphones, tablets) and because, regardless of the platforms, doing too much I/O penalizes not just your application but potentially all the applications running on the system, which is quite bad for the user experience. Finally, I/O is often expensive in terms of energy, so wasting I/O is wasting battery.

Consequently, one of the key design choices of OS.File is to provide operations that are low-level enough that they do not hide any I/O from the developer (which could cause the developer to perform more I/O than they think) and, since not all platforms have the same features, offer system-specific information that the developer can use to optimize his algorithms for a platform.

How does OS.File compare to Node.js I/O in terms of I/O-efficiency?

OS.File is designed for efficient off-main-thread I/O. For the moment, OS.File does not provide an asynchronous API that can be used from the main thread, although we are working on fixing this.

By contrast, Node.js low-level I/O is designed to mirror a subset of an old version of Posix, and provides both a synchronous and an asynchronous API on top of these calls.

The choice made by Node.js works well on the platforms for which Node.js is generally targeted (e.g. Unix-based servers) but we need better to cope with the platforms for which Firefox and Firefox OS are targeted (e.g. not only Unix but also Windows machines, as well as battery-powered devices with slow storage, etc.).

How does directory iteration compare to Node.js directory iteration?

Node.js provides a primitive readdir to iterate through a directory. This primitive returns an array of file names. The implementation of this primitive already costs about n I/O calls, where n is the number of files in the directory.

Consequently, the algorithm to determine which entries of a directory are subdirectories costs

  • about n I/O calls to establish the list of entries ; then
  • about n I/O calls to determine which are subdirectories.

This makes walking a directory recursively (to empty it or to copy it to another drive, for instance) twice more expensive than necessary. Note that this measure is very much non-scientific, as the I/O call to determine if an entry is a subdirectory can be much more expensive than the call to list the entry, depending on the OS.

By comparison, the OS.File directory iterator requires about n I/O calls for this purpose.

Similarly, under all platforms, finding the file accessed least recently has a cost of about 2·n I/O calls under Node.js.

With OS.File, the cost is similar to Node.js under Unix, but only n under Windows. Upcoming work with OS.File should also onsiderably reduce the I/O cost under Linux, Android and Firefox OS.

Finally, for an algorithm that can break from iteration once some condition is met (e.g. looking if at least one file matches some condition), Node.js will still require n I/O calls, while OS.File generally requires only as many I/O calls.

How does this compare to XPCOM nsILocalFile::directoryEntries?

Until now, the only manner of listing entries in a directory on the Mozilla Platform was nsILocalFile::directoryEntries.

Generally, OS.File directory is more convenient to use from JavaScript, can be called off-main thread, and provides more information than nsILocalFile::directoryEntries for the same I/O cost, which makes it more I/O efficient for e.g. iterating a directory looking for a file matching some condition, or walking recursively through a hierarchy.

The counterparts are that OS.File directory iteration cannot be used from the main thread yet and cannot be called from C++.

Unbreaking Scalable Web Development, One Loc at a Time

May 23, 2011 § 18 Comments

The Opa platform was created to address the problem of developing secure, scalable web applications. Opa is a commercially supported open-source programming language designed for web, concurrency, distribution, scalability and security. We have entered closed beta and the code will be released soon on http://opalang.org, as an Owasp project .

If you are a true coder, sometimes, you meet a problem so irritating, or a solution so clumsy, that challenging it is a matter of engineering pride. I assume that many of the greatest technologies we have today were born from such challenges, from OpenGL to the web itself. The pain of pure LAMP-based web development begat Ruby on Rails, Django or Node.js, as well as the current NoSQL generation. Similarly, the pains of scalable large system development with raw tools begat Erlang, Map/Reduce or Project Voldemort.

Opa was born from the pains of developing scalable, secure web applications. Because, for all the merits of existing solutions, we just knew that we could do much, much better.

Unsurprisingly, getting there was quite a challenge. Between the initial idea and an actual platform lay blood, sweat and code, many experiments and failed prototypes, but finally, we got there. After years of development and real-scale testing, we are now getting ready to release the result.

The parents are proud to finally introduce Opa. « Read the rest of this entry »

Where Am I?

You are currently browsing entries tagged with node.js at Il y a du thé renversé au bord de la table.

Follow

Get every new post delivered to your Inbox.

Join 33 other followers