Asynchronous file I/O for the Mozilla Platform

October 3, 2012 § 18 Comments

The Mozilla platform has recently been extended with a new JavaScript library for asynchronous, efficient file I/O. With this library, developers of Firefox, Firefox OS and add-ons can easily write code that behaves nicely with respect to the process and the operating system. Please use it, report bugs and contribute.

Off-main thread file I/O

Almost one year ago, Mozilla started Project Snappy. The objective of Project Snappy is to improve, wherever possible, the responsiveness of Firefox, the Mozilla Platform, and now, Firefox OS, based on performance data collected from volunteer users. Thanks to this real-world performance data, we have been able to identify a number of bottlenecks at all levels of Firefox. As it turns out, one of the main bottlenecks is main thread file I/O, i.e. reading from a file or writing to a file from the thread that also runs most of the code of Firefox and its add-ons.

Let us look at the behavior of a typical main thread file I/O:

  1. The Firefox process initiates some I/O operation (say, reading or writing a few bytes to the disk). Until the operation is complete, the main thread is frozen, which means that the user interface is not updated, that events are not handled, that web pages are not displayed and basically that (almost) nothing happens.
  2. The I/O operation is sent to the operating system.
  3. The operating system waits until the device is available.
  4. The operating system actually performs the I/O operation.
  5. The operating system returns control to the Firefox process, which resumes its normal behavior.

Now, operations are typically pretty quick. However, there is a big catch, especially for those among us who like reasoning in terms of algorithmic complexity: item 2 is actually very much non-deterministic. It depends on numerous factors beyond our reach, such as how busy the operating system is at the moment, how busy the drive is at the moment, how busy the other drives are at the moment, how long it has been since the drive was last accessed, whether the device is running on battery power, how much memory is currently available to the operating system, etc.

The end result is that, sometimes, for no apparent reason, a trivial operation such as flushing a buffer, renaming a file or even closing a file can take 10 seconds. During these 10 seconds, the Firefox process is frozen.

For this reason, an important part of Project Snappy is to get rid of main thread file I/O in Firefox and the Mozilla Platform, and replace it with off-main thread file I/O. Let us look at the behavior of typical off-main thread file I/O:

  1. The Firefox process requests an I/O operation from an I/O thread. During this operation, the main thread remains active, the user interface is updated, events are handled, web pages are displayed and basically, everything happens except code that needs the result of the I/O operation to proceed.
  2. The I/O thread initiates the I/O operation.
  3. The I/O operation is sent to the operating system.
  4. The operating system waits until the device is available.
  5. The operating system actually performs the I/O operation.
  6. The operating system returns control to the Firefox process.
  7. The I/O thread triggers the execution of the code that was waiting for the result of the I/O operation.

As we can see, the critical difference between main thread I/O and off-main thread I/O is that the user can keep using the application even while the operating system is busy carrying out the I/O operation. In most languages, the result is typically slightly slower and a little more difficult to write than main-thread operation (if you are curious, you can check out languages such as Rust or Opa, in which this is not the case), but for interactive applications, the benefit far outweighs the cost.
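
The ordering described in the second list can be sketched in plain JavaScript. This is only an illustration under loud assumptions: `requestRead` is a hypothetical helper, a timer stands in for the I/O thread and the operating system, and no real file is touched.

```javascript
// Records what happens, in the order in which it happens.
const events = [];

// Hypothetical stand-in for an off-main thread read: the timer plays the
// role of the I/O thread, so the caller is never blocked while "the OS" works.
function requestRead(path, onComplete) {
  events.push("request sent");
  setTimeout(function deliver() {
    // In the real scenario, steps 2 to 6 happen away from the main thread.
    events.push("result delivered");
    onComplete("fake contents of " + path);
  }, 10);
}

requestRead("profile.ini", function onComplete(contents) {
  // Step 7: the code that was waiting for the result finally runs.
  console.log(events.join(" -> "));
});

// This line runs immediately, before the result is delivered.
events.push("main thread keeps working");
```

Only the callback has to wait for the result; everything written after the call to `requestRead` runs right away.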

OS.File, chapter 2

The OS.File library has been developed for the specific purpose of letting developers on the Mozilla Platform perform efficient off-main thread file I/O. I have introduced several components of OS.File in past blog entries. Today, let me introduce the asynchronous API for OS.File, as a set of examples:

Copying or renaming a file

Let us copy file “profile.ini” (from the profile directory) to “profile.ini copy” (in the temporary directory). For this purpose, we will use function OS.File.copy.

// Import OS.File
Components.utils.import("resource://gre/modules/osfile.jsm");

// Compute the path to some well-known file
let source = OS.Path.join(OS.Constants.Path.profileDir, "profile.ini");
let dest = OS.Path.join(OS.Constants.Path.tmpDir, "profile.ini copy");

let promise = OS.File.copy(source, dest);
console.log("The copy has started. This message will generally be displayed before it is complete, though");

promise = promise.then(function onSuccess() {
   console.log("I have successfully copied file", source, "to", dest);
});

As you can see, I/O operations return promises, i.e. objects that can be used to trigger some behavior upon completion of the operation. For more details about promises, you may read the documentation of the library – note that this implementation of promises is specific to Firefox but promises are expected to become standard as part of a future version of JavaScript.
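
For readers new to promises, here is a minimal sketch of how chaining works. It uses the `Promise` object that was later standardized in ECMAScript 2015, not the Firefox-specific library mentioned above, and `fakeCopy` is a hypothetical function that merely waits on a timer instead of copying anything.

```javascript
// Hypothetical operation: resolves after a short delay, as a copy might.
function fakeCopy(source, dest) {
  return new Promise(function (resolve) {
    setTimeout(function () {
      resolve({ source: source, dest: dest });
    }, 10);
  });
}

const log = [];
const chained = fakeCopy("a.txt", "b.txt")
  .then(function onSuccess(result) {
    log.push("copied " + result.source + " to " + result.dest);
    return result.dest; // becomes the value of the next promise in the chain
  })
  .then(function onSuccess(dest) {
    log.push("next step received " + dest);
    return log;
  });

// Runs before either handler above, since the "copy" is still in flight.
log.push("copy started");

chained.then(function (entries) {
  console.log(entries.join("\n"));
});
```

Each call to `then` returns a new promise, which is why the examples in this post keep reassigning `promise`.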

Reading/writing the full contents of a file

In the previous example, we used built-in function OS.File.copy. This time, to demonstrate reading and writing, we will read the full contents of the file and write it back:

// Import OS.File
Components.utils.import("resource://gre/modules/osfile.jsm");

// Compute the path to some well-known file
let source = OS.Path.join(OS.Constants.Path.profileDir, "profile.ini");

// Read the contents of this file
let promise = OS.File.read(source);
console.log("Currently reading file", source);
// During the read operation, the process and JavaScript continue executing

// Once the operation is complete, we can display the results
let contents;
promise = promise.then(function onSuccess(result) {
   // result is a Uint8Array
   contents = result;
   console.log("I have just read", contents.byteLength, "bytes");
});

// Now, let us write this to another file
let dest = OS.Path.join(OS.Constants.Path.tmpDir, "profile.ini copy 2");
promise = promise.then(function onSuccess() {
    return OS.File.writeAtomic(dest, contents, {tmpPath: dest + ".tmp"});
});

// Of course, we can add further instructions whose execution will continue during
// the read and write operations

console.log("I might be doing some I/O, but I can continue working");

Recall that the read and write operations take place off the main thread, so (unless the file is too large to fit in memory) the complete read and the complete write will actually be quite fast. Also note that the communication of data between threads is quite cheap (buffers are never copied, for one thing).

In this example, we have used function writeAtomic. This function ensures that file “profile.ini copy 2” is not modified unless we are certain that it has been fully written to disk. Using this function considerably reduces the risk of corruption, should the process somehow be stopped during the operation – we have seen this happening because of batteries running out, of crashes, of anti-viruses behaving badly, etc.
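
One common way to obtain this kind of guarantee is to write to a temporary file, flush it, and only then rename it over the destination. The sketch below illustrates that general technique with the `fs` module of Node.js rather than with OS.File. It is an assumption suggested by the `tmpPath` option, not a description of how OS.File is actually implemented, and `writeViaTempFile` is a hypothetical name.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Write `data` to `tmpPath`, flush it, then rename it over `dest`.
// If the process dies before the rename, `dest` is left untouched.
async function writeViaTempFile(dest, data, tmpPath) {
  const handle = await fs.promises.open(tmpPath, "w");
  try {
    await handle.writeFile(data);
    await handle.sync(); // ask the OS to flush the data to the device
  } finally {
    await handle.close();
  }
  // On POSIX systems, rename replaces the destination in a single step.
  await fs.promises.rename(tmpPath, dest);
}

// Usage: write into a fresh scratch directory, then read the result back.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "atomic-sketch-"));
const dest = path.join(dir, "profile.ini copy 2");
const written = writeViaTempFile(dest, new TextEncoder().encode("hello"), dest + ".tmp")
  .then(function () {
    return fs.promises.readFile(dest, "utf8");
  });

written.then(function (text) {
  console.log("Final contents:", text);
});
```

A reader of `dest` sees either the old contents or the complete new contents, never a half-written file.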

Reading and writing text

Reading and writing text is slightly more complex, as text needs to be encoded/decoded. Nothing to worry about, though, now that the StringEncoding API has landed:

let encoder = new TextEncoder(); // Use default encoding (utf-8)
let decoder = new TextDecoder(); // Use default encoding (utf-8)

// Write to the file
let promise = OS.File.writeAtomic(dest, encoder.encode("My text"), {tmpPath: dest + ".tmp"});
promise = promise.then(function onSuccess() {
   return OS.File.read(dest);
});

let text;
promise = promise.then(function onSuccess(array) {
   text = decoder.decode(array);
   console.log("Here is the text I decoded: ", text);
});
// ...
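
The encoding step can be tried on its own, without any file involved. This small sketch assumes a runtime that exposes `TextEncoder` and `TextDecoder` as globals (current browsers and Node.js do); the sample string is arbitrary.

```javascript
const encoder = new TextEncoder(); // encodes to UTF-8
const decoder = new TextDecoder(); // decodes UTF-8 by default

// "th\u00e9" is the word "thé": "t" and "h" take one byte each in UTF-8,
// while "é" takes two.
const bytes = encoder.encode("th\u00e9");
console.log(bytes instanceof Uint8Array, bytes.length);

// Decoding the same bytes gives back the original string.
const roundTrip = decoder.decode(bytes);
console.log(roundTrip === "th\u00e9");
```

The byte length differs from the string length as soon as the text contains non-ASCII characters, which is why a file's contents are handled as a typed array rather than as a string.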

Handling errors

So far, we have been dealing only with successes. However, no I/O library would be complete if it did not let its users deal with runtime errors. Since everything is executed off the main thread, the traditional model of syntactically-scoped exceptions, as featured by JavaScript, cannot be used. Fortunately, promises are also a great mechanism for dealing with asynchronous errors.

let promise = OS.File.read(aFileThatDoesNotExist);
promise = promise.then(function onSuccess(contents) {
   console.log("I have successfully read the contents of the file");
   return true;
}, function onError(reason) {
   // reason is an instance of OS.File.Error
   if (reason.becauseNoSuchFile()) {
     console.log("Ah well, the file does not seem to exist");
     return false;
   } else {
     console.log("Some other error", reason);
     return false;
   }
});

Again, for more details on promise-based error-handling, the best source is the documentation of promises.
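
As a rough illustration of how errors travel along a chain, here is a sketch with the standard `Promise` object. `failingRead` is a hypothetical function that always fails; `OS.File.Error` and `becauseNoSuchFile` are not modeled here.

```javascript
// Hypothetical operation that always fails, standing in for the read of a
// file that does not exist.
function failingRead(path) {
  return Promise.reject(new Error("no such file: " + path));
}

const outcome = failingRead("missing.txt")
  .then(function onSuccess() {
    return "read succeeded";
  }, function onError(reason) {
    // Returning a value here recovers from the error: the chain continues
    // through the next success handler.
    return "recovered from: " + reason.message;
  })
  .then(function onSuccess(message) {
    // An exception thrown in a success handler rejects the promise
    // returned by this call to `then`...
    throw new Error("late failure after " + message);
  })
  .then(null, function onError(reason) {
    // ...so it is handled by an error handler further down the chain.
    return reason.message;
  });

outcome.then(function (message) {
  console.log(message);
});
```

An error handler passed to the same `then` as a success handler does not see exceptions thrown by that success handler; only a handler attached later in the chain does.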

Walking directories

An I/O library would hardly be complete if it did not provide facilities for iterating through a directory:


let iterator = new OS.File.DirectoryIterator(OS.Constants.Path.tmpDir);
let promise = iterator.forEach(function iter(entry, index) {
   console.log("I have encountered", entry.name, entry.isDir?"(directory)":"(not a directory)");
});
promise = promise.then(function onSuccess() {
   iterator.close();
}, function onError(reason) {
   iterator.close();
   throw reason; // Propagate error
});

Note that, should you need to, you can close the iterator at any time during the iteration, which will effectively stop the loop. Also, your function iter can return a promise, in which case the operation will be carried out before the loop continues.
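
To make the last point concrete, here is a sketch of a loop that waits for the promise returned by its callback before moving on. `forEachSequential` and `slowVisit` are hypothetical helpers working on a plain array; they do not reproduce `DirectoryIterator`, and closing the iterator mid-loop is not modeled.

```javascript
// Hypothetical helper: call `iter` on each entry in order, waiting for any
// promise it returns before moving on to the next entry.
function forEachSequential(entries, iter) {
  return entries.reduce(function (previous, entry, index) {
    return previous.then(function () {
      return iter(entry, index);
    });
  }, Promise.resolve());
}

const seen = [];

// Earlier entries take longer to process, so if the visits overlapped,
// the contents of `seen` would come out in a different order.
function slowVisit(entry, index) {
  return new Promise(function (resolve) {
    setTimeout(function () {
      seen.push(index + ":" + entry);
      resolve();
    }, 30 - 10 * index);
  });
}

const walked = forEachSequential(["a", "b", "c"], slowVisit);
walked.then(function () {
  console.log(seen.join(", "));
});
```

Because each visit starts only once the previous one has completed, the entries are recorded in their original order despite the decreasing delays.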

And more…

OS.File also offers functions for accessing the details of a file (OS.File.prototype.stat), reading to an already-allocated typed array, getting/setting the current position in the file, getting/setting the current directory, moving/renaming files, deleting files, creating/removing directories, etc. – with more features coming.

Each of these operations is executed off the main thread, to guarantee maximal reactivity. Indeed, only the logistics of synchronization between threads is executed on the main thread.

What now?

Well, now, the library is available as part of Firefox 18. You are invited to test it, try it, break it, file bug reports and requests for features!

If I find time, I will try to blog about the design of OS.File in another blog entry.

§ 18 Responses to Asynchronous file I/O for the Mozilla Platform

  • And says:

    Since a copy-operation might effectively be an upload/download if you are copying to or from a network drive, shouldn’t the copy command accept a progress-callback, with information such as elapsed time and bytes transferred as arguments, to allow the application to display progress and perhaps cancel the operation (e.g. by returning an abort-status-code)?

    Would it make sense to put asserts in the non-async file and network operations to check that the files aren’t being invoked on the main thread?

    • yoric says:

      Since a copy-operation might effectively be an upload/download if you are copying to or from a network drive, shouldn’t the copy command accept a progress-callback, with information such as elapsed time and bytes transferred as arguments, to allow the application to display progress and perhaps cancel the operation (e.g. by returning an abort-status-code)?

      This feature makes sense, but it should be implemented at a much higher level than OS.File, presumably in a client module.

      Would it make sense to put asserts in the non-async file and network operations to check that the files aren’t being invoked on the main thread?

      Definitely. We are currently working on detecting (and rooting out) main thread file operations, although we use less invasive techniques than asserts. We want Firefox to keep working, after all :)

  • Caspy7 says:

    Forgive my ignorance on the topic, but I’m wondering if this will affect Jetpack or its developers.
    Also, this is extremely cool to see. Curious, would sync file writing ever be retired? Or would that be impossible/undesirable? (What if Rust became the new language of choice?)

    • yoric says:

      Firstly, thanks :)

      I will let Jetpack developers reply to your first question, but normally, yes, the features of OS.File will be available to Jetpack add-on developers. Jetpack developers might wish to add a thin layer on top of OS.File, though.

      Now, sync file writing is not bad by itself, as long as no interactive operation is running on the same thread. OS.File itself is largely based on off-main-thread sync writing. Similarly, with Rust, it should be rather simple to have synchronous but non-blocking file I/O. However, we are going to do our best to completely get rid of main thread file I/O, because it hurts interactivity.

    • ochameau says:

      Speaking about Jetpack, we are missing an asynchronous API. Our existing `file` API is meant to be deprecated in favor of an async one. But then there are multiple choices that haven’t been discussed yet:
      * Just expose OS.File as-is,
      * Expose a privileged version of the FileReader/FileWriter objects that already exist in web pages,
      * Implement some popular Node.js file API by using OS.File,
      * …
      We are lacking resources to prioritize this work, but having OS.File landed will most likely help us provide an async API!

  • skierpage says:

    Please get the ugly `Follow “Il y a du thé renversé au bord de la table”
    Get every new post delivered…’ out of the bottom right corner of my browser tab! (SeaMonkey 12.3.1) If it has a close button or action I can’t see it. As I zoom, this irritating rectangle shows more of its text and blocks more of the page. Maybe you intend to only show the `(+) Follow’ tab on the top of it, but it isn’t working.

  • Andre says:

    There seems to be a bug, when iterating files. Entries do not seem to be of type DirectoryIterator.Entry when I do an “instanceOf”

    I looked at “osfile_async_front.jsm” and there Entry is really just defined as

    DirectoryIterator.Entry = function Entry(value) {
    return value;
    };

    which explains the bug.

    Any plans to fix this?

    p.s. apart from these minor bugs, I really like the new API :)

  • Andre says:

    Another question: What about situations where you actually need synchronous file access, f. ex. when you have an EventListener attached using addEventListener and want to decide about event propagation depending on information obtained using OS.File?

    Or in callbacks from C++/XPCOM code that need to return a value (in my specific case I have a JS XPCOM nsIProtocolHandler where I want to use OS.File in the newChannel method)?

    • yoric says:

      There was a lengthy discussion regarding whether we should provide a main thread synchronous API. The final reply was this was a bad idea, because we always want to get rid of these. So, I am afraid that OS.File cannot help you in purely synchronous cases.

      I am not convinced that nsIProtocolHandler::newChannel needs to be synchronous, though. Can’t you just decide to return a purely asynchronous channel?

  • Andre says:

    Concerning the Promise bug: This seems to be by design and is the same way Q handles it (and is in the CommonJS spec [1]). Apparently exceptions thrown in the success-handler lead to the promise being returned by “then” to be in the failed state, thus the then has to be chained like that:

    var deferred = Promise.defer();
    deferred.promise.then(function success(){
      throw new Error("fail");
    }, function fail(e){
      // "fail" error won't appear here
    }).then(null, function fail2(e){
      // it will end up here!!
      console.log(e);
    });

    deferred.resolve("moo");

    Very confusing. Q additionally has a “fail” instead of “then” which is the same but without the first parameter.

    [1] http://wiki.commonjs.org/wiki/Promises/A

    • yoric says:

      Yes, this is by design. Perhaps not intuitive, but among other benefits, it works nicely with Task.js.

      I hope that we will eventually get method fail, too, although it is not implemented yet.

You are currently reading Asynchronous file I/O for the Mozilla Platform at Il y a du thé renversé au bord de la table.
