Shutting down things asynchronously

February 14, 2014 § Leave a comment

This blog entry is part of the Making Firefox Feel As Fast As Its Benchmarks series. The fourth entry of the series was growing much too long for a single blog post, so I have decided to cut it into bite-size entries.

A long time ago, Firefox was completely synchronous. One operation started, then finished, and then we proceeded to the next operation. However, this model didn’t scale up to today’s needs in terms of performance and performance perception, so we set out to rewrite the code and make it asynchronous wherever it matters. These days, many things in Firefox are asynchronous. Many services get started concurrently during startup or afterwards. Most disk writes are entrusted to an IO thread that performs and finishes them in the background, without having to stop the rest of Firefox.

Needless to say, this raises all sorts of interesting issues. For instance: « how do I make sure that Firefox will not quit before it has finished writing my files? » In this blog entry, I will discuss this issue and, more generally, the AsyncShutdown mechanism, designed to implement shutdown dependencies for asynchronous services.

« Read the rest of this entry »

Copying streams asynchronously

October 18, 2013 § Leave a comment

In the Mozilla Platform, I/O is largely about streams. Copying streams is a rather common activity, e.g. for the purpose of downloading files, decompressing archives, saving decoded images, etc. As usual, doing any I/O on the main thread is a very bad idea, so the recommended manner of copying streams is to use one of the asynchronous string copy APIs provided by the platform: NS_AsyncCopy (in C++) and NetUtil.asyncCopy (in JavaScript). I have recently audited both to ascertain whether they accidentally cause main thread I/O and here are the results of my investigations.

In C++

What NS_AsyncCopy does

NS_AsyncCopy is a well-designed (if a little complex) API. It copies the full contents of an input stream into an output stream, then closes both. NS_AsyncCopy can be called with both synchronous and asynchronous streams. By default, all operations take place off the main thread, which is exactly what is needed.

In particular, even when used with the dreaded Safe File Output Stream, NS_AsyncCopy will perform every piece of I/O out of the main thread.

The default setting of reading data by chunks of 4kb might not be appropriate to all data, as it may cause too much I/O, in particular if you are reading a small file. There is no obvious way for clients to detect the right setting without causing file I/O, so it might be a good idea to eventually extend NS_AsyncCopy to autodetect the “right” chunk size for simple cases.

Bottom line: NS_AsyncCopy is not perfect but it is quite good and it does not cause main thread I/O.

Limitations

NS_AsyncCopy will, of course, not remove main thread I/O that takes place externally. If you open a stream from the main thread, this can cause main thread I/O. In particular, file streams should really be opened with flag DEFER_OPEN flag. Other streams, such as nsIJARInputStream do not support any form of deferred opening (bug 928329), and will cause main thread I/O when they are opened.

While NS_AsyncCopy does only off main thread I/O, using a Safe File Output Stream will cause a Flush. The Flush operation is very expensive for the whole system, even when executed off the main thread. For this reason, Safe File Output Stream is generally not the right choice of output stream (bug 928321).

Finally, if you only want to copy a file, prefer OS.File.copy (if you can call JS). This function is simpler, entirely off main thread, and supports OS-specific accelerations.

In JavaScript

What NetUtil.asyncCopy does

NetUtil.asyncCopy is a utility method that lets JS clients call NS_AsyncCopy. Theoretically, it should have the same behavior. However, some oddities make its performance lower.

As NS_AsyncCopy requires one of its streams to be buffered, NetUtil.asyncCopy calls nsIIOUtil::inputStreamIsBuffered and nsIIOUtil::outputStreamIsBuffered. These methods detect whether a stream is buffered by attempting to perform buffered I/O. Whenever they succeed, this causes main thread I/O (bug 928340).

Limitations

Generally speaking, NetUtil.asyncCopy has the same limitations as NS_AsyncCopy. In particular, in any case in which you can replace NetUtil.asyncCopy with OS.File.copy, you should pick the latter, which is both simpler and faster.

Also, NetUtil.asyncCopy cannot read directly from a Zip file (bug 927366).

Finally, NetUtil.asyncCopy does not fit the “modern” way of writing asynchronous code on the Mozilla Platform (bug 922298).

Helping out

We need to fix a few bugs to improve the performance of asynchronous copy. If you wish to help, please do not hesitate to pick any of the bugs listed above and get in touch with me.

Trapping uncaught asynchronous errors

October 14, 2013 § 2 Comments

While the official specifications of DOM Promise is still being worked on, Mozilla has been using Promise internally for several years already. This API is available to the platform front-end and to add-ons. In the past few weeks, Promise.jsm (our implementation of Promise) and Task.jsm (our implementation of Beautiful Concurrency in JavaScript, built on top of Promise) have been updated with a few new features that should make everybody’s life much easier.

Reporting errors

The #1 issue developers encounter with the use of Promise and Task is error-handling. In non-Promise code, if a piece of code throws an error, by default, that error will eventually be reported by window.onerror or any of the other error-handling mechanisms.

function fail() {
  let x;
  return x.toString();
}

fail(); // Displays somewhere: "TypeError: x is undefined"

By opposition, with Promise and/or Task, if a piece of code throws an error or rejects, by default, this error will be completely ignored:

Task.spawn(function*() {
  fail(); // Error is ignored
});

 

Task.spawn(function*() {
  yield fail(); // Error is ignored
});

 

somePromise.then(function onSuccess() {
  fail(); // Error is ignored
});

 

somePromise.then(function onSuccess() {
  return fail(); // Error is ignored
});

Debugging the error requires careful instrumentation of the code, which is error-prone, time-consuming, often not compositional and generally ugly to maintain:

Task.spawn(function*() {
  try {
    fail();
  } catch (ex) {
    Components.utils.reportError(ex);
    throw ex;
    // The error report is incomplete, re-throwing loses stack information
    // and can cause double-reporting
  }
});

The main reason we errors end up dropped silently is that it is difficult to find out whether an error is eventually caught by an error-handler – recall that, with Promise and Task, error handlers can be registered long after the error has been triggered.

Well, after long debates, we eventually found solutions to fix the issue :)

Simple case: Reporting programming errors

Our first heuristic is that programming errors are, well, programming errors, and that programmers are bound to be looking for them.

So,

Task.spawn(function*() {
  fail(); // Error is not invisible anymore
});

will now cause the following error message

*************************
A coding exception was thrown and uncaught in a Task.

Full message: TypeError: x is undefined
Full stack: fail@Scratchpad/2:23
@Scratchpad/2:27
TaskImpl_run@resource://gre/modules/Task.jsm:217
TaskImpl@resource://gre/modules/Task.jsm:182
Task_spawn@resource://gre/modules/Task.jsm:152
@Scratchpad/2:26
*************************

The message appears on stderr (if you have launched Firefox from the command-line) and in the system logs, so it won’t disrupt your daily routine, but if you are running tests or debugging your code, you should see it.

A similar error message will be printed out if the error is thrown from a raw Promise, without use of Task.

These error messages are limited to programming errors and appear only if the errors are thrown, not passed as rejections.

General case: Reporting uncaught errors

Now, we have just landed a more general support for displaying uncaught errors.

Uncaught thrown error

Task.spawn(function* () {
  throw new Error("BOOM!"); // This will be displayed
});

Uncaught rejection

Task.spawn(function* () {
  yield Promise.reject("BOOM!"); // This will also be displayed
});

Uncaught and clearly incorrect rejection

Task.spawn(function* () {
  Promise.reject("BOOM!");
  // Oops, forgot to yield.
  // Nevermind, this will be displayed, too
});

These will be displayed in the browser console as follows:

A promise chain failed to handle a rejection: on Mon Oct 14 2013 16:50:15 GMT+0200 (CEST), Error: BOOM! at
@Scratchpad/2:27
TaskImpl_run@resource://gre/modules/Task.jsm:217
TaskImpl@resource://gre/modules/Task.jsm:182
Task_spawn@resource://gre/modules/Task.jsm:152
@Scratchpad/2:26

These error messages appear for every uncaught error or rejection, once it is certain that the error/rejection cannot be caught anymore. If you are curious about the implementation, just know that it hooks into the garbage-collector to be informed once the error/promise cannot be caught anymore.

This should prove very helpful when debugging Promise- or Task-related errors. Have fun :)

Support for ES6 generators

You may have noticed that the above examples use function*() instead of function(). Be sure to thank Brandon Benvie who has recently updated Task.jsm to be compatible with ES6 generators :)

Project Async & Responsive, issue 4

July 4, 2013 § 4 Comments

It is told that, on the far side of the Sea of Ocean, in a mythical city sitting on a Great Lake, the Fellowship finally met. They saw face-to-face, reviewed each other’s quests, prepared for future adventures and ate perhaps a little too much.

Storage: Support for mozIStorageAsyncConnection (bug 702559) has finally landed, thus bringing type-safe off-main thread storage and adding off-main thread opening of storage. The never-ending work to convert synchronous database clients to asynchronous storage proceeds, including inline autocomplete (bug 791776), async annotations (bug 699844), async bookmarks backups (https://etherpad.mozilla.org/places-backups-changes).
Other: We have improve the startup speed for Session Restore (bug 887780), added a backup feature during upgrade for session data
 (bug 876168) and we are still progressing on cleaning up and making Session Restore asynchronous. Finally, a battle-tested version of the the module loader for workers is now nearly ready to land, along with a refactored version of the core of OS.File (bug 888479).

Project Async & Responsive, issue 1

April 26, 2013 § 3 Comments

In the previous episodes

Our intrepid heroes, a splinter cell from Snappy, have set out on a quest to offer alternatives to all JavaScript-accessible APIs that blocks the main thread of Firefox & Mozilla Apps.

Recently completed

Various cleanups on Session Restore (Bug 863227, 862442, 861409)

Summary Currently, we regularly (~every 15 seconds) save the state of every single window, tab, iframe, form being displayed, so as to be able to restore the session quickly in case of crash, power failure, etc. As this can be done only on the main thread, just the jank of data collection is often noticeable (i.e. > 50 ms). We are in the process of refactoring Session Restore to make it both faster and more responsive. These bugs are steps towards optimizations.

Status Landed. More cleanups in progress.

Telemetry for Number of Threads (Bug 724368)

Summary As we make Gecko and add-ons more and more concurrent, we need to measure whether this concurrency can cause accidental internal Denial of Service. The objective of this bug is to measure.

Status Landed. You can follow progression in histograms BACKGROUNDFILESAVER_THREAD_COUNT and SIMPLEMEASURES_MAXIMALNUMBEROFCONCURRENTTHREADS.

Reduce number of fsync() in Firefox Health Report (Bug 830492)

Summary Firefox Health Report stores its data using mozStorage. The objective of this bug is to reduce the number of expensive disk synchronizations performed by FHR.

Status Landed.

Ongoing bugs

Out Of Process Thumbnailing (Bug 841495)

Summary Currently, we capture thumbnails of all pages, e.g. for display in about:newtab, in Panorama or in add-ons. This blocks the main thread temporarily. This bug is about using another process to capture thumbnails of pages currently being visited. This can be useful for both security/privacy reasons (i.e. to ensure that bank account numbers do not show up in thumbnails) and responsiveness (i.e. to ensure that we never block browsing).

Status In progress.

Cutting Session Restore data collection into chunks (Bug 838577)

Summary Currently, we regularly (~every 15 seconds) save the state of every single window, tab, iframe, form being displayed, so as to be able to restore the session quickly in case of crash, power failure, etc. As this can be done only on the main thread, just the jank of data collection is often noticeable (i.e. > 50 ms). This bug is about cutting data collection in smaller chunks, to remove that jank.

Status Working prototype.

Off Main Thread database storage (Bug 702559)

Summary We are in the process of moving as many uses of mozStorage out of the main thread. While mozStorage already supports doing much I/O off the main thread, there is no clear way for developers to enforce this. This bug is about providing a subset of mozStorage that performs all I/O off the main thread and that will serve as target for both ongoing refactorings and future uses of mozStorage, in particular by add-ons.

Status Working prototype.

Improvements transaction management by JavaScript API for Off Main Thread database storage (Bug 856925)

Summary Sqlite.jsm is our JavaScript library for convenient Off Main Thread database storage. This bug is about improving how implicit transactions are handled by the library, hence improving performance.

Status In progress.

Refactor how Places data is backed up (Bugs 852040, 852041, 852034, 852032, 855638, 865643, 846636, 846635, 860625, 854927, 855190)

Summary Places is the database containing bookmarks, history, etc. Historically, Places was implemented purely on the main thread, which is something we very much want to remove, as any main thread I/O can block the user interface for arbitrarily lengthy durations. This set of bugs is part of the larger effort to get rid of Places main thread I/O. The objective here is to isolate and cleanup Places backup, to later allow removing it entirely from the main thread.

Status Working prototype.

APIs for updating/reading Places Off Main Thread (Bugs 834539, 834545)

Summary These bugs are part of the effort to provide a clean API, usable by platform and add-ons, to access/modify Places information off the main thread.

Status In progress.

Move Form History to use Off Main Thread storage (Bug 566746)

Summary This bug is part of the larger effort to get rid of Places main thread I/O. The objective here is to move Form History I/O off the main thread.

Status In progress.

Make about:home use IndexedDB instead of LocalStorage (Bug 789348)

Summary Currently, page about:home uses localStorage to store some data. This is not good, as localStorage does blocking main thread I/O. This bug is about porting about:home to use indexedDB instead.

Status In progress.

Download Architecture Improvements (Bug 825588 and sub-bugs)

Summary Our Architecture for Downloads has grown organically for about 15 years. Part of it is executed on the main thread and synchronously. The objective of this meta-bug is to re-implement Downloads with a modern architecture, asynchronous, off the main thread, and accessible from JavaScript.

Status In progress.

Constant stack space Promise (Bug 810490)

Summary Much of our main thread asynchronous work uses Promises. The current implementation of Promise is very recursive and eats considerable amounts of stack. The objective here is to replace it with a new implementation of Promise that works in (almost) constant stack space.

Status Working partial prototype.

Reduce amount of I/O in session restore (Bug 833286)

Summary The algorithm used by Session Restore to back up its state is needlessly expensive. The objective of this bug is to replace it by an alternative implementation that requires much less I/O.

Status Working prototype.

Planning stage

Move session recording and activity monitoring to Gecko (Bug 841561)

Summary Firefox Health Report implements a sophisticated mechanism for determining whether a user is actively using the browser. This mechanism could be reimplemented in a more efficient and straightforward manner by moving it to Gecko itself. This is the objective of this bug.

Status Looking for developer.

Non-blocking Thumbnailing (Bug 744100)

Summary Currently, we capture thumbnails of all pages, e.g. for display in about:newtab, in Panorama or in add-ons. This blocks the main thread temporarily, as capturing a page requires repainting it into memory from the main thread. This bug is about removing completely the “repaint into memory” step and rather collaborate with the renderer to obtain a copy of the latest image rendered.

Status Design in progress.

Evaluate dropping “www.” from rev_host column (Bug 843357)

Summary The objective of this bug is to simplify parts of the Places database and its users by removing some data that appears unnecessary.

Status Evaluation in progress.

Optimize worker communication for large messages (Bug 852187)

Summary We sometimes need to communicate very large messages between the main thread and workers. For large enough messages, the communication itself ends up being very expensive for the main thread. This bug is about optimizing such communications.

Status Design in progress.

Asynchronous file I/O for the Mozilla Platform

October 3, 2012 § 18 Comments

The Mozilla platform has recently been extended with a new JavaScript library for asynchronous, efficient, file I/O. With this library, developers of Firefox, Firefox OS and add-ons can easily write code that behave nicely with respect to the process and the operating system. Please use it, report bugs and contribute.

Off-main thread file I/O

Almost one year ago, Mozilla started Project Snappy. The objective of Project Snappy is to improve, wherever possible, the responsiveness of Firefox, the Mozilla Platform, and now, Firefox OS, based on performance data collected from volunteer users. Thanks to this real-world performance data, we have been able to identify a number of bottlenecks at all levels of Firefox. As it turns out, one of the main bottlenecks is main thread file I/O, i.e. reading from a file or writing to a file from the thread that also runs most of the code of Firefox and its add-ons.

« Read the rest of this entry »

Where Am I?

You are currently browsing entries tagged with async at Il y a du thé renversé au bord de la table.

Follow

Get every new post delivered to your Inbox.

Join 25 other followers