Trapping uncaught asynchronous errors

October 14, 2013 § 2 Comments

While the official specifications of DOM Promise is still being worked on, Mozilla has been using Promise internally for several years already. This API is available to the platform front-end and to add-ons. In the past few weeks, Promise.jsm (our implementation of Promise) and Task.jsm (our implementation of Beautiful Concurrency in JavaScript, built on top of Promise) have been updated with a few new features that should make everybody’s life much easier.

Reporting errors

The #1 issue developers encounter with the use of Promise and Task is error-handling. In non-Promise code, if a piece of code throws an error, by default, that error will eventually be reported by window.onerror or any of the other error-handling mechanisms.

function fail() {
  let x;
  return x.toString();
}

fail(); // Displays somewhere: "TypeError: x is undefined"

By opposition, with Promise and/or Task, if a piece of code throws an error or rejects, by default, this error will be completely ignored:

Task.spawn(function*() {
  fail(); // Error is ignored
});

 

Task.spawn(function*() {
  yield fail(); // Error is ignored
});

 

somePromise.then(function onSuccess() {
  fail(); // Error is ignored
});

 

somePromise.then(function onSuccess() {
  return fail(); // Error is ignored
});

Debugging the error requires careful instrumentation of the code, which is error-prone, time-consuming, often not compositional and generally ugly to maintain:

Task.spawn(function*() {
  try {
    fail();
  } catch (ex) {
    Components.utils.reportError(ex);
    throw ex;
    // The error report is incomplete, re-throwing loses stack information
    // and can cause double-reporting
  }
});

The main reason we errors end up dropped silently is that it is difficult to find out whether an error is eventually caught by an error-handler – recall that, with Promise and Task, error handlers can be registered long after the error has been triggered.

Well, after long debates, we eventually found solutions to fix the issue :)

Simple case: Reporting programming errors

Our first heuristic is that programming errors are, well, programming errors, and that programmers are bound to be looking for them.

So,

Task.spawn(function*() {
  fail(); // Error is not invisible anymore
});

will now cause the following error message

*************************
A coding exception was thrown and uncaught in a Task.

Full message: TypeError: x is undefined
Full stack: fail@Scratchpad/2:23
@Scratchpad/2:27
TaskImpl_run@resource://gre/modules/Task.jsm:217
TaskImpl@resource://gre/modules/Task.jsm:182
Task_spawn@resource://gre/modules/Task.jsm:152
@Scratchpad/2:26
*************************

The message appears on stderr (if you have launched Firefox from the command-line) and in the system logs, so it won’t disrupt your daily routine, but if you are running tests or debugging your code, you should see it.

A similar error message will be printed out if the error is thrown from a raw Promise, without use of Task.

These error messages are limited to programming errors and appear only if the errors are thrown, not passed as rejections.

General case: Reporting uncaught errors

Now, we have just landed a more general support for displaying uncaught errors.

Uncaught thrown error

Task.spawn(function* () {
  throw new Error("BOOM!"); // This will be displayed
});

Uncaught rejection

Task.spawn(function* () {
  yield Promise.reject("BOOM!"); // This will also be displayed
});

Uncaught and clearly incorrect rejection

Task.spawn(function* () {
  Promise.reject("BOOM!");
  // Oops, forgot to yield.
  // Nevermind, this will be displayed, too
});

These will be displayed in the browser console as follows:

A promise chain failed to handle a rejection: on Mon Oct 14 2013 16:50:15 GMT+0200 (CEST), Error: BOOM! at
@Scratchpad/2:27
TaskImpl_run@resource://gre/modules/Task.jsm:217
TaskImpl@resource://gre/modules/Task.jsm:182
Task_spawn@resource://gre/modules/Task.jsm:152
@Scratchpad/2:26

These error messages appear for every uncaught error or rejection, once it is certain that the error/rejection cannot be caught anymore. If you are curious about the implementation, just know that it hooks into the garbage-collector to be informed once the error/promise cannot be caught anymore.

This should prove very helpful when debugging Promise- or Task-related errors. Have fun :)

Support for ES6 generators

You may have noticed that the above examples use function*() instead of function(). Be sure to thank Brandon Benvie who has recently updated Task.jsm to be compatible with ES6 generators :)

Making Firefox Feel as Fast as its Benchmarks – Part 2 – Towards multi-process

October 9, 2013 § 2 Comments

As we saw in the first post of this series, our browser behaves as follows:

function browser() {
  while (true) {
    handleEvents();  // Let's make this faster
    updateDisplay();
  }
}

As we discussed, the key to making the browser smooth is to make handleEvents() do less. One way of doing this is to go multi-process.

Going multi-process

Chrome is multi-process. Internet Explorer 4 was multi-process and so is Internet Explorer 8+ do (don’t ask me where IE 5, 6, 7 went). Well, Firefox OS is multi-process, too and Firefox for Android used to be multi-process until we canceled this approach due to Android-specific issues. For the moment, Firefox Desktop is only slightly multi-process, although we are heading further in this direction with project electrolysis (e10s, for short).

In a classical browser (i.e. not FirefoxOS, not the Firefox Web Runtime), going multi-process means running the user interface and system-style code in one process (the “parent process”) and running code specific to the web or to a single tab in another process (the “child process”). Whether all tabs share a process or each tab is a separate process, or even each iframe is a process, is an additional design choice that, I believe, is still open to discussion in Firefox. In FirefoxOS and in the Firefox Web Runtime (part of Firefox Desktop and Firefox for Android), going multi-process generally means one process per (web) application.

Since code is separated between processes, each handleEvents() has less to do and will therefore, at least in theory, execute faster. Additionally, this is better for security, insofar as a compromised web-specific process affords an attacker less control than compromising the full process. Finally, this gives the possibility to crash a child process if necessary, without having to crash the whole browser.

Coding for multi-process

In the Firefox web browser, the multi-process architecture is called e10s and looks as follows:

function parent() {
  while (true) {
    handleEvents();  // Some of your code here
    updateDisplay(); // Just the ui
  }
}
function child() {
  while (true) {
    handleEvents();  // Some of your code here
    updateDisplay(); // Just some web
  }
}

parent() ||| child()

The parent process and the child process are not totally independent. Very often, they need to communicate. For instance, when the user browses in a tab, the parent needs to change the history menu displayed by the parent process. Similarly, every few seconds, Firefox saves its state to provide quick recovery in case of crash – the parent asks each tab for its information and, once all replies have arrived, gathers them into one data structure and saves them all.

For this purpose, parent and the child can send messages to each other through the Message Manager. A Message Manager can let a child process communicate with a single parent process and can let a parent process communicate with one or more children processes:

// Sender-side
messageManager.sendAsyncMessage("MessageTopic", {data})

// Receiver-side
messageManager.addMesageListener("MessageTopic", this);
// ...
receiveMessage: function(message) {
  switch (message.name) {
  case "MessageTopic":
    // do something with message.data
    // ...
    break;
  }
}

Additionally, code executed in the parent process can inject code in the child process using the Message Manager, as follows:

messageManager.loadFrameScript("resource://path/to/script.js", true);

Once injected, the code behaves as any (privileged) code in the child process.

As you may see, communications are purely asynchronous – we do not wish the Message Manager to stop a process and wait until another process is done with it tasks, as this would totally defeat the purpose of multi-processing. There is an exception, called the Cross Process Object Wrapper, which I am not going to cover, as this mechanism is meant to be used only during a transition phase.

Limitations

It is tempting to see multi-process architecture as a silver bullet that semi-magically makes Firefox (or its competitors) fast and smooth. There are, however, quite clear limitations to the model.

Firstly, going multi-process has a cost. As demonstrated by Chrome, each process consumes lots of memory. Each process needs to load its libraries, its JavaScript VM, each script must be JIT-ed for each process, each process needs its communication channgel towards the GPU etc. Optimizing this is possible, as demonstrated by FirefoxOS (which runs nicely with 256 Mb) but is a challenge.

Similarly, starting a multi-process browser can be much slower than starting a single-process browser. Between the instant the user launches the browser and the instant it becomes actually usable, many things need to happen: launching the parent process, which in turn launches the children processes, setting up the communication channels, JIT compiling all the scripts that need compilation, etc. The same cost appears when shutting down the processes.

Also, using several processes brings about a risk of contention on resources. Two processes may need to access the disk cache at the same time, or the cookies, or the session storage, or the GPU or the audio device. All of this needs to be managed carefully and can, in certain cases, slow down considerably both processes.

Also, some APIs are synchronous by specifications. If, for some reason, a child process needs to access the DOM of another child process – as may happen in the case of iframes – both child processes need to become synchronous. During the operation, they both behave as a single process, with just extremely slower DOM operations.

And finally, going multi-process will of course not make a tab magically responsive if this tab itself is the source of the slowdown – in other words, multi-process it not very useful for games.

Refactoring for multi-process

Many APIs, both in Firefox itself and in add-ons, are not e10s-compliant yet. The task of refactoring Firefox APIs into something e10s-compliant is in progress and can be followed here. Let’s see what needs to be done to refactor an API for multi-process.

Firstly, this does not apply to all APIs. APIs that access web content for non-web content need to be converted to e10s-style – an example is Page Info, which needs to access web content (the list of links and images from that page) for the purpose of non-web content (the Page Info button and dialog). As multi-process communications is asynchronous, this means that such APIs must be asynchronous already or must be made asynchronous if they are not, and that code that calls these APIs needs to be made asynchronous if it is not asynchronous already, which in itself is already a major task. We will cover making things asynchronous in another entry of this series.

Once we have decided to make an asynchronous API e10s-compliant, the following step is to determine which part of the implementation needs to reside in a child process and which part in the parent process. Typically, anything that touches the web content must reside in the child process. As a rule of thumb, we generally consider that the parent process is more performance-critical than children-processes, so if you have code that could reside either in the child process or in the parent process, and if placing that code in the child process will not cause duplication of work or very expensive communications, it is a good idea to move it to the child process. This is, of course, a rule of thumb, and nothing replaces testing and benchmarking.

The next step is to define a communication protocol. Messages exchanged between the parent process and children processes all need a name. If are working on feature Foo, by conventions, the name of your messages should start with “Foo:”. Recall that message sending is asynchronous, so if you need a message to receive an answer, you will need two messages: one for sending the request (“Foo:GetState”) and one for replying once the operation is complete (“Foo:State”). Messages can carry arbitrary data in the form of a JavaScript structure (i.e. any object that can be converted to/from JSON without loss). If necessary, these structures may be used to attach unique identifiers to messages, so as to easily match a reply to its request – this feature is not built into the message manager but may easily be implemented on top of it. Also, do not forget to take into account communication timeouts – recall that a process may fail to reply because it has crashed or been killed for any reason.

The last step is to actually write the code. Code executed by the parent process typically goes into some .js file loaded from XUL (e.g. browser.js) or a .jsm module, as usual. Code executed by a child process goes into its own file, typically a .js, and must be injected into the child process during initialization by using window.messageManager.loadFrameScript (to inject in all children process) or browser.messageManager.loadFrameScript (to inject in a specific child process).

That’s it! In a future blog entry, I will write more about common patterns for writing or refactoring asynchronous code, which comes in very handy for code that uses your new API.

Contributing to e10s

The ongoing e10s refactoring of Firefox is a considerable task. To make it happen, the best way is to contribute to coding or testing.

What’s next?

In the next blog entry, I will demonstrate how to make front-end and add-on code multi-threaded.

Making Firefox Feel as Fast as its Benchmarks – Part 1 – Towards Async

October 8, 2013 § 17 Comments

Most of us have seen the numbers: according to benchmarks, Firefox is faster than Chrome. Sadly, that is sometimes hard to believe, as our eyes often indicate the opposite. The good news is that Firefox really is very, very fast. However, this is misleading: what our eyes notice is not speed (an indication of how much time an operation takes to complete) but smoothness (an indication of how often we have refreshed the screen in the meantime). And on that domain, on many pages, we are still lagging behind Chrome.

In this series of blog entries about the subject,  I plan to introduce the issues and discuss the solutions that are envisioned.

How browsers work

Let’s start by a short refresher about how browsers work. All current browser engines (except Servo) share the same overall behavior:

function browser() {
  while (true) {
    handleEvents(); // <-- Your code goes here
    updateDisplay();
  }
}

Function handleEvents() handles all the system-specific event handling (“oh, it looks like the mouse has moved”, “some time has passed”, “the http request is complete”) and dispatches this to all higher-level code designed to provide the user experience (“call onmousemove“, “add something to the history menu”, “capture thumbnail”, “call XMLHttpRequest callback”, …). Whether you are writing back-end code, front-end code, an add-on or even web-side JavaScript, unless you are part of the Layout team, chances are that your C++ or JavaScript code is executed somewhere in handleEvents() and your CSS is executed in updateDisplay().

To ensure perfect smoothness, updateDisplay() must be called at least 30 times per second, preferably 60. This means that the whole loop should be executed in roughly 15ms, which is pretty short. This is a problem that hits all browsers and we have a number of techniques to fit our code in this limit. As I am not a member of the Layout team, I will concentrate on the code that runs in handleEvents().

So, we need to make handleEvents() faster. Can we optimize it? Certainly we can. However, recall that benchmarks suggest that Firefox is already the fastest browser around, and this is not sufficient. Perhaps we can improve that speed by a few percents, but that is not the best course of action.

We need to go the other way: handleEvents() must do less things.

In the following parts of this series, I will explore several options: going multi-process, multi-thread, or interruptible. If I have time, I will also discuss Servo, Mozilla’s next-generation rendering engine.

Async & Responsive: What’s next?

October 1, 2013 § 2 Comments

For the past few months/years adding APIs to the Mozilla Platform to simplify everybody’s task of writing asynchronous or, even better, off-main thread code.

From the top of my head, we have added and gradually improved:

  • support for async programming with Promise, including Promise.jsm, Task.jsm, better error reporting …
  • support for off main thread I/O with OS.File, Sqlite.jsm, mozIStorageAsyncConnection, nsIBackgroundFileSaver, async transactions for places, …
  • support for making things safer with AsyncShutdown.jsm …
  • support for testing async code, with  add_task for xpcshell and mochitest-browser …
  • support for clear off main thread code with the chrome worker module loader, extensions to js-ctypes …

Do you have wishes for Q4 or beyond? [De]compressing files on chrome workers? Accessing sqlite from chrome workers? More tooling for Promise? New preference APIs?

Please drop us a line, either here or in the relevant Bugzilla component.

Additionally, I will attend the Mozilla Summit in Brussels where I will host an Open Session on Async development in the Mozilla platform. Don’t hesitate to come and have a chat.

Async and Responsive, issue 5

August 6, 2013 § 7 Comments

Summer, mosquitoes and vacations could not stop our heroes. Some of them braved the perils of working on the beach to improve the responsiveness of Firefox.

Session Restore

The plans for Session Restore have been overhauled to make way for code that is more e10s-friendly. Our plans to provide a main thread async API are therefore postponed indefinitely, in favor of a content script implementation of state collection. While less asynchronous than earlier prototypes, this implementation should also provide considerable responsiveness benefits, even before we turn on the e10s switch in Firefox. We have also overhauled part of the implementation of Session Restore to ensure that it caches state whenever appropriate, hence removing most of the jank due to state collection. Finally, we have started working on compressing communications between the main thread and the worker to minimize serialization costs. Additional clean-up has been completed, with more under way or is waiting for darker skies before landing.

Places

The titanic work to refactor Places into a non-blocking, off main thread, efficient API, continues. Ongoing works include making backups non-blocking and reducing the I/O cost of backups. We are also preparing a new asynchronous transaction manager for Places. Once this transaction manager has reached a satisfying state, it might be possible to reuse the code (or at least the design) for projects unrelated to Places.

Downloads

The JavaScript downloads API is basically ready. This API is fully asynchronous, JS-friendly and does just about everything off the main thread.

Misc

We are getting close to having a fully off main thread implementation of mozStorage. Partial rewrite/cleanup/modularization of OS.File continues, as well as extending and converting code to Promise.jsm. Also, additional work on porting synchronous code to OS.File.

Side-note

Our work with Async + Responsive has uncovered a number of bugs in the Firefox implementation of Workers and Chrome Workers. This, in itself, is not very surprising, as we are pushing these technologies into a number of places that hadn’t been explored yet. However, the owners of DOM:Workers have been extraordinarily reactive and we would like to thank them. Kudos, guys.

Asynchronous database connections in the Mozilla Platform

July 19, 2013 § 2 Comments

One of the core components of the Mozilla Platform is mozStorage, our low-level database, based on sqlite3. mozStorage is used just about everywhere in our codebase, to power indexedDB, localStorage, but also site permissions, cookies, XUL templates, the download manager (*), forms, bookmarks, the add-ons manager (*), Firefox Health Report, the search service (*), etc. – not to mention numerous add-ons.

(*) Some components are currently moving away from mozStorage for performance and footprint reasons as they do not need the safety guarantees provided by mozStorage.

A long time ago, mozStorage and its users were completely synchronous and main-thread based. Needless to say, this eventually proved to be a design that doesn’t scale very well. So, we set out on a quest to make mozStorage off main thread-friendly and to move all these uses off the main thread.

These days, whether you are developing add-ons or contributing to the Mozilla codebase, everything you need to access storage off the main thread are readily available to you. Let me introduce the two recommended flavors.

Note: This blog entry does not cover using database from *web applications* but from the *Mozilla Platform*. From web applications, you should use indexedDB.

« Read the rest of this entry »

Chrome Workers, now with modules!

July 17, 2013 § 3 Comments

One of the main objectives of Project Async is to encourage Firefox developers and Firefox add-on developers to use Chrome Workers to ensure that whatever they do doesn’t block Firefox’ UI thread. The main obstacle, for the moment, is that Chrome Workers have access to very few features, so one the tasks of Project Async is to add features to Chrome Workers.

Today, let me introduce the Module Loader for Chrome Workers.

« Read the rest of this entry »

Project Async & Responsive, issue 4

July 4, 2013 § 4 Comments

It is told that, on the far side of the Sea of Ocean, in a mythical city sitting on a Great Lake, the Fellowship finally met. They saw face-to-face, reviewed each other’s quests, prepared for future adventures and ate perhaps a little too much.

Storage: Support for mozIStorageAsyncConnection (bug 702559) has finally landed, thus bringing type-safe off-main thread storage and adding off-main thread opening of storage. The never-ending work to convert synchronous database clients to asynchronous storage proceeds, including inline autocomplete (bug 791776), async annotations (bug 699844), async bookmarks backups (https://etherpad.mozilla.org/places-backups-changes).
Other: We have improve the startup speed for Session Restore (bug 887780), added a backup feature during upgrade for session data
 (bug 876168) and we are still progressing on cleaning up and making Session Restore asynchronous. Finally, a battle-tested version of the the module loader for workers is now nearly ready to land, along with a refactored version of the core of OS.File (bug 888479).

Alors comme ça, votre projet a besoin de contributeurs ?

June 21, 2013 § Leave a comment

Comment décevoir un contributeur

Il était une fois un projet de logiciel libre (ou, d’ailleurs, un projet associatif). Un jour, un anonyme se présenta et annonça qu’il souhaitait aider. C’était une bonne nouvelle, car le projet avait bien besoin de contributeurs supplémentaires. Malheureusement, au bout de quelques jours, l’anonyme disparût, car il n’arrivait pas à aider.

C’est une histoire assez triste. Elle vous est peut-être familière, soit dans la peau du contributeur existant, soit dans la peau de l’anonyme qui voulait contribuer. Cette histoire est malheureusement fréquente dans les projets qui cherchent à s’étendre. Voyons ce que nous pourrions faire pour changer la fin du conte.

Il était une fois un projet de logiciel libre (ou, d’ailleurs, un projet associatif). Un jour, un anonyme se présenta et annonça qu’il souhaitait aider. C’était une bonne nouvelle, car le projet avait bien besoin de contributeurs supplémentaires. Les contributeurs existants avaient justement préparé des documents pour guider des nouveaux venus et étaient prêts à répondre aux questions de l’anonyme. L’anonyme suivit le guide de contribution. Après avoir suivi ce guide, l’anonyme chercha à quoi il pouvait bien contribuer. Malheureusement, au bout de quelques jours ou quelques semaines, l’anonyme n’avait pas trouvé en quoi il pouvait aider et il disparût.

Malgré tous les efforts des contributeurs existants, l’histoire est toujours aussi triste. Alors que faire pour arriver à une fin heureuse ?

Il était une fois un projet de logiciel libre (ou, d’ailleurs, un projet associatif). Comme dans tous les projets de ce genre, il y avait énormément de choses à faire et pas assez de contributeurs pour tout accomplir. Les contributeurs avaient pris pour habitude de noter sur une liste de tâche facilement accessible tout ce qu’ils n’avaient pas encore eu le temps de mener à bien. Certaines de ces tâches étaient accessibles à des nouveaux venus. Pensant aux futurs contributeurs, les contributeurs existants s’assuraient donc que ces tâches accessibles étaient faciles à trouver et que n’importe quel nouveau venu pouvait facilement contacter la personne qui avait ajouté la tâche dans la liste, pour lui demander des conseils. De plus, les contributeurs existants avaient préparé des documents pour guider des nouveaux venus et étaient prêts à répondre aux questions de l’anonyme. L’anonyme suivit le guide de contribution, qui le mena à des tâches accessibles. Il trouva une tâche qui l’intéressait et un mentor pour l’aider à démarrer. Ils vécurent heureux et eurent beaucoup de contributions.

Le système des bugs mentorés

Chez Mozilla, nous utilisons depuis quelques années le système que je viens de décrire, avec un succès impressionnant. Tous les quelques jours, sur les projets que je suis, de nouveaux contributeurs se présentent, suivent les tutoriels, choisissent une tâche, se mettent immédiatement au travail – et finissent la plupart du temps par publier leurs contributions, et un peu plus tard par devenir eux-mêmes mentors sur d’autres tâches.

Marquer une tâche comme mentorée prend environ deux secondes.

  1. je viens d’ouvrir un bug sur Bugzilla et je réalise qu’un débutant pourrait certainement le traiter avec un peu d’aide ;
  2. j’ajoute dans le champ libre (« whiteboard ») du bug l’information [mentor=Yoric] – à partir de ce moment-là, les nouveaux venus peuvent trouver ce bug dans notre moteur de recherche de bugs mentorés ;
  3. j’en profite pour ajouter dans ce même champ libre l’information [lang=js][lang=c++] – à partir de ce moment-là, les nouveaux venus cherchant des bugs dans l’une des deux technologies “JavaScript” ou “C++” verront s’afficher ce bug ;
  4. c’est fini – un de ces jours, un contributeur me contactera peut-être pour demander s’il peut travailler sur ce bug.

Bien entendu, l’exemple utilise Bugzilla et des contributions techniques mais il est assez simple d’étendre le système à d’autres gestionnaires de tâches et à des tâches purement non techniques.

Pour un nouveau venu, commencer est aussi très simple :

  1. lire notre document d’introduction et suivre le lien vers le moteur de recherche de bugs mentorés ;
  2. choisir des centres d’intérêt et un bug ;
  3. contacter le mentor par mail ou par irc.

Certaines étapes peuvent encore être fluidifiées (le nom et l’adresse du mentor ne sont pas toujours évidents à trouver à l’écran, etc.) mais c’est en cours. Nous espérons que le système pourra, à terme, être généralisé à tous les projets de Mozilla, techniques ou non.

Du coup, si vous participez à un projet (Mozilla ou autre) qui n’emploie pas un tel système de bugs mentorés et qui cherche des contributeurs, je vous invite vivement à essayer.

Project Async & Responsive, issue 3

June 5, 2013 § 2 Comments

Our heroes continued on their quest to move add-on- and front-end-accessible APIs off the main thread. Despite reviewer starvation, they managed to progress.

Headlines

  • Constant stack implementation of Promise (bug 810490) have finally landed! This should resolve the stack explosion witnessed by a few of the libraries using promises.

Recent progress

  • Async mozStorage connection (bug 702559). Working prototype. Review in progress.
  • Converting consumers of non-constant stack Promise. In progress.
  • Improving speed of postMessage (bugs 879083, 852187, 873293). Design in progress. Early prototype.
  • Readahead for OS.File (bug 865389). Prototype. Review in progress.
  • Handling Tasks in mochitest (bug 872229). Working prototype. Pending review.
  • Removing main thread I/O from the URL classifier (bug 867776). In progress.

Currently blocked

  • JavaScript module loader for workers (bug 872421). Working prototype [waiting for review for 2 weeks].
  • Making Session Restore non-blocking (bug 838577). Working prototype [waiting for review for many weeks].

Where Am I?

You are currently browsing the Firefox category at Il y a du thé renversé au bord de la table.

Follow

Get every new post delivered to your Inbox.