Making Firefox Feel as Fast as its Benchmarks – part 3 – Going multi-threaded

October 29, 2013 § 11 Comments

As we saw in the previous posts, our browser behaves as follows:

function browser() {
  while (true) {
    handleEvents();  // Let's make this faster
    updateDisplay();
  }
}

The key to making the browser smooth is to make handleEvents() do less. We have already discussed the ongoing work to make Firefox multi-process, its goals and its limitations. Another, mostly orthogonal, path is to go multi-threaded.

Going multi-threaded

Going multi-threaded is all about splitting the event loop into several loops, executed concurrently, on several cores whenever applicable and possible:

function browser() {
  main() ||| worker() ||| worker() // Running concurrently
}

task main() { // Main thread (time-critical)
  while (true) {
    handleEvents(); // Some of your code here
    updateDisplay();
  }
}

task worker() {
  while (true) {
    handleEvents(); // Some of your code here
  }
}

task worker() {
  while (true) {
    updateDisplay();
  }
}

The main thread remains time-critical and needs to loop 60 times per second, while other threads handle some of the workload of both handleEvents() and updateDisplay(). Now, this treatment is only useful if we can isolate operations that slow down the main loop measurably. As it turns out, there are many such operations lying around, including:

  • Network I/O;
  • Disk I/O;
  • Database I/O;
  • GPU I/O;
  • Treating large amounts of data.

It is easy to see why Network I/O could considerably slow down the main loop, if it were handled on the main thread – after all, some requests take several seconds to receive a reply, or never do, and if the main thread had to wait for the completion of these requests before it proceeded, this would cause multi-second gaps between two frames, which is simply not acceptable.

The cost of disk I/O, however, is often underestimated. Few developers realize that _any_ disk operation can take an unbounded amount of time – even closing a file or checking whether a file exists can, in some cases, take several seconds. This may seem counter-intuitive, as these operations do very little besides book-keeping, but one must not forget that they rely upon the device itself and that said device can unpredictably become very slow, typically because it is otherwise busy, or asleep – or even because that device is actually a network device.

Database I/O is a special case of disk I/O that we generally single out because its cost is often much higher than users suspect – recall that, in addition to saving data, a database management system will typically need to maintain a journal and to flush the drive regularly, to protect data against both software and hardware failures, including sudden power loss. Consequently, unless the database has been heavily customized to lift the safety requirements in favor of performance, you should expect that every operation on your database will cause heavy disk I/O.

Finally, treating large amounts of data, or applying any other form of heavy algorithm, will of course take time.

None of these operations should take place on the main thread. Moving them off the main thread will largely contribute to getting rid of the jank caused by these operations.

Coding for multi-threading

In the Firefox web browser, threads are materialized as instances of nsIThread in C++ code and as instances of ChromeWorker in JavaScript code. For this discussion, I will concentrate on JavaScript code as refactoring C++ code is, well, complicated. Side-note: if you are new here, recall that Chrome Workers have nothing to do with the Chrome Web Browser and everything to do with the Mozilla Chrome, i.e. the parts of Gecko and Firefox written in JavaScript.

Chrome Workers are an extension of Web Workers, and have the same semantics, plus a few additions. Instantiating a ChromeWorker requires a source file:

let worker = new ChromeWorker("resource://path/to/my_file.js");

We may send messages to and from a Chrome Worker:

// In the parent
worker.postMessage(someValue);

// In the worker
self.postMessage(someValue);

and, of course, receive messages:

// In the parent
worker.addEventListener("message", function(msg) {
  // A copy of the message appears in msg.data
});

// In the worker
self.addEventListener("message", function(msg) {
  // A copy of the message appears in msg.data
});

In either case, the contents of the message are copied between threads, with essentially the same semantics as JSON.stringify/JSON.parse. If necessary, binary data in messages (ArrayBuffer or the upcoming Typed Objects) can be transferred instead of being copied, which is faster.
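To make the copy semantics concrete, here is a minimal sketch that runs outside Firefox. It assumes that a JSON round-trip is a fair stand-in for what happens to a message between threads; copyAsMessage and the sample object are made up for the illustration and are not part of any Firefox API.

```javascript
// Simulate the copy that a message undergoes when it crosses threads.
// Assumption: a JSON round-trip approximates the real copy.
function copyAsMessage(value) {
  return JSON.parse(JSON.stringify(value));
}

// Hypothetical message: plain data plus one method
const sentMessage = {
  id: 1,
  tags: ["a", "b"],
  describe: function() { return "I am a method"; }
};
const receivedMessage = copyAsMessage(sentMessage);

// Mutating the sender's object after sending does not affect the copy
sentMessage.tags.push("c");

console.log(receivedMessage.tags.length);     // 2
console.log(typeof receivedMessage.describe); // "undefined": methods are not copied
```

Transferring rather than copying is requested by passing a transfer list as the second argument of postMessage, as in worker.postMessage(buffer, [buffer]). After such a call, the sender can no longer use the buffer.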

Like Web Workers, Chrome Workers are very good at performing computations. In addition, they have a number of low-level libraries to access system features. Such libraries can be loaded with the chrome worker module loader:

let MyModule = require("resource://...");

Further modules can be defined for consumption with the chrome worker module loader:

module.exports = {
  foo: // ...
};

Finally, they can call into C code using the js-ctypes foreign function interface:

let lib = ctypes.open("path/to/my_lib");
let fun = lib.declare("myFunction", ctypes.default_abi, ctypes.void_t); // void myFunction()
fun(); // Call into C

The module loader and js-ctypes make for a powerful combination that has been used to provide access to low-level libraries, including low-level file manipulation (module OS.File), phone communication (module RIL, shorthand for Radio Interface Layer), file (de)compression, etc.

Limitations

Where multi-process is good at protecting a process against other processes, going multi-threaded works nicely to protect a process (a tab, the UI, etc.) against itself. Threads take up far fewer resources than processes and are also much faster to start and stop. However, they have very strict limitations.

The main limitation is that they do not have access to all the main thread APIs. Each API needs to be ported individually to Chrome Workers. Until recently, there was no way to define or load modules. At the moment, there is no way to read or write a compressed file from a Chrome Worker, or to access a database from a Chrome Worker. In most cases, this is only a question of time and manpower, and we can hope to eventually bring almost all important APIs to Chrome Workers. However, some APIs cannot be ported at all, in particular any API that requires a DOM window, which is most (fortunately not all) DOM APIs.

Also, the paradigm behind Chrome Workers is purely asynchronous. This means that there is no way for a Chrome Worker to wait synchronously until some treatment has been completed by the main thread. This complicates code in a few cases but, in general, this is rarely a problem.

Finally, the communication mechanism needs to be taken into account, as copying long messages can block the main thread. In some cases, it may be necessary to perform aggressive optimization of messages to avoid such situations.

Refactoring for multi-threading

The first thing to take into consideration while refactoring for multi-threading is whether this is the best strategy. Since most APIs and most customization possibilities live on the main thread, most features need to be produced and/or consumed by the main thread. This does not mean that going multi-threaded is not possible, only that your code will probably end up looking like an asynchronous API meant to be used mostly on the main thread but implemented off the main thread. This also means that your consumers must be designed to accept an asynchronous API. We will cover making things asynchronous in another entry of this series.

Once we have decided to go multi-threaded, the next part is to determine what goes off the main thread. Generally, you want to move as much as you can off the main thread. The only limits are things that you simply cannot move off the main thread (e.g. access to the document), or cases in which the data you need to copy (not transfer) across threads would slow down the main thread unacceptably. This, of course, is something that can be determined only by benchmarking.
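As a starting point for such a benchmark, here is a rough sketch. It assumes, as above, that a JSON round-trip approximates the cost of copying a message across threads, which will not match the real cost exactly; measureCopyMs and the payload are hypothetical. The only number that matters is how the result compares to the time available for one frame.

```javascript
// Average time, in milliseconds, needed to copy `message` once.
// Assumption: a JSON round-trip stands in for the cross-thread copy.
function measureCopyMs(message, rounds) {
  const start = Date.now();
  for (let i = 0; i < rounds; ++i) {
    JSON.parse(JSON.stringify(message));
  }
  return (Date.now() - start) / rounds;
}

// Hypothetical payload: 100,000 small records
const bigMessage = [];
for (let i = 0; i < 100000; ++i) {
  bigMessage.push({ index: i, label: "item " + i });
}

const averageCopyMs = measureCopyMs(bigMessage, 5);

// At 60 frames per second, the main thread has about 16.7 ms per frame
const frameBudgetMs = 1000 / 60;

console.log("average copy: " + averageCopyMs + " ms, " +
            "frame budget: " + frameBudgetMs.toFixed(1) + " ms");
```

If the average copy time is anywhere near the frame budget, the message is too large to be copied on the main thread without causing jank, and it needs to be shrunk, split or transferred.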

Next, you will need to define a communication protocol between the main thread and the worker. Threads communicate by sending pure data (i.e. objects without methods, without DOM nodes, etc.) and binary data can be transferred for high performance. Recall that communications are asynchronous, so if you want a thread to respond to another one, you will need to build into your protocol an identifier to match each reply to its request. This is not built-in, but quite easy to do. Handling errors requires a little finesse, as uncaught exceptions on the worker are transmitted to an onerror listener instead of the usual onmessage listener, and lose some information along the way.
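Here is one possible shape for such a protocol, as a sketch rather than the way Firefox does it. Every name in it (createRequester, createFakeWorker, the id/ok/value fields) is hypothetical. The fake worker replies synchronously so that the example runs outside Firefox; a real ChromeWorker would reply later, through the same "message" listener. Errors are caught on the worker side and sent back as ordinary data, which is one way to keep them attached to their request instead of losing them to onerror.

```javascript
// Wrap a worker-like object so that each request carries an id and each
// reply is routed to the callback that is waiting for it.
function createRequester(worker) {
  let nextId = 0;
  const pending = new Map(); // id -> callback awaiting a reply

  worker.addEventListener("message", function(msg) {
    const reply = msg.data;
    const callback = pending.get(reply.id);
    if (!callback) {
      return; // Unknown or already answered id: ignore
    }
    pending.delete(reply.id);
    if (reply.ok) {
      callback(null, reply.value);
    } else {
      callback(new Error(reply.value));
    }
  });

  return function request(payload, callback) {
    const id = nextId++;
    pending.set(id, callback);
    worker.postMessage({ id: id, payload: payload });
  };
}

// Stand-in for a ChromeWorker: doubles numbers, rejects anything else.
function createFakeWorker() {
  const listeners = [];
  return {
    addEventListener: function(type, listener) {
      if (type === "message") {
        listeners.push(listener);
      }
    },
    postMessage: function(data) {
      let reply;
      try {
        if (typeof data.payload !== "number") {
          throw new TypeError("expected a number");
        }
        reply = { id: data.id, ok: true, value: data.payload * 2 };
      } catch (ex) {
        // Send the error as pure data so that it keeps its request id
        reply = { id: data.id, ok: false, value: String(ex.message) };
      }
      listeners.forEach(function(listener) { listener({ data: reply }); });
    }
  };
}

const sendRequest = createRequester(createFakeWorker());
const outcomes = [];
function record(error, value) {
  outcomes.push(error ? "error: " + error.message : value);
}
sendRequest(21, record);     // outcomes[0] is 42
sendRequest("oops", record); // outcomes[1] is "error: expected a number"
```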

In some (hopefully rare) cases, you will need to add new bindings to native code, so as to call C functions (only C, not C++) from JavaScript. For this purpose, take a look at the documentation of js-ctypes, our JavaScript FFI, and osfile_shared_allthreads.jsm, a set of lightweight extensions to js-ctypes that handle a number of platform-specific gotchas. As finding the correct libraries to link is sometimes tricky, you should take advantage of OS.Constants.Path, which already lists some of them. Don’t hesitate to file bugs if you realize that something important is missing. Also, in a few (hopefully almost non-existent) cases, you will need to add C wrappers to native code, typically to expose some C++-only features. For this purpose, take a look at an example.

Unsurprisingly, the next step is to write the JS code. The usual caveats apply, just don’t forget to use the module system. Worker code goes into its own file, typically with extension “.js”. It is generally a good idea to mention “worker” in the name of the file, e.g. “foo_worker.js”, and to deploy your code to "resource://.../worker/..." or "chrome://.../worker/..." to avoid ambiguities. To construct the worker, it is then sufficient to call new ChromeWorker("resource://path/to/your/file.js"). The worker code will be started lazily when the first message is sent.

For automated testing, you can for instance use mochitest-chrome or (once bug 930924 has landed) xpcshell-tests. In the latter, if you need to add new worker code for the sake of testing, you should install it with the chrome:// protocol. Also, for any testing, don’t forget to look at your system console, as worker errors are displayed on that console by default.

That’s it! In a future blog entry, I will write more about common patterns for writing or refactoring asynchronous code, which comes in very handy for code that uses your new API.

Contributing

Refactoring Firefox as a set of asynchronous APIs backed by off main thread implementations is a considerable task. To make it happen, the best way is to contribute to coding, testing or documentation.

Asynchronous database connections in the Mozilla Platform

July 19, 2013 § 2 Comments

One of the core components of the Mozilla Platform is mozStorage, our low-level database, based on sqlite3. mozStorage is used just about everywhere in our codebase, to power indexedDB, localStorage, but also site permissions, cookies, XUL templates, the download manager (*), forms, bookmarks, the add-ons manager (*), Firefox Health Report, the search service (*), etc. – not to mention numerous add-ons.

(*) Some components are currently moving away from mozStorage for performance and footprint reasons as they do not need the safety guarantees provided by mozStorage.

A long time ago, mozStorage and its users were completely synchronous and main-thread based. Needless to say, this eventually proved to be a design that doesn’t scale very well. So, we set out on a quest to make mozStorage off main thread-friendly and to move all these uses off the main thread.

These days, whether you are developing add-ons or contributing to the Mozilla codebase, everything you need to access storage off the main thread is readily available to you. Let me introduce the two recommended flavors.

Note: This blog entry does not cover using databases from *web applications* but from the *Mozilla Platform*. From web applications, you should use indexedDB.

Chrome Workers, now with modules!

July 17, 2013 § 3 Comments

One of the main objectives of Project Async is to encourage Firefox developers and Firefox add-on developers to use Chrome Workers to ensure that whatever they do doesn’t block Firefox’s UI thread. The main obstacle, for the moment, is that Chrome Workers have access to very few features, so one of the tasks of Project Async is to add features to Chrome Workers.

Today, let me introduce the Module Loader for Chrome Workers.

Announcing Project Async & Responsive

April 10, 2013 § 24 Comments

tl;dr

Project Snappy has been retired and replaced by several smaller projects, including Async & Responsive. The objective of this project is to improve the responsiveness of Firefox and the Mozilla Platform by converting key components to make them asynchronous and, wherever possible, to move them off the main thread.

The setting

Firefox and other Mozilla applications are great products, in particular in terms of performance. They are based on an extremely fast rendering engine, Gecko, and its companion JavaScript engine, which in addition to being the richest JS engine around, is also, these days, quite possibly the fastest. What is not so great, unfortunately, is that despite these great core performances, Mozilla applications have often been perceived as slow and sluggish.

Project Snappy was formed about 18 months ago to focus the effort by Mozilla developers to fight this perceived sluggishness. During this period, we have made tremendous progress, thanks to the commitment of everyone involved. Indeed, most of the long-term objectives of Snappy have been reached already. We have therefore decided to retire project Snappy, in favor of both a larger project Performance, and several sub-projects focusing on distinct aspects of Performance.

Let me introduce Asynchronous & Responsive [1], one of the sub-projects of Performance.

Project outline

Despite considerable progress, much of Firefox still behaves as a single-threaded application. Most services and components are initialized sequentially in the main thread, run in the main thread, and are shut down sequentially in the main thread. Also, most add-ons execute essentially in the main thread. As a consequence, any long-lived task can disrupt the user experience.

There are historical reasons for this, but in most cases, there is no deep blocker that would prevent us from rewriting services. Project Asynchronous & Responsive is now starting to support and focus the ongoing effort to get rid of main thread services and components, both in platform code and in add-on code, for the betterment of all Mozillakind.

This entails:

  • identifying blockers that prevent platform and add-on developers from deploying their code on non-main threads (generally, worker threads);
  • helping platform and add-on developers transition their code off-main thread;
  • actually transitioning some of our services and components off the main thread.

Please note that we have no intention of working on the JavaScript VM, on DOM or Graphics. These teams already have dedicated developers working on moving things off the main thread.

Following our progress

As I am the tech lead of this project, you will find more information on this blog, under category Performance.

I will try and post updates every second week.

[1] If you have an idea of a nicer name that does not sound too much like “Snappy”, we are interested :) Marxist jokes about Workers might or might not be accepted.

Beautiful Off Main Thread File I/O

October 18, 2012 § 7 Comments

Now that the main work on Off Main Thread File I/O for Firefox is complete, I have finally found some time to test-drive the combination of Task.js and OS.File. Let me tell you one thing: it rocks!

(re)introducing OS.File

June 27, 2012 § 6 Comments

OS.File is a new JavaScript library available to Firefox and Thunderbird developers and add-on developers. This library offers efficient, low-level, backgrounded, interaction with the file system, with a number of primitives to take advantage of the specific features of each platform. It is also a nice example of systems programming in JavaScript. Please use it, look at the code, and please report bugs and missing features.

(re)Introducing OS.File

A considerable aspect of our work, at Mozilla, is to ensure that the user experience is smooth and responsive. One of the main tools available to developers to permit such responsive code is multi-threading: any computation or interaction with the system that takes too long can (and should) be pushed into the background, and should interact asynchronously with the user interface.

Now, one of the critical bottlenecks in any application is I/O: accessing the disk (or the network, or the database…) is typically orders of magnitude slower than any in-memory operation – plus it can sometimes disrupt the user experience of the complete system. This is true on desktop systems and this is even more true on smartphones and tablets.

What this means is that we need a nice library to perform I/O, and by nice, I mean:

  • I/O should be backgrounded;
  • the number of I/O operations should be carefully controlled.

This is what OS.File is all about: OS.File is a library available to developers (including add-on developers) on the Mozilla platforms (Firefox, Thunderbird, Songbird, InstantBird, Boot-to-Gecko, etc.). This library is available (only) to JavaScript, and it offers low-level access to the file system, available to background threads.

As its name implies, OS.File is a system library, not a web library, so web application developers will not have access to it.

A first usable version of OS.File landed a few days ago and is now available on nightly builds of Mozilla Platform applications. We are progressively working on adding features, and I would like to invite all developers who need to do I/O to try it, report any bugs and request any features they need.

Using OS.File

OS.File offers both a cross-platform API (module OS.File itself) and bindings to platform-specific functions (modules OS.Win.File and OS.Unix.File), as well as utilities for system programming (modules OS.Shared and OS.Constants). In this post, I will only discuss module OS.File itself.

By design, in this first delivery, module OS.File is quite minimalistic. Features will be added progressively (see next section). You can find the documentation of OS.File on MDN, as usual.

For the moment, module OS.File can be used only from a chrome worker (i.e. a privileged JavaScript background thread).

Renaming a file


OS.File.move("a.tmp", "b.tmp");

In case of error, this will raise an exception of type OS.File.Error.

Copying a file, handling errors, options


try {
  OS.File.copy("b.tmp", "c.tmp", {noOverwrite: true});
} catch(ex) {
  if (ex.becauseNoSuchFile) {
    // b.tmp does not exist
  } else if (ex.becauseFileExists) {
    // c.tmp exists and we do not want to overwrite it
  }
}

Open a file, read a prefix


let buffer = new ArrayBuffer(12); // Also works with a js-ctypes C pointer
let file;
try {
  file = OS.File.open("myfile.tmp"); // No options: open for reading
  let bytes = file.read(buffer, 12);
  // Do something with these bytes
  // ...
} finally {
  if (file) {
    file.close();
  }
}

Open a file for writing


let file = OS.File.open("myfile.tmp", {create:true}); // Fail if the file already exists

Note that this operation will only require one I/O interaction with the operating system – this is much faster than first checking whether the file already exists, and then creating it if it does not.
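OS.File only runs inside Firefox, so the following sketch uses Node.js and its fs module as a stand-in (my assumption for the sake of a runnable example, not something OS.File provides). It illustrates the same idea: ask the operating system to create the file and to fail if it already exists, in a single call, rather than checking first and creating afterwards. The file name and createExclusively are hypothetical.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical scratch file in the system's temporary directory
const scratchPath = path.join(
  os.tmpdir(),
  "exclusive_create_" + process.pid + "_" + Date.now() + ".tmp"
);

// Create the file in one call. The "wx" flag makes the call fail if the
// file already exists, so no separate existence check is needed.
function createExclusively(filePath) {
  try {
    fs.closeSync(fs.openSync(filePath, "wx"));
    return "created";
  } catch (ex) {
    if (ex.code === "EEXIST") {
      return "already exists";
    }
    throw ex; // Any other error is unexpected
  }
}

const firstAttempt = createExclusively(scratchPath);  // "created"
const secondAttempt = createExclusively(scratchPath); // "already exists"

fs.unlinkSync(scratchPath); // Clean up the scratch file
```

Besides saving one interaction with the operating system, the single-call form has no window between the check and the creation during which another process could create the file.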

Open a file with OS-specific options


let file = OS.File.open("myfile.tmp",
  {create:true},
  {unixMode: OS.Constants.libc.S_IRWXU | OS.Constants.libc.S_IRWXG }
);

Short FAQ

What’s good about OS.File?

  • Finally, file I/O for JavaScript workers.
  • An API much more JavaScript-friendly than what already existed in the Mozilla Platform.
  • Options and low-level functions to ensure that we perform a minimal amount of actual I/O.

Wasn’t all that already possible?

The existing I/O libraries on the Mozilla Platform could not be used from background threads. Some functions could be backgrounded, but only very few of them.

JavaScript-friendly wrappers had been written around these libraries, but they only covered a few of the features of these libraries, in addition to which they could not be used from background threads either.

How is OS.File implemented?

OS.File is implemented in pure JavaScript, using the (very nice) js-ctypes library to perform calls to the OS APIs.

Why JavaScript and not C++?

Because we want the code to be easily accessible to the community.

Isn’t that slow?

Well, firstly, JavaScript has grown into a very fast language. These days, expecting without benchmarks that C++ is faster than JavaScript on hot code can cause surprises.

In addition, writing the library in C++ would have meant that we needed to cross language barriers quite often, which is bad for performance, due to:

  • complex memory management;
  • bad JIT-ability; and
  • need to convert all data structures, in particular strings.

We attempt to avoid this as much as possible.

For the moment, however, OS.File has not been benchmarked. We await real-world applications.

Work in progress

We are currently hard at work extending OS.File. The next few landings should add:

Features are driven by application requirements, so if you need some other feature, please do not hesitate to contact me on IRC or to file a bug on Bugzilla.

Introducing JavaScript native file management

December 6, 2011 § 28 Comments

Summary

The Mozilla Platform keeps improving: JavaScript native file management is ongoing work to provide a high-performance, JavaScript-friendly API to manipulate the file system.

The Mozilla Platform, JavaScript and Files

The Mozilla Platform is the application development framework behind Firefox, Thunderbird, Instantbird, Camino, Songbird and a number of other applications.

While the performance-critical components of the Mozilla Platform are developed in C/C++, an increasing number of components and add-ons are implemented in pure JavaScript. While JavaScript cannot hope to match the speed or robustness of C++ yet (edit: at least not on all aspects), the richness and dynamism of the language permit the creation of extremely flexible and developer-friendly APIs, as well as quick prototyping and concise implementation of complex algorithms without the fear of memory errors and with features such as higher-level programming, asynchronous programming and now clean and efficient multi-threading. If you combine this with the impressive speed-ups experienced by JavaScript in the recent years, it is easy to understand why the language has become a key element in the current effort to make the Mozilla Platform and its add-ons faster and more responsive at all levels.

OCaml Discussions

January 26, 2008 § Leave a comment

The OCaml community has just gained a place to discuss recommendations on standardization.

For those who are interested, head over to the Alliance OCaml wiki.

Projects are also being set up for an upcoming Summer of Code. If you have topics to suggest, head over to the corresponding page.

Where Am I?

You are currently browsing entries tagged with threads at Il y a du thé renversé au bord de la table.
