Designing the Firefox Performance Monitor (2): Monitoring Add-ons and Webpages

November 6, 2015 § Leave a comment

In part 1, we discussed the design of time measurement within the Firefox Performance Monitor. Despite the intuition, the Performance Monitor had neither the same set of objectives as the Gecko Profiler, nor the same set of constraints, and we ended up picking a design that was not a sampling profiler. In particular, instead of capturing performance data on stacks, the Monitor captures performance data on Groups, a notion that we have not discussed yet. In this part, we will focus on bridging the gap between our low-level instrumentation and actual add-ons and webpages, as may be seen by the user.

« Read the rest of this entry »

Designing the Firefox Performance Stats Monitor, part 1: Measuring time without killing battery or performance

October 27, 2015 § Leave a comment

For a few versions, Firefox Nightly has been monitoring the performance of add-ons, thanks to the Performance Stats API. While we are waiting for the greenlight to let it graduate to Firefox Aurora, as well as investigating a few lingering false-positives, and while v2 is approaching steadily, it is time for a brain dump on this toolbox and its design.

The initial objective of this monitor is to be able to flag both add-ons and webpages that cause noticeable slowdowns, so as to let users disable/close whatever is making their use of Firefox miserable. We also envision more advanced uses that could let us find out if features of webpages cause slowdowns on specific OS/hardware combinations.

« Read the rest of this entry »

Detecting slow add-ons

May 6, 2015 § 13 Comments

When it is at its best, Firefox is fast. Really, really fast. When things start slowing down, though, using Firefox is much less fun. So, one of main objectives of the developers of Firefox is making sure that Firefox is and remains as smooth and responsive as humanly possible. There is, however, one thing that can slow down Firefox, and that remains out of the control of the developers: add-ons. Good add-ons are extraordinary, but small coding errors – or sometimes necessary hacks – can quickly drive the performance of Firefox into the ground.

So, how can an add-on developer (or add-on reviewer) find out whether her add-on is fast? Sadly, not much. Testing certainly helps, and the Profiler is invaluable to help pinpoint a slowdown once it has been noticed, but what about the performance of add-ons in everyday use? What about the experience of users?

To solve this issue, we decided to work on a set of tools to help add-on developers and reviewers find out the performance of their add-ons. Oh, and also to let users find out quickly if an add-on is slowing down their everyday experience.


On recent Nightly builds of Firefox, you may now open about:performance to get an overview of the performance cost of add-ons and webpages :

Screen Shot 2015-05-06 at 17.46.15

The main resources we monitor are :

  • jank, which measures how much the add-on impacts the responsiveness of Firefox. For 60fps performance, jank should always remain ≤ 4. If an add-on regularly causes jank to increase past 6, you should be worried.
  • CPOW aka blocking cross-process communications, which measures how much the add-on is causing Firefox to freeze waiting for a process to respond. Anything above 0 is bad.

Note that the design of this page is far from stable. I realise it’s not very user-friendly at the moment, so don’t hesitate to file bugs to help us improve it. Also note that, when running with e10s, the page doesn’t display all the useful information. We are working on it.

add-on Telemetry

Add-on developers and reviewers can now find information on the performance of their add-ons on a dedicated dashboard.

These are real-world performance data, as extracted from user’s computers. The two histograms available for the time being are:

  • MISBEHAVING_ADDONS_JANK_LEVEL, which measures the jank, as detailed above;
  • MISBEHAVING_ADDONS_CPOW_TIME_MS, which measure the amount of time spent in CPOW, as detailed above.

If you are an add-on developer, you should monitor regularly the performance of your add-on on this page. If you notice suspicious values, you should try and find out what causes these performance issues. Don’t hesitate and reach out to us, we will try and help you.

Slow add-on Notification

Add-on developers and reviewers, as well as end-users, are now informed when an add-on causes either jank or CPOW performance issues:

Screen Shot 2015-05-06 at 19.16.19

Note that this feature is not ready to ride the trains, and we do not have a specific idea of when it will be made available for users of Aurora/DeveloperEdition. This is partly because the UX is not good enough yet, partly because the thresholds will certainly change, and partly because we want to give add-on developers time to fix any issue before the users see a dialog that suggest that an add-on should be uninstalled.

Performance Stats API

By the way, we have an API for accessing performance stats. Very imaginatively, it’s called PerformanceStats.jsm [link]. While this API will still change during the coming weeks you can start playing with it if you are interested. Some add-ons may be able to throttle their performance use based on this data. Also, I hope that, in time, someone will be able to write a version of about:performance much nicer than mine :)

Challenges and work ahead

For the moment, we are in the process of stabilizing the API, its implementation and its performance. In parallel, we are working on making the UX of about:performance more useful. Once both are done, we are going to proceed with adding more measurements, making the code more e10s-friendly and measuring the performance of webpages.

If you are an add-on developer and if you feel that your add-on is tagged as slow by error, or if you have great ideas on how to make this data useful, feel free to ping me, preferably on IRC. You can find me on, channel #developers, where I am Yoric.

What David Did During Q3

September 30, 2014 § 6 Comments

September is ending, and with it Q3 of 2014. It’s time for a brief report, so here is what happened during the summer.

Session Restore

After ~18 months working on Session Restore, I am progressively switching away from that topic. Most of the main performance issues that we set out to solve have been solved already, we have considerably improved safety, cleaned up lots of the code, and added plenty of measurements.

During this quarter, I have been working on various attempts to optimize both loading speed and saving speed. Unfortunately, both ongoing works were delayed by external factors and postponed to a yet undetermined date. I have also been hard at work on trying to pin down performance regressions (which turned out to be external to Session Restore) and safety bugs (which were eventually found and fixed by Tim Taubert).

In the next quarter, I plan to work on Session Restore only in a support role, for the purpose of reviewing and mentoring.

Also, a rant The work on Session Restore has relied heavily on collaboration between the Perf team and the FxTeam. Unfortunately, the resources were not always available to make this collaboration work. I imagine that the FxTeam is spread too thin onto too many tasks, with too many fires to fight. Regardless, the symptom I experienced is that during the course of this work, both low-priority, high-priority and safety-critical patches have been left to rot without reviews, despite my repeated requests, for 6, 8 or 10 weeks, much to the dismay of everyone involved. This means man·months of work thrown to /dev/null, along with quarterly objectives, morale, opportunities, contributors and good ideas.

I will try and blog about this, eventually. But please, in the future, everyone: remember that in the long run, the priority of getting reviews done (or explaining that you’re not going to) is a quite higher than the priority of writing code.

Async Tooling

Many improvements to Async Tooling landed during Q3. We now have the PromiseWorker, which simplifies considerably the work of interacting between the main thread and workers, for both Firefox and add-on developers. I hear that the first add-on to make use of this new feature is currently being developed. New features, bugfixes and optimizations landed for OS.File. We have also landed the ability to watch for changes in a directory (under Windows only, for the time being).

Sadly, my work on interactions between Promise and the Test Suite is currently blocked until the DevTools team manages to get all the uncaught asynchronous errors under control. It’s hard work, and I can understand that it is not a high priority for them, so in Q4, I will try to find a way to land my work and activate it only for a subset of the mochitest suites.


I have recently joined the newly restarted effort to improve the performance of Places, the subsystem that handles our bookmarks, history, etc. For the moment, I am still getting warmed up, but I expect that most of my work during Q4 will be related to Places.


Most of my effort during Q3 was spent improving the Shutdown of Firefox. Where we already had support for shutting down asynchronously JavaScript services/consumers, we now also have support for native services and consumers. Also, I am in the process of landing Telemetry that will let us find out the duration of the various stages of shutdown, an information that we could not access until now.

As it turns out, we had many crashes during asynchronous shutdown, a few of them safety-critical. At the time, we did not have the necessary tools to determine to prioritize our efforts or to find out whether our patches had effectively fixed bugs, so I built a dashboard to extract and display the relevant information on such crashes. This proved a wise investment, as we spent plenty of time fighting AsyncShutdown-related fires using this dashboard.

In addition to the “clean shutdown” mechanism provided by AsyncShutdown, we also now have the Shutdown Terminator. This is a watchdog subsystem, launched during shutdown, and it ensures that, no matter what, Firefox always eventually shuts down. I am waiting for data from our Crash Scene Investigators to tell us how often we need this watchdog in practice.


I lost track of how many code contributors I interacted with during the quarter, but that represents hundreds of e-mails, as well as countless hours on IRC and Bugzilla, and a few hours on This year’s mozEdu teaching is also looking good.

We also launched FirefoxOS in France, with big success. I found myself in a supermarket, presenting the ZTE Open C and the activities of Mozilla to the crowds, and this was a pleasing experience.

For Q4, expect more mozEdu, more mentoring, and more sleepless hours helping contributors debug their patches :)

Souhaitez-vous aider le renard à accélérer ?

July 2, 2014 § Leave a comment

Je reçois régulièrement des propositions de volontaires qui souhaiteraient contribuer à Firefox. Habituellement, je les guide vers Bugs Ahoy – si vous ne connaissez pas Bugs Ahoy, foncez le voir, ce moteur de recherche dédié aux tâches accessibles aux débutants est fabuleux. Aujourd’hui, changement de programme : si vous souhaitez contribuer à Firefox, et plus précisément si vous souhaitez contribuer à améliorer les performances de Firefox, voici quelques manières de participer à l’effort de l’équipe Performance.

Session Restore

Session Restore est le composant de Firefox chargé de sauvegarder l’état du navigateur en permanence pour permettre de récupérer d’un crash du navigateur, du système ou du matériel ou d’un redémarrage intempestif sans perdre de données. Je suis en train de réécrire certaines parties de Session Restore pour améliorer sa réactivité (en le rendant parallèle) et sa contribution au temps de démarrage.

Pour vous lancer, quelques bugs d’introduction, e-mentorés par moi :

Tous ces bugs sont en JavaScript.

File I/O

OS.File est le composant de Firefox qui permet à JavaScript d’accéder au disque à haute performance. Je suis en train de réécrire certaines parties de OS.File pour le rendre plus extensible et pour améliorer la réactivité de certaines fonctions critiques. Quelques bugs d’introduction, e-mentorés par moi :

Q2 2014 Report

July 1, 2014 § Leave a comment

Q2 2014 was a difficult quarter at Mozilla, with all the agitation around Brendan Eich, Australis, Media Extensions, etc. Still, I have the feeling that we managed to get a lot done despite the intense pressure. Here is a quick highlight of my main accomplishments for Q2 2014.

Session Restore

A considerable amount of my time was spent working on Session Restore. The main objective is to decrease the jank caused by Session Restore taking snapshots of the session and to decrease the time Session Restore takes to restore the state of Firefox. Much of the activity this quarter dealt with measuring performance, so as to best optimize it and improving safety.

Reworking Session Restore backups

With Firefox 33, the backups of Session Restore state have been completely redesigned. The new system should prove orders of magnitude safer, in addition to now being fully transparent.

Next steps We are still lacking measurements to confirm that this is as successful as the mathematics suggest. If you are interested, there is a mentored bug open.

Talos tests and Telemetry on Session Restore startup

Optimizing startup is difficult, and generally impossible if you do not know what to optimize. With Firefox 32 and 33, we have new benchmarks and real world measurements to help us determine immediately the influence of patches on Session Restore startup.

Next steps Using these benchmarks to experiment with possible optimizations. This is in progress.

Cleaning up Session Restore file

One of our objectives is to decrease the size of the Session Restore file, to reduce the amount of I/O (hence battery use and hardware wear and tear) and memory usage. As a first step, we have introduced a mechanism that progressively removes from the “Undo Close” feature tabs and windows that have been closed at least 2 weeks ago. Interestingly, Telemetry indicates that this clean-up has no effect on the size of the Session Restore file. Experiments run later during the quarter, using the Talos tests, also strongly suggest that the data that we could clean up and that we do not clean up yet have essentially no influence on startup duration.

Next steps I believe that this strategy will therefore not be pursued during the next quarters.

Preserving compatibility with Tor Browser

While refactoring Session Restore, we have hit a number of obstacles in the form of add-ons using private or semi-private APIs that we wished to remove. We have managed to work along with add-on authors and, as far as I know, we have not broken any add-on yet. In particular, we have maintained compatibility with the Tor Browser, which is a heavily customized distribution of Firefox targeted towards privacy.

Next steps Providing a clean API for add-ons. This will require discussing with add-on authors to find out what they need.

Async tooling

I am in charge of the Async Project, which is all about giving front-end and add-on developers tools to develop asynchronous code that does not jank. As usual, this involved plenty of activity in a number of different directions.

Auto-closing Sqlite.jsm databases (mentoring Michael Brennan)

Sqlite.jsm databases can now be closed automatically during garbage-collection. On user’s computers, this will increase safety, as failing to close a database causes shutdown-time assertion failures. However, to use resources effectively, pragmatism dictates that databaes should be closed manually, so failing to close a database in the Mozilla codebase will still cause test failures.

Reworking OS.File shutdown

On devices with little memory (typically Firefox Phones), one of the techniques used to save memory is to shutdown the OS.File worker as early as possible, re-launching it later if necessary. As it turns out, the task is more complicated than it seems, due to possibilities of race conditions. Unfortunately, this means that in some extreme cases, Firefox OS applications could lock down and fail to shutdown properly without being killed by the OS. This is now fixed. Somewhere along the way, this helped us to make the PromiseWorker used by OS.File more resilient to low-level errors.

Next steps Making the PromiseWorker usable by other modules than OS.File, including testing and add-ons.

OS.File for Android and Firefox OS

OS.File was initially designed for desktop devices. Now that it is used in a number of places on mobile devices, I have mercilessly hunted down all compatibility issues between OS.File and our two mobile platforms. Compatibility tests are now activated on all platforms and should avoid any regression.

AsyncShutdown Barrier mechanism

The shutdown process of Firefox has always been a dark and scary place, full of unspecified dependencies. As a result, any refactoring or addition a new dependency could break many things in new and interesting ways. I have introduced the AsyncShutdown Barrier mechanism that lets us specify clear, explicit and extensible dependencies, handles ordering of shutdowns, as well as error reporting if a dependency is unmet. This Barrier is now used by Sqlite.jsm, OS.File, Firefox Health Report, Session Restore, Page Thumbnails and fixes a number of major issues.

Next steps Porting AsyncShutdown Barrier to allow native components to register with it.

Fixing Firefox 30 shutdown freezes (with Tim Taubert)

Many users of Firefox 30 encountered issues that caused Firefox to freeze during shutdown. We found out that the issue was caused was triggered by Page Thumbnails and caused by a bug in ChromeWorkers, which did not handle an error case gracefully. I applied AsyncShutdown Barrier to ensure that Page Thumbnails always completed without triggering the error case, while Tim Taubert ensured that the Chrome Workers handled the error robustly.

Making Firefox Health Report shutdown more robust

While porting Firefox Health Report to AsyncShutdown, we encountered an elusive bug that manifested itself by causing rare shutdown crashes. After months of experimenting, instrumenting and attempting to fix the issue, we eventually traced it back to a more serious bug in shutdown, which apparently does not always send the proper notifications. Using the AsyncShutdown Barrier, we managed to work around the issue and make FHR’s shutdown both more robust and better instrumented in case of crash. This later helped us locate another issue that prevents a proper shutdown when some databases have been corrupted.

Next steps Fix the upstream shutdown bug, make our shutdown more resilient in case of database corruption.

Async testing

The other aspect of writing asynchronous code is making sure that developers can debug it. Now that we have hit a critical mass of developers writing async code, it was high time to help them work with it.

Rewriting Task stack traces to be meaningful

Now that we know how to handle uncaught errors, the main remaining weaknesses of Promise-based and Task-based code is that their stack traces lose much information. Since Firefox 33, Task-based stack traces are now transparently rewritten into something developer-redable. Somewhere along the way, I have also patched xpcshell and mochitests to ensure that they take advantage of this rewriting. Experience shows that this is very useful and that the runtime cost is negligible.

Next steps Evaluate the runtime cost of doing the same thing for Promise-based code.

Making xpcshell tests fail in case of uncaught promise error

Uncaught promise errors were treated by the test suites as warnings, TBPL did not report them, and they remained consequantly more often than not ignored (or even unseen) by the developers. I have reworked the xpcshell test harness to consider all uncaught promise errors as oranges and fixed all offenders.

Next steps Doing the same for mochitests. Code is ready, but a few offenders remain.


Dealing with political feedback around the nomination and departure of Brendan Eich

Along with many others, I made my best to engage people who voiced their negative feedback either at the nomination or at the departure of Brendan Eich. Unfortunately, this took time and efforts, but I believe that staying in touch with our users is part of what makes the difference between Mozilla and other browser vendors.

Working with new contributors

I estimate that I have worked with ~30 potential new contributors during the quarter. Many have unfortunately decided to postpone or abandon their efforts towards contributing, but a few have stayed, to work either with me or with other teams. At the moment, I am following 5 promising contributors. In particular, I am quite happy to welcome Dexter (who is working on a very sophisticated patch to let code watch for file modifications) and Kushagra (who has landed several test suite bugs).

Next steps More of it!

Working with universities

A group of École Centrale de Lyon successfully completed an online tool to help grassroot projects find volunteers. It was nice mentoring them.


I was invited to deliver a presentation on performance at Zedge, in Trondheim, Norway. That was fun :)

Next steps Publish the slides.

And now?

Let’s get started with Q3!

Recent changes to OS.File

April 8, 2014 § 5 Comments

A quick post to summarize some of the recent improvements to OS.File.


To write a string, you can now pass the string directly to writeAtomic:

OS.File.writeAtomic(path, "Here is a string", { encoding: "utf-8"})

Similarly, you can now read strings from read:, { encoding: "utf-8" } ); // Resolves to a string.

Doing this is at least as fast as calling TextEncoder/TextDecoder yourself (see below).

Native implementation has been reimplemented in C++. The main consequence is that this function can now be used safely during startup, without having to wait for the underlying OS.File ChromeWorker to start. Also, decoding (see above) is performed off the main thread, which makes it much faster.

According to my benchmarks, using to read strings is about 2-5x faster than NetUtil.asyncFetch on large files and doesn’t block the main thread for more than 5ms, while asyncFetch performs string decoding on the main thread. Also, it doesn’t perform any main thread I/O by opposition to NetUtil.asyncFetch.


When using writeAtomic, it is now possible to request existing files to be backed up almost atomically. In many cases, this is a good strategy to ensure that data is safely written to disk, without having to use a flush, which would be expensive for the whole system.

yield OS.File.writeAtomic(path, data, { tmpPath: path + ".tmp", backupTo: path + ".backup} } );


writeAtomic and read both now support an implementation of lz4 compression

yield OS.File.writeAtomic(path, data, { compression: "lz4"});
yield, { compression: "lz4"});

Note that this format will not be understood by any command-line tool. It is somewhat proprietary. Also note that (de)compression is performed on the ChromeWorker thread for the time being, so it doesn’t benefit from the native reimplementation mentioned above.

Creating directories recursively

let dir = OS.Path.join(OS.Constants.Path.profileDir, "a", "b", "c", "d");
yield OS.File.makeDir(dir, { from: OS.Constants.Path.profileDir });

Where Am I?

You are currently browsing the C++ category at Il y a du thé renversé au bord de la table.


Get every new post delivered to your Inbox.

Join 37 other followers