July 1, 2014
Q2 2014 was a difficult quarter at Mozilla, with all the agitation around Brendan Eich, Australis, Media Extensions, etc. Still, I have the feeling that we managed to get a lot done despite the intense pressure. Here is a quick highlight of my main accomplishments for Q2 2014.
A considerable amount of my time was spent working on Session Restore. The main objectives are to decrease the jank caused by Session Restore taking snapshots of the session and to decrease the time Session Restore takes to restore the state of Firefox. Much of the activity this quarter dealt with measuring performance, so as to optimize it effectively, and with improving safety.
Reworking Session Restore backups
With Firefox 33, the backups of Session Restore state have been completely redesigned. The new system should prove orders of magnitude safer, in addition to now being fully transparent.
Next steps We are still lacking measurements to confirm that this is as successful as the mathematics suggest. If you are interested, there is a mentored bug open.
Talos tests and Telemetry on Session Restore startup
Optimizing startup is difficult, and generally impossible if you do not know what to optimize. With Firefox 32 and 33, we have new benchmarks and real world measurements to help us determine immediately the influence of patches on Session Restore startup.
Next steps Using these benchmarks to experiment with possible optimizations. This is in progress.
Cleaning up Session Restore file
One of our objectives is to decrease the size of the Session Restore file, to reduce the amount of I/O (hence battery use and hardware wear and tear) and memory usage. As a first step, we have introduced a mechanism that progressively removes from the “Undo Close” feature the tabs and windows that were closed at least two weeks ago. Interestingly, Telemetry indicates that this clean-up has no effect on the size of the Session Restore file. Experiments run later during the quarter, using the Talos tests, also strongly suggest that the data that we could clean up but do not clean up yet has essentially no influence on startup duration.
Next steps I believe that this strategy will therefore not be pursued during the next quarters.
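To make the clean-up rule concrete, here is a minimal sketch in Python. It is not the Session Restore code (which is JavaScript): the `ClosedEntry` shape and the `prune_closed` helper are hypothetical, and it prunes in a single pass where the real mechanism works progressively.

```python
from dataclasses import dataclass

TWO_WEEKS_SECONDS = 14 * 24 * 60 * 60


@dataclass
class ClosedEntry:
    """A closed tab or window kept for "Undo Close" (hypothetical shape)."""
    title: str
    closed_at: float  # seconds since the epoch


def prune_closed(entries, now, max_age=TWO_WEEKS_SECONDS):
    """Keep only the entries closed less than `max_age` seconds before `now`."""
    return [entry for entry in entries if now - entry.closed_at < max_age]


# Usage: one entry closed 15 days ago, one closed yesterday.
DAY = 24 * 60 * 60
now = 100 * DAY
entries = [
    ClosedEntry("old tab", now - 15 * DAY),
    ClosedEntry("recent tab", now - 1 * DAY),
]
kept = prune_closed(entries, now)
```

Only the recent entry survives; an entry closed exactly two weeks ago is dropped, matching "at least two weeks ago".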
Preserving compatibility with Tor Browser
While refactoring Session Restore, we have hit a number of obstacles in the form of add-ons using private or semi-private APIs that we wished to remove. We have managed to work along with add-on authors and, as far as I know, we have not broken any add-on yet. In particular, we have maintained compatibility with the Tor Browser, which is a heavily customized distribution of Firefox targeted towards privacy.
Next steps Providing a clean API for add-ons. This will require discussing with add-on authors to find out what they need.
I am in charge of the Async Project, which is all about giving front-end and add-on developers tools to develop asynchronous code that does not jank. As usual, this involved plenty of activity in a number of different directions.
Auto-closing Sqlite.jsm databases (mentoring Michael Brennan)
Sqlite.jsm databases can now be closed automatically during garbage collection. On users’ computers, this will increase safety, as failing to close a database causes shutdown-time assertion failures. However, to use resources effectively, pragmatism dictates that databases should be closed manually, so failing to close a database in the Mozilla codebase will still cause test failures.
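The same policy (close automatically as a safety net, but still flag the missing manual close) can be sketched in Python with `weakref.finalize`. This is an analogy only: the `Connection` class and its warning are hypothetical and do not reproduce the Sqlite.jsm API.

```python
import gc
import warnings
import weakref


class Connection:
    """Hypothetical database handle that closes itself when garbage-collected."""

    def __init__(self, path):
        self.path = path
        # State lives outside the object so the finalizer can reach it
        # without keeping the Connection itself alive.
        self._state = {"open": True}
        self._finalizer = weakref.finalize(self, _auto_close, path, self._state)

    def close(self):
        """Explicit close: the preferred path, which also disarms the finalizer."""
        self._state["open"] = False
        self._finalizer.detach()


def _auto_close(path, state):
    """Safety net: runs only if the owner forgot to call close()."""
    state["open"] = False
    warnings.warn(f"{path} was never closed explicitly", ResourceWarning)


def demo():
    """Drop a connection without closing it and report what the safety net did."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        conn = Connection("example.sqlite")
        state = conn._state
        del conn
        gc.collect()
    return state["open"], [str(w.message) for w in caught]
```

A test harness could treat the warning as a failure, while a release build would simply let the finalizer do its job.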
Reworking OS.File shutdown
On devices with little memory (typically Firefox Phones), one of the techniques used to save memory is to shut down the OS.File worker as early as possible, re-launching it later if necessary. As it turns out, the task is more complicated than it seems, due to possible race conditions. Unfortunately, this meant that in some extreme cases, Firefox OS applications could lock up and fail to shut down properly without being killed by the OS. This is now fixed. Somewhere along the way, this helped us make the PromiseWorker used by OS.File more resilient to low-level errors.
Next steps Making the PromiseWorker usable by other modules than OS.File, including testing and add-ons.
OS.File for Android and Firefox OS
OS.File was initially designed for desktop devices. Now that it is used in a number of places on mobile devices, I have mercilessly hunted down all compatibility issues between OS.File and our two mobile platforms. Compatibility tests are now activated on all platforms and should avoid any regression.
AsyncShutdown Barrier mechanism
The shutdown process of Firefox has always been a dark and scary place, full of unspecified dependencies. As a result, any refactoring or addition of a new dependency could break many things in new and interesting ways. I have introduced the AsyncShutdown Barrier mechanism, which lets us specify clear, explicit and extensible dependencies, and which handles the ordering of shutdowns as well as error reporting if a dependency is unmet. This Barrier is now used by Sqlite.jsm, OS.File, Firefox Health Report, Session Restore and Page Thumbnails, and fixes a number of major issues.
Next steps Porting AsyncShutdown Barrier to allow native components to register with it.
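The idea of a barrier can be sketched in a few lines of Python with `asyncio`. This is a toy model, not the AsyncShutdown API: the `Barrier` class, `add_blocker`, the blocker names and the timeout policy are all assumptions made for illustration.

```python
import asyncio


class Barrier:
    """Toy shutdown barrier: wait() returns once every registered blocker has finished."""

    def __init__(self, name):
        self.name = name
        self._blockers = {}  # blocker name -> asyncio future

    def add_blocker(self, name, awaitable):
        """Register a named condition that must complete before shutdown proceeds.

        Must be called from inside a running event loop.
        """
        self._blockers[name] = asyncio.ensure_future(awaitable)

    async def wait(self, timeout):
        """Wait for all blockers; return the names of those still pending at timeout."""
        if not self._blockers:
            return []
        _done, pending = await asyncio.wait(
            list(self._blockers.values()), timeout=timeout
        )
        unmet = [name for name, fut in self._blockers.items() if fut in pending]
        for fut in pending:
            fut.cancel()  # give up on blockers that did not finish in time
        return unmet


async def demo():
    """One blocker finishes quickly, the other never does within the timeout."""
    barrier = Barrier("shutdown")
    barrier.add_blocker("flush database", asyncio.sleep(0.01))
    barrier.add_blocker("stuck service", asyncio.sleep(60))
    return await barrier.wait(timeout=0.2)


unmet = asyncio.run(demo())
```

Because each dependency is registered under a name, the barrier can report exactly which one was unmet (here, `"stuck service"`) instead of shutdown failing in an unexplained way.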
Fixing Firefox 30 shutdown freezes (with Tim Taubert)
Many users of Firefox 30 encountered issues that caused Firefox to freeze during shutdown. We found out that the issue was triggered by Page Thumbnails and caused by a bug in Chrome Workers, which did not handle an error case gracefully. I applied AsyncShutdown Barrier to ensure that Page Thumbnails always completed without triggering the error case, while Tim Taubert ensured that the Chrome Workers handled the error robustly.
Making Firefox Health Report shutdown more robust
While porting Firefox Health Report to AsyncShutdown, we encountered an elusive bug that manifested itself by causing rare shutdown crashes. After months of experimenting, instrumenting and attempting to fix the issue, we eventually traced it back to a more serious bug in shutdown, which apparently does not always send the proper notifications. Using the AsyncShutdown Barrier, we managed to work around the issue and make FHR’s shutdown both more robust and better instrumented in case of crash. This later helped us locate another issue that prevents a proper shutdown when some databases have been corrupted.
Next steps Fix the upstream shutdown bug, make our shutdown more resilient in case of database corruption.
The other aspect of writing asynchronous code is making sure that developers can debug it. Now that we have hit a critical mass of developers writing async code, it was high time to help them work with it.
Rewriting Task stack traces to be meaningful
Now that we know how to handle uncaught errors, the main remaining weakness of Promise-based and Task-based code is that their stack traces lose much information. Since Firefox 33, Task-based stack traces are transparently rewritten into something developer-readable. Somewhere along the way, I have also patched xpcshell and mochitests to ensure that they take advantage of this rewriting. Experience shows that this is very useful and that the runtime cost is negligible.
Next steps Evaluate the runtime cost of doing the same thing for Promise-based code.
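One plausible way to make such traces readable is to hide the frames that belong to the task runner itself and to append the frames of the code that launched the task. The Python sketch below illustrates that idea only; the frame format, the `scheduler.js` file name and the `rewrite_stack` helper are hypothetical, and the post does not say that the actual rewriting works this way.

```python
INTERNAL_FILES = ("scheduler.js",)  # hypothetical file holding the task runner


def rewrite_stack(error_frames, launch_frames, internal_files=INTERNAL_FILES):
    """Drop runner-internal frames and append the frames that launched the task."""

    def is_user_frame(frame):
        # Frames are "function@file:line" strings in this sketch.
        location = frame.split("@", 1)[-1]
        file_name = location.rsplit(":", 1)[0]
        return file_name not in internal_files

    visible = [frame for frame in error_frames if is_user_frame(frame)]
    launch = [frame for frame in launch_frames if is_user_frame(frame)]
    return visible + ["(task launched from)"] + launch


# Usage: the raw trace stops inside the runner; the rewritten one shows who started the task.
raw = ["readConfig@config.js:12", "step@scheduler.js:88", "run@scheduler.js:40"]
launched_from = ["spawn@scheduler.js:20", "startup@main.js:5"]
readable = rewrite_stack(raw, launched_from)
```

The rewritten trace keeps `readConfig@config.js:12`, then a separator, then `startup@main.js:5`, which is the information a developer needs to find the caller.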
Making xpcshell tests fail in case of uncaught promise error
Uncaught promise errors were treated by the test suites as warnings and TBPL did not report them, so more often than not they were ignored (or even unseen) by developers. I have reworked the xpcshell test harness to consider all uncaught promise errors as oranges and fixed all offenders.
Next steps Doing the same for mochitests. Code is ready, but a few offenders remain.
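The harness policy can be illustrated with a toy model in Python. Everything here is hypothetical (the `TrackedPromise` class, `run_test`, the PASS/FAIL strings); it only shows the principle of failing a test when a rejection is left without a handler.

```python
class TrackedPromise:
    """Toy promise that records rejections nobody has handled (hypothetical)."""

    unhandled = []  # rejected promises with no error handler attached

    def __init__(self):
        self.error = None
        self.handled = False

    def reject(self, error):
        """Reject the promise; remember it if no handler is attached yet."""
        self.error = error
        if not self.handled:
            TrackedPromise.unhandled.append(self)

    def catch(self, handler):
        """Attach an error handler, which clears the 'unhandled' status."""
        self.handled = True
        if self in TrackedPromise.unhandled:
            TrackedPromise.unhandled.remove(self)
        if self.error is not None:
            handler(self.error)


def run_test(test_function):
    """Run one test; uncaught rejections count as failures, not warnings."""
    TrackedPromise.unhandled.clear()
    test_function()
    leaked = [str(promise.error) for promise in TrackedPromise.unhandled]
    return ("FAIL", leaked) if leaked else ("PASS", [])


def leaky_test():
    TrackedPromise().reject(ValueError("disk full"))  # nobody handles this


def careful_test():
    promise = TrackedPromise()
    promise.reject(ValueError("disk full"))
    promise.catch(lambda error: None)  # error is handled explicitly
```

With this policy, `run_test(leaky_test)` reports a failure along with the leaked error message, while `run_test(careful_test)` passes.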
Dealing with political feedback around the nomination and departure of Brendan Eich
Along with many others, I did my best to engage people who voiced their negative feedback either at the nomination or at the departure of Brendan Eich. Unfortunately, this took time and effort, but I believe that staying in touch with our users is part of what makes the difference between Mozilla and other browser vendors.
Working with new contributors
I estimate that I have worked with ~30 potential new contributors during the quarter. Many have unfortunately decided to postpone or abandon their efforts towards contributing, but a few have stayed, to work either with me or with other teams. At the moment, I am following 5 promising contributors. In particular, I am quite happy to welcome Dexter (who is working on a very sophisticated patch to let code watch for file modifications) and Kushagra (who has landed several test suite bugs).
Next steps More of it!
Working with universities
A group of students from École Centrale de Lyon successfully completed an online tool to help grassroots projects find volunteers. It was nice mentoring them.
I was invited to deliver a presentation on performance at Zedge, in Trondheim, Norway. That was fun.
Next steps Publish the slides.
Let’s get started with Q3!
July 17, 2013 § 3 Comments
One of the main objectives of Project Async is to encourage Firefox developers and Firefox add-on developers to use Chrome Workers to ensure that whatever they do doesn’t block Firefox’s UI thread. The main obstacle, for the moment, is that Chrome Workers have access to very few features, so one of the tasks of Project Async is to add features to Chrome Workers.
Today, let me introduce the Module Loader for Chrome Workers.
March 16, 2013 § 5 Comments
These days, everybody seems to be talking about Firefox OS. About how removing the barrier of the marketplace will make the world a better place, or about how HTML5 is so darn great, or about the fact that a gazillion manufacturers and operators are supporting Firefox OS. And that’s great, because Firefox OS is an impressively good product and deserves this attention.
However, all this craze is missing one feature that makes Firefox OS my choice of mobile operating system: I can write a playable prototype for a simple game, from scratch, in two hours.
Of course, this was a prototype, and completing the game took me a few more days of adding 8 bit graphics, optimizing, toying with the rules, adding difficulty levels, high scores, etc. But after just two hours, I could play the game on computer, tablet and cellphone, and decide where to proceed from here. This was both my first HTML5 game and my first mobile game, by the way. It is by no means an AAA game, but it is fun enough that I sometimes play it in the subway. By the way, did I mention that, once I was satisfied with this game, I could publish it in just a few seconds, simply by hosting it anywhere on the web?
Oh, and another feature: I wrote a quite usable comic book reader in the subway, while commuting to and from work. It took me a few days of commuting (three days, I seem to remember) to obtain a tool that works quite nicely. Due to screen size, I prefer using it on my Android tablet rather than on a cellphone, but that’s the wonder of HTML5 and Open Web Applications: I developed for one, and it worked for both. Did I mention that this was my first attempt at writing a web application that does file I/O or that uses the touch screen intelligently? I will try and finalize and release this application one of these days.
Now, other developers or users might not share this feeling, but this simplicity to start coding, publish and evolve a game or application is of tremendous importance to me. Because one day, I will have a child old enough to play video games. And for his birthday, I will have a chance to download a 5€ game from the Firefox Marketplace (or anywhere else), but more importantly, I will be able to build a game with his favorite characters as supporting cast and him as the hero. I hope he will love it. And I will not need to ask for permission.
If there is some application you want to develop, neither will you.
February 27, 2013 § 3 Comments
It has been quite some time since the last update. Since then, many things have happened, both with the Student Projects and with the world of Mozilla. We have had the exciting FirefoxOS AppDays, many alpha, beta and near-final versions of FirefoxOS, and the MWC launch of FirefoxOS.
Well, without further ado, let us see how the student projects have progressed.
November 12, 2012 § 4 Comments
Have you ever encountered one of these bugs? One in which every single line of your code is correct, in which every type-check passes, every single unit test succeeds, the specifications are fulfilled but somehow, for no reason that can be explained rationally, it just does not work? I call them Science-Fiction Bugs. I am sure that you have met some of them. For some reason, the Mozilla Performance Team seems to stumble upon such bugs rather often, perhaps because we spend so much time refactoring other teams’ code long after the original authors have moved on to other features, and combining their code with undertested libraries and technologies. Truly, this is life on the Frontier.
Today, I would like to tell you the tale of one of these Science-Fiction Bugs: The Thing That Killed Talos.
December 13, 2011 § 7 Comments
One of the key components of
In a previous post, I introduced OS.File, a Mozilla Platform library designed to make the life of developers easier and to help them produce high-performance, high-responsiveness file management routines.
In this post, I would like to concentrate on one of the core items of OS.File: the Schedule API. Note that the Schedule API is not limited to OS.File and is designed to be useful for all sorts of other modules.
December 6, 2011 § 28 Comments