Firefox, the Browser that has your Back[up]

June 26, 2014 § 19 Comments

One of the most important features of Firefox, in my opinion, is Session Restore. This component is responsible for ensuring that, even in case of crash, or if you upgrade your browser or an add-on that requires restart, your browser can reopen immediately and in the state in which you left it. As far as I am concerned, this feature is a life-safer.

Unfortunately, there are a few situations in which the Session Restore file may be corrupted – typically, if the computer is rebooted before the write is complete, or if it loses power, or if the operating system crashes or the disk is disconnected, we may end up losing our precious Session Restore. While any of these circumstances happens quite seldom, it needs to be applied as part of the following formula:

seldom · .5 billion users = a lot

I am excited to announce that we have just landed a new and improved Session Restore component in Firefox 33 that protects your precious data better than ever.

How it works

Firefox needs Session Restore to handle the following situations:

  • restarting Firefox without data loss after a crash of either Firefox, the Operating System, a driver or the hardware, or after Firefox has been killed by the Operating System during shutdown;
  • restarting Firefox without data loss after Firefox has been restarted due to an add-on or an upgrade;
  • quitting Firefox and, later, restarting without data loss.

In order to handle all of this, Firefox needs to take a snapshot of the state of the browser whenever anything happens, whether the user browses, fills a form, scrolls, or an application sets a Session Cookie, Session Storage, etc. (this is actually capped to one save every 15 seconds, to avoid overloading the computer). In addition, Firefox performs a clean save during shutdown.

While at the level of the application, the write mechanism itself is simple and robust, a number of things beyond the control of the developer can prevent either the Operating System or the hard drive itself from completing this write consistently – a typical example being tripping on the power plug of a desktop computer during the write.

The new mechanism involves two parts:

  • keeping smart backups to maximize the chances that at least one copy will be readable;
  • making use of the available backups to transparently avoid or minimize data loss.

The implementation actually takes very few lines of code, the key being to know the risks against which we defend.

Keeping backups

During runtime, Firefox remembers which files are known to be valid backups and which files should be discarded. Whenever a user interaction or a script requires it, Firefox writes the contents of Session Restore to a file called sessionstore-backups/recovery.js. If it is known to be good, the previous version of sessionstore-backups/recovery.js is first moved to sessionstore-backups/recovery.bak. In most cases, both files are valid and recovery.js contains a state less than 15 seconds old, while recovery.bak contains a state less than 30 seconds old. Additionally, the writes on both files are separated by at least 15 seconds. In most circumstances, this is sufficient to ensure that, even of hard drive crash during a write to recover.js, at least recovery.bak has been entirely written to disk.

During shutdown, Firefox writes a clean startup file to sessionstore.js. In most cases, this file is valid and contains the exact state of Firefox at the time of shutdown (minus some privacy filters). During startup, if sessionstore.js is valid, Firefox moves it to sessiontore-backup/previous.js. Whenever this file exists, it is valid and contains the exact state of Firefox at the time of the latest clean shutdown/startup. Note that, in case of crash, the latest clean shutdown/startup might be older than the latest actual startup, but this backup is useful nevertheless.

Finally, on the first startup after an update, Firefox copies sessionstore.js, if it is available and valid, to sessionstore-backups/upgrade.js-[build id]. This mechanism is designed primarily for testers of Firefox Nightly, who keep on the very edge, upgrading Firefox every day to check for bugs. Testers, if we introduce a bug that affects Session Restore, this can save your life.

As a side-note, we never use the operating system’s flush call, as 1/ it does not provide the guarantees that most developers expect; 2/ on most operating systems, it causes catastrophic slowdowns.


All in all, Session Restore may contain the following files:

  • sessionstore.js (contains the state of Firefox during the latest shutdown – this file is absent in case of crash);
  • sessionstore-backups/recovery.js (contains the state of Firefox ≤ 15 seconds before the latest shutdown or crash – the file is absent in case of clean shutdown, if privacy settings instruct us to wipe it during shutdown, and after the write to sessionstore.js has returned);
  • sessionstore-backups/recovery.bak (contains the state of Firefox ≤ 30 seconds before the latest shutdown or crash – the file is absent in case of clean shutdown, if privacy settings instruct us to wipe it during shutdown, and after the removal of sessionstore-backups/recovery.js has returned);
  • sessionstore-backups/previous.js (contains the state of Firefox during the previous successful shutdown);
  • sessionstore-backups/upgrade.js-[build id] (contains the state of Firefox after your latest upgrade).

All these files use the JSON format. While this format has drawbacks, it has two huge advantages in this setting:

  • it is quite human-readable, which makes it easy to recover manually in case of an extreme crash;
  • its syntax is quite rigid, which makes it easy to find out whether it was written incompletely.

As our main threat is a crash that prevents us from writing the file entirely, we take advantage of the latter quality to determine whether a file is valid. Based on this, we test each file in the order indicated above, until we find one that is valid. We then proceed to restore it.

If Firefox was shutdown cleanly:

  1. In most cases, sessionstore.js is valid;
  2. In most cases in which sessionstore.js is invalid, sessionstore-backups/recovery.js is still present and valid (the likelihood of it being present is obviously higher if privacy settings do not instruct Firefox to remove it during shutdown);
  3. In most cases in which sessionstore-backups/recovery.js is invalid, sessionstore-backups/recovery.bak is still present, with an even higher likelihood of being valid (the likelihood of it being present is obviously higher if privacy settings do not instruct Firefox to remove it during shutdown);
  4. In most cases in which the previous files are absent or invalid, sessionstore-backups/previous.js is still present, in which case it is always valid;
  5. In most cases in which the previous files are absent or invalid, sessionstore-backups/upgrade.js-[...] is still present, in which case it is always valid.

Similarly, if Firefox crashed or was killed:

  1. In most cases, sessionstore-backups/recovery.js is present and valid;
  2. In most cases in which sessionstore-backups/recovery.js is invalid, sessionstore-backups/recovery.bak is pressent, with an even higher likelihood of being valid;
  3. In most cases in which the previous files are absent or invalid, sessionstore-backups/previous.js is still present, in which case it is always valid;
  4. In most cases in which the previous files are absent or invalid, sessionstore-backups/upgrade.js-[...] is still present, in which case it is always valid.

Numbers crunching

Statistics collected on Firefox Nightly 32 suggest that, out of 11.95 millions of startups, 75,310 involved a corrupted sessionstore.js. That’s roughly a corrupted sessionstore.js every 158 startups, which is quite a lot. This may be influenced by the fact that users of Firefox Nightly live on pre-alpha, so are more likely to encounter crashes or Firefox bugs than regular users, and that some of them use add-ons that may modify sessionstore.js themselves.

With the new algorithm, assuming that the probability for each file to be corrupted is independent and is p = 1/158, the probability of losing more than 30 seconds of data after a crash goes down to p^3 ≅ 1 / 4,000,000. If we haven’t removed the recovery files, the probability of losing more than 30 seconds of data after a clean shutdown and restart goes down to p^4 ≅ 1 / 630,000,000. This still means that , statistically speaking, at every startup, there is one user of Firefox somewhere around the world who will lose more than 30 seconds of data, but this is much, better than the previous situation by several orders of magnitude.

It is my hope that this new mechanism will transparently make your life better. Have fun with Firefox!

Shutting down things asynchronously

February 14, 2014 § Leave a comment

This blog entry is part of the Making Firefox Feel As Fast As Its Benchmarks series. The fourth entry of the series was growing much too long for a single blog post, so I have decided to cut it into bite-size entries.

A long time ago, Firefox was completely synchronous. One operation started, then finished, and then we proceeded to the next operation. However, this model didn’t scale up to today’s needs in terms of performance and performance perception, so we set out to rewrite the code and make it asynchronous wherever it matters. These days, many things in Firefox are asynchronous. Many services get started concurrently during startup or afterwards. Most disk writes are entrusted to an IO thread that performs and finishes them in the background, without having to stop the rest of Firefox.

Needless to say, this raises all sorts of interesting issues. For instance: « how do I make sure that Firefox will not quit before it has finished writing my files? » In this blog entry, I will discuss this issue and, more generally, the AsyncShutdown mechanism, designed to implement shutdown dependencies for asynchronous services.

« Read the rest of this entry »

Fighting the good fight for fun and credits, with Mozilla Education

October 30, 2012 § Leave a comment

Are you a student?

Do you want to fight the good fight, for the Future of the Web, and earn credits along the way?

Mozilla Education maintains a tracker of student project topics. Each project is followed by one (or more) mentor from the Mozilla Community.

Then what are you waiting for? Come and pick or join a project or get in touch to suggest new ideas!

Are you an educator?

The tracker is also open to you. Do not hesitate to pick projects for your students, send students to us or contact us with project ideas.

We offer/accept both Development-oriented, Research-oriented projects and not-CS-oriented-at-all projects.

Are you an open-source developer/community?

If things work out smoothly, we intend to progressively open this tracker to other (non-Mozilla) projects related to the future of the web. Stay tuned – or contact us!

Asynchronous file I/O for the Mozilla Platform

October 3, 2012 § 18 Comments

The Mozilla platform has recently been extended with a new JavaScript library for asynchronous, efficient, file I/O. With this library, developers of Firefox, Firefox OS and add-ons can easily write code that behave nicely with respect to the process and the operating system. Please use it, report bugs and contribute.

Off-main thread file I/O

Almost one year ago, Mozilla started Project Snappy. The objective of Project Snappy is to improve, wherever possible, the responsiveness of Firefox, the Mozilla Platform, and now, Firefox OS, based on performance data collected from volunteer users. Thanks to this real-world performance data, we have been able to identify a number of bottlenecks at all levels of Firefox. As it turns out, one of the main bottlenecks is main thread file I/O, i.e. reading from a file or writing to a file from the thread that also runs most of the code of Firefox and its add-ons.

« Read the rest of this entry »

Stages chez Mozilla Paris… ou ailleurs

November 19, 2011 § 1 Comment

edit Nous sommes pleins jusqu’à Juin. Nous ne pouvons plus prendre de stagiaires sur Paris dont les stages commencent avant Juin.

Comme tous les ans, Mozilla propose des stages en informatique, orientés Développement, R&D ou Recherche. Selon le sujet, le stage peut vous emmener à Paris, aux États-Unis, au Canada, en Chine…

À propos de Mozilla

La Fondation Mozilla est une association à but non-lucratif, fondée pour encourager un Internet ouvert, innovant et participatif. Vous avez probablement entendu parler de Mozilla Firefox, le navigateur open-source qui a réintroduit sur le web les standards ouverts et la sécurité, ou de Mozilla Thunderbird, le client de messagerie multi-plateforme, open-source et extensible. Les activités de Mozilla ne s’arrêtent pas à ces deux produits et se prolongent à de nombreux projets pour le présent et l’avenir, tels que :

  • Boot-to-Gecko, système d’exploitation totalement ouvert et construit par la communauté, pour les téléphones portables, tablettes et autres machines communicantes ;

  • SpiderMonkey, une famille de Machines Virtuelles conçues pour l’analyse statique et dynamique, la compilation et l’exécution des langages web, en particulier JavaScript ;
  • DeHydra et JSHydra, outils d’analyse statique pour les langages C++ et JavaScript ;

  • Rust, un nouveau langage de programmation conçu pour le développement d’applications système parallèles sûres ;

  • WebAPI, un ensemble d’outils qui permettent d’étendre les capacités des applications web au-delà de celles des applications traditionnelles, la sécurité et la confidentialité en plus ;

  • Gecko, le moteur de rendu extensible et portable pour le HTML, le XML et les interfaces graphiques, qui a permis Firefox, Thunderbird et de nombreuses autres applications ;

  • BrowserID, une technique innovante qui fournit aux utilisateurs et aux développeurs les outils cryptographiques pour assurer l’identification sur le web, sans compromettre la vie privée, la simplicité ou la sécurité ;

  • les fonctionnalités Mozilla Services de gestion d’identité par le Cloud ;

  • et d’autres encore…

À propos de vous

Mozilla proposes plusieurs stages dans ses installations à travers le monde sur de nombreux sujets.

Votre profil :

  • vous voulez faire du web un endroit meilleur, sur lequel chacun peut naviguer et contribuer en toute sécurité, sans avoir à craindre pour sa sécurité ou sa vie privée ;
  • vous souhaitez prendre part à un projet utilisé par plus de 33% de la population du web ;
  • vous voulez que votre travail soit utile à tous et visible par tous ;
  • vous avez de fortes compétences en Algorithmique et en Informatique ;
  • vous avez de fortes compétences dans au moins l’un des domaines suivants :
    • systèmes d’exploitation ;
    • réseaux ;
    • géométrie algorithmique ;
    • compilation ;
    • cryptographie ;
    • analyse statique ;
    • langages de programmation ;
    • extraire des informations pertinentes à partir de données exotiques ;
    • algorithmique distribuée ;
    • le web en tant que plate-forme ;
    • interactions avec les communautés du logiciel libre ;
    • toute autre compétence qui, à votre avis, pourrait nous servir.
  • sur certains sujets, un excellent niveau d’Anglais peut être indispensable ;
  • les stages sont généralement prévus pour des étudiants M1 ou M2 mais si vous arrivez à nous impressionner par vos réalisations ou par vos connaissances, le diplôme n’est pas indispensable.

Si vous vous reconnaissez, nous vous invitons à nous contacter. En fonction du sujet, les stages peuvent vous emmener à Paris, Mountain View, San Francisco, Toronto, Taipei, ou d’autres lieux à travers le monde.

Les meilleurs stagiaires peuvent espérer un contrat freelance, un CDI ou/et une bourse de doctorat.

Pour nous contacter

Pour toute question, contactez :

  • pour tout ce qui concerne les stages chez Mozilla, Julie Deroche (à, jderoche) – Mozilla Mountain View, College Recruiting ;
  • pour les stages à Paris, David Rajchenbach-Teller (à, dteller) – Mozilla Paris, Développeur / Chercheur.

Hi from ICFP

September 1, 2009 § Leave a comment

Well, it’s that time of the year. Back from the holidays and into ICFP…

I’ll try and post my MLWorkshop and my DEFUN slides before the end of the week. I’m talking respectively about Batteries Included and about development with OCaml.

Oh, and starting from today, I’m working at MLState, on one possible future of web development: concurrent, functional, strongly typed, persistent programming. More information will follow, eventually.

Last lecture

April 22, 2009 § 4 Comments

Note: This post is written on the 79th day of strike of Universities. Despite the overwhelming consensus against these bills, the government has just passed the application decrees implementing the possibility of arbitrarily increasing the teaching charge of researchers, without need for any justification. The government obviously fails to see how much this will hurt Research. Simultaneously, the government has announced that, since the reform of primary and secondary schools cannot proceed in compliance with the government’s own decrees, it will simply proceed illegally. Once the shock is gone, expect increased strike actions. Expect Resarch strikes on publications, on patents, on contracts with the government or French companies. Expect difficulties with baccalauréat, exams and degrees.

Where headhunters had failed, the government has finally succeeded. Today was my last lecture.

As the government obviously doesn’t want Researchers to have the means and time to undertake Research, I have accepted a position in the private sector, where I should be able to pursue my work on semantics, security and functional concurrent/distributed programming languages.

While I’m glad to start in a position where I will have both more leeway and both students and engineers to work with me, I am saddened that the situation had to reach the point where I felt I had no choice.

Barring any accident, starting September 1st, you will be able to find me at MLState.

OCaml Batteries Included Beta 1

April 6, 2009 § 9 Comments

Note This post is written on the 64th day of University strikes in France. During these 64 days, the government has rejected any negociation on the core reasons for the strike, has attempted to discredit the contestation (if I understand correctly, I am both « improductive » and a « mask-wearing commando ») and has used police intimidation and repression. The strike continues. Quite possibly, there will be no university exams this year and no baccalauréat. If repression continues increasing, no one can tell for sure what will happen. Nothing good, for sure.

After days and nights of coding, debugging and fighting over naming conventions, the OCaml Batteries Included team is proud to announce OCaml Batteries Included Beta 1. You can find the binaries here,  read the API documentation, the platform documentation, the release notes and the ChangeLog or the list of individual commits. A GODI package and a Debian/testing package are also available.

« Read the rest of this entry »

Dangereux d’être Lumière

February 1, 2009 § 2 Comments

C’est officiel, à partir du 2 février, les universités françaises sont en grève contre les réformes. La grève ne prendra pas la même forme partout, qu’il s’agisse de grèves administratives, de grève des notes, de grève des publications ou encore de grève des cours.

Plutôt que d’analyser une nouvelle fois les réformes, les discours ou interviews de Valérie Pécresse ou Nicolas Sarkozy, laissez-moi vous résumer l’un des nombreux points préoccupants des réformes, qui vous rappelleront d’autres réformes dans le domaine de l’audiovisuel ou de la justice.

Dans cette université nouvelle, la machine politique contrôle:

  • directement le financement sur chaque sujet de recherche (loi LRU, agences de moyens)
  • directement la carrière de chaque enseignant-chercheur (réforme du statut d’enseignant-chercheur / précarisation du statut d’enseignant-chercheur)
  • indirectement le salaire de chaque enseignant-chercheur (réforme du statut d’enseignant-chercheur)
  • indirectement le nombre d’heures d’enseignement de chaque enseignant-chercheur, c’est-à-dire la directement possibilité de faire de la recherche ou/et la possibilité de faire de l’enseignement (réforme du statut d’enseignant-chercheur).

En d’autres termes, le gouvernement se dote de l’attirail nécessaire pour choisir, à tout moment, ce qui doit être étudié ou ce qui doit être oublié. Ce degré de contrôle, qui est simplement absurde dans le monde des sciences dures, devient inquiétant dès qu’il s’agit d’histoire, de littérature, de sociologie ou plus généralement de sciences humaines. Un enseignant-chercheur trublion, qui travaillerait sur des sujets délicats, pourra donc voir sa carrière immédiatement foudroyée par un coup de fil à son université tutélaire.

Insistons sur ce dernier point :

Il devient dangereux de faire des recherches ou des découvertes qui fâchent.

Je vous laisse imaginer des scénarios dans tous les domaines. Personnellement, j’en vois déjà en histoire, en littérature, en sociologie, en ethnologie, en sécurité informatique, et plus généralement tous les domaines  perçus comme pouvant mettre à mal le discours officiel du gouvernement, les secrets de ses amis ou de n’importe quel grand groupe industriel français.

Let’s do it with Batteries

January 27, 2009 § 6 Comments

Or, OCaml is a scripting language, too.

Note: These extracts use the latest version of Batteries, currently available from the git. Barring any accident, this version should be made public within the next few days.

A few days ago, when writing some code for OCaml Batteries Included, I realized that, to properly embed Camomile’s Unicode transcoding module, I would need to manually write 500+ boring lines, all of them looking like:

| `ascii -> Encoding.of_name "ASCII"

The idea behind that pattern matching was to define a type-safe phantom type for text encodings. Upon installation, Camomile generates a directory containing about 540 files, one per text encoding, and it seemed like a good idea to rely upon something less fragile than a string name.

Of course, writing this pattern-matching manually was out of the question: it was boring, error-prone, and while Batteries deserves sacrifices, it doesn’t quite deserve that level of mind-numbing activities. The alternative was to generate both the list of constructors and the pattern-matching code from the contents of the directory. I could have done it with some scripting language but that sounded like a good chance to test-drive the numerous new functions of the String module of Batteries (73 for 28 in the Base Library).

The main program

The structure of the program is easy: read the contents of a directory. For each file, do some treatment on the file name and print the result:

open Shell
foreach (files_of argv.(1)) do_something

Here, foreach is the same function as iter but with its arguments reversed. It’s sometimes much more readable. Instead of reading the contents of a directory with Shell.files_of, we could just as well have traversed the command-line arguments with args, or read the lines of standard input using IO.lines_of stdin.

Actually, we could just as well generalize to a (possibly empty) set of directories. For this purpose, we just need to map our function files_of to the enumeration of command-line arguments. This yields an enumeration of enumerations, which we turn into a flat enumeration with flatten. In my mind, that’s somewhat nicer and more readable than nested loops.

Our main program now looks like:

open Shell, Enum
foreach (flatten (map files_of (args ()))) do_something

Or, for those of us who prefer operators to parenthesis:

open Shell, Enum
(foreach **> flatten **> map files_of **> args ()) do_something

String manipulation

It’s now time to take a file name and turn it into

  1. a nice constructor name
  2. a file name without extension,

That second point is the easiest, so let’s start with it. We have a function Filename.chop_extension just for this purpose. So, if we were interested only in printing the list of files without their extension, we could define

let do_something x = print_endline (Filename.chop_extension x)

The first point is slightly trickier, as we need to

  1. remove the extension from the file name (done)
  2. prepend character ` (trivial)
  3. replace any illicit character with _ (slightly more annoying, I know that the list of illicit characters which may actually appear in my list of files contains :, -, (, ) and whitespaces but I’d rather not go and check manually  which other characters may turn out problematic)
  4. prepend something before names which start with a digit, as digits cannot appear as the first character of an OCaml constructor (a tad annoying, too)
  5. make everything lowercase, just because it’s nicer (trivial).

Let’s deal with the third item, it’s bound to be central. Let’s see, replacing characters could be done with regular expressions, something I dislike, or with function It’s nicer, type-safer, and it has a counterpart for Unicode, if we ever need one. Now, functions Char.is_letter and Char.is_digit will help us determine which names are safe. Using them together, we obtain the following function:

open Char
let replace s = (fun c -> if is_letter c || is_digit c then c else '_') s

Let’s solve the fourth item on our list. We need to check the first character of a string and to determine whether it’s a digit. Well, we already know how to do this. Let’s call our prefix p:

let clean_digit p s = if is_digit s.[0] then p^s else s

If we chain up everything, we obtain

let constructor p s = "`" ^ (if is_digit r.[0] then p^r else r)
    where         r = lowercase ( (fun c -> if is_letter c || is_digit c then c else '_') s)

I like this where syntax.


Now that we have both our strings, we just need to be able to combine and print them. For this purpose, Printf is probably the most concise tool. Here, we can just write

let print s1 s2 = Printf.printf " | %s -> %S\n" s1 s2

We could parameterize upon the format used by printf and we’re bound to do this sooner or later, but let’s keep it simple for now.

The complete program

open Shell, Enum

foreach (flatten **> map files_of **> args ()) do_something
  where do_something s =
   let name = Filename.chop_extension s in Printf.printf " | %s -> %S\n" c name
     where c = "`" ^ (if Char.is_digit r.[0] then "codemap_"^r else r)
     where r = lowercase ( (fun c -> if Char.is_letter c || Char.is_digit c then c else '_') name)

I don’t know about you but I find this pretty nice, for a type-safe language. I’m sure it would have been possible to make something shorter in Perl or awk, and suggestions are welcome regarding how to improve this but I’m rather happy. And, once again, we’re not trying to beat Python, Perl or awk in concision, just to do something comparably good, because we already beat them by far in speed and safety.

So, what do you think?

Where Am I?

You are currently browsing entries tagged with computer science at Il y a du thé renversé au bord de la table.


Get every new post delivered to your Inbox.

Join 30 other followers