The Battle of Session Restore – Pilot

March 26, 2014 § 7 Comments

Plot

Our heroes received their assignment. They had to go deep into the Perflines, in the long lost territory of Session Restore, and do whatever it took to get Session Restore back into Perfland. However, they quickly realized that they had been sent on a mission without return – and without a map. This is their tale.

Session Restore is a critical component of Firefox. This component records the current state of your browser so that you can always resume browsing without losing that state, even if Firefox crashes, if your computer loses power, or if your browser is being upgraded. Unfortunately, we have had many reports of Session Restore slowing down Firefox. In February 2013, a two-person Perf/Fx-team task force started working on the performance of Session Restore. This task force eventually grew to four people from Perf, Fx-team and e10s, along with half a dozen occasional contributors.

To this day, the effort has lasted 13 months. In this series of blog entries, I intend to present our work, our results and, more importantly, the lessons we have learnt along the way, sometimes painfully.

Fixing yes, but fixing what?

We had reports of Session Restore blocking Firefox for several seconds every 15 seconds, which made Firefox essentially useless.

The job of Session Restore is to record everything possible of the state of the current browsing session. This means the list of windows, the list of tabs, the current address of each tab, but also the history of each tab, scroll position, anchors, DOM SessionStorage, session cookies, etc. Oh, and this goes recursively for both nested frames and history. All of this is saved to a JSON-formatted file called sessionstore.js, every 15 seconds of user activity. To this day, the largest reported sessionstore.js file is 150 MB, but Telemetry indicates that 95% of users used to have a file of less than 1 MB (numbers are lower these days, after we spent time eliminating unnecessary data from sessionstore.js).
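To make the shape of this data more concrete, here is a minimal sketch of such a recursive state in plain JavaScript. The field names (windows, tabs, entries, children, scroll) and the helper countEntries are hypothetical and chosen for illustration only; they are not the actual layout of sessionstore.js.

```javascript
// Hypothetical, simplified session state; NOT the real sessionstore.js schema.
const sessionState = {
  windows: [{
    tabs: [{
      index: 1, // position of the current entry in the history of the tab
      entries: [
        { url: "https://example.org/", scroll: "0,0", children: [] },
        {
          url: "https://example.org/page",
          scroll: "0,250",
          // Nested frames carry their own state, recursively.
          children: [
            { url: "https://example.org/frame", scroll: "0,0", children: [] }
          ]
        }
      ]
    }]
  }]
};

// Count history entries, including those of nested frames.
function countEntries(entries) {
  let total = 0;
  for (const entry of entries) {
    total += 1 + countEntries(entry.children);
  }
  return total;
}

// The whole tree is serialized to JSON before being saved.
const serialized = JSON.stringify(sessionState);
const entryCount = countEntries(sessionState.windows[0].tabs[0].entries);
console.log(entryCount + " entries, " + serialized.length + " characters of JSON");
```

Even in this toy version, every additional tab, history entry or frame makes the serialized string longer, which is why the size of the file matters so much.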

We started the effort to fix Session Restore from only a few bug reports:

 

  • sometimes, users lost sessionstore.js data;
  • sometimes, data collection took ages.

Unfortunately, we had no data on:

 

  • the size of the file;
  • the actual duration of data collection;
  • how long it took to write data to the disk.

To complicate things further, Session Restore had been left without an owner for several years. Irregular patching to support new features of the web and new configurations had progressively turned the code and data structures into a mess that nobody fully understood.

We had, however, a few hints:

  • Session Restore needs to collect lots of data;
  • Session Restore had been designed a long time ago, for users with few tabs, and when web pages stored very little information;
  • serializing and writing to JSON is inefficient;
  • in bad cases, saving could take several seconds;
  • the collection of data was purely monolithic;
  • reading and writing data was done entirely on the main thread, which was a very bad thing to do;
  • the client API caused full recollections at each request;
  • the data structure used by Session Restore had progressively become an undocumented mess.

There were a number of obvious sources of inefficiency that we could fix without further data, and we set out to fix them immediately. In a few cases, however, we found out the hard way that optimizing without hard data is a time-consuming and useless exercise. Consequently, a considerable part of our work has been to use Telemetry to determine where we could best apply our optimization effort, and to confirm that this effort yielded results. In many cases, this meant adding coarse-grained probes, then progressively completing them with finer-grained probes, in parallel with actually writing optimizations.
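The idea of starting with a coarse-grained probe and refining it with finer-grained ones can be sketched as follows. This is a minimal illustration in plain JavaScript and not the Telemetry API: the class ProbeRegistry, its method measure and the probe names are all hypothetical.

```javascript
// Hypothetical probe registry; this is NOT the Firefox Telemetry API.
class ProbeRegistry {
  constructor() {
    this.samples = new Map(); // probe name -> array of durations in ms
  }
  // Run `fn`, record how long it took under `name`, and return its result.
  measure(name, fn) {
    const start = Date.now();
    try {
      return fn();
    } finally {
      const duration = Date.now() - start;
      if (!this.samples.has(name)) {
        this.samples.set(name, []);
      }
      this.samples.get(name).push(duration);
    }
  }
}

const probes = new ProbeRegistry();

// A coarse-grained probe wraps the whole save; finer-grained probes are
// nested inside it to find out which step takes the time.
function save(state) {
  return probes.measure("SAVE_TOTAL_MS", () => {
    const collected = probes.measure("SAVE_COLLECT_MS", () => ({ ...state }));
    return probes.measure("SAVE_SERIALIZE_MS", () => JSON.stringify(collected));
  });
}

const json = save({ windows: [] });
console.log(json, Array.from(probes.samples.keys()));
```

The coarse probe can be shipped first; once it confirms that saving is slow, the nested probes indicate whether collection or serialization is to blame.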

To be continued…

In the next episode, our heroes will fight Main Thread File I/O… and the consequences of removing it.

Beautiful Off Main Thread File I/O

October 18, 2012 § 7 Comments

Now that the main work on Off Main Thread File I/O for Firefox is complete, I have finally found some time to test-drive the combination of Task.js and OS.File. Let me tell you one thing: it rocks!


Asynchronous file I/O for the Mozilla Platform

October 3, 2012 § 18 Comments

The Mozilla platform has recently been extended with a new JavaScript library for asynchronous, efficient, file I/O. With this library, developers of Firefox, Firefox OS and add-ons can easily write code that behaves nicely with respect to the process and the operating system. Please use it, report bugs and contribute.

Off-main thread file I/O

Almost one year ago, Mozilla started Project Snappy. The objective of Project Snappy is to improve, wherever possible, the responsiveness of Firefox, the Mozilla Platform, and now, Firefox OS, based on performance data collected from volunteer users. Thanks to this real-world performance data, we have been able to identify a number of bottlenecks at all levels of Firefox. As it turns out, one of the main bottlenecks is main thread file I/O, i.e. reading from a file or writing to a file from the thread that also runs most of the code of Firefox and its add-ons.


Getting file information with OS.File

July 31, 2012 § 2 Comments

OS.File keeps gaining new features.

Today, let me show you OS.File.stat and OS.File.prototype.stat, two functions used to get information on a file, such as its size, its creation date or its nature.

How to

There are two ways to get information on a file.

The first technique is to simply call OS.File.stat with the path of the file you wish to inspect:

// File sessionstore.js in the user’s profile directory
let path = OS.Path.join(OS.Constants.Path.profileDir, "sessionstore.js");
let stat = OS.File.stat(path);

This returns an OS.File.Info object containing all the interesting information on the file.

if (stat.isDir) {
  dump("This is a directory\n");
} else if (stat.isSymLink) {
  dump("This is a symbolic link\n");
}
dump("The file contains " + stat.size + " bytes\n");
dump("The file was created at " + stat.creationDate + "\n");
dump("The file was last accessed at " + stat.lastAccessDate + "\n");
dump("The file was last modified at " + stat.lastModificationDate + "\n");

Additionally, under Unix, some security information is available:

if ("unixOwner" in OS.File.Info.prototype) {
  dump("The file belongs to user " + stat.unixOwner +
    " in group " + stat.unixGroup +
    " and has mode " + stat.unixMode);
}

That’s it.

The second technique will let you get information on a file that is already opened:

let file = OS.File.open(path);
let stat = file.stat();

The result is exactly the same. Of course, file.stat() is faster if you have already opened the file, while OS.File.stat(path) is faster than opening the file, calling file.stat() then closing it.
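If you do not have a Mozilla build at hand, the same two techniques can be tried with Node.js, whose fs module offers the analogous fs.statSync(path) and fs.fstatSync(fd). Note that this is a substitute for experimenting, not OS.File itself, and that the temporary file is created only for the sake of the demonstration.

```javascript
// Node.js analogue of the two techniques; this does not use OS.File.
const fs = require("fs");
const os = require("os");
const path = require("path");

// Create a small temporary file so that the example is self-contained.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "stat-demo-"));
const filePath = path.join(dir, "sessionstore.js");
fs.writeFileSync(filePath, "{}");

// First technique: get the information from a path.
const statFromPath = fs.statSync(filePath);

// Second technique: get the information from a file that is already opened.
const fd = fs.openSync(filePath, "r");
let statFromFile;
try {
  statFromFile = fs.fstatSync(fd);
} finally {
  fs.closeSync(fd);
}

console.log("Size: " + statFromPath.size + " bytes");
console.log("Same size both ways: " + (statFromPath.size === statFromFile.size));

// Clean up the temporary directory.
fs.rmSync(dir, { recursive: true, force: true });
```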

Exercise

Let’s put OS.File.stat and OS.File.DirectoryIterator to good use for getting the list of all files in a directory, ordered by last modification date.

function sortedEntries(path) {
  // Get the list of all files in directory
  let iterator = new OS.File.DirectoryIterator(path);
  let entries;
  try {
    entries = [entry for (entry in iterator)];
  } finally {
    iterator.close();
  }

  // If we are under Windows, we have all information in entries already
  // We can make this happen without any further I/O
  if ("winLastModificationDate" in OS.File.DirectoryIterator.prototype) {
    return entries.sort(function compare(x, y) {
      return x.winLastModificationDate - y.winLastModificationDate;
    });
  } else {
    // On other systems, we have to call stat before we can order
    // Note: we iterate with "of", as "in" would walk the array indices
    let sortable = [{entry: entry, stat: OS.File.stat(entry.path)} for (entry of entries)];
    // Array comprehension is cool
    let sorted = sortable.sort(function compare(x, y) {
      return x.stat.lastModificationDate - y.stat.lastModificationDate;
    });
    return [x.entry for (x of sorted)];
  }
}

Note that OS.File.DirectoryIterator does not return special files “.” and “..”.
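For comparison, here is a rough Node.js analogue of sortedEntries, written with the standard fs module rather than OS.File. It always takes the slow path of one stat call per entry, and the temporary directory and file names are made up for the demonstration.

```javascript
// Node.js analogue of sortedEntries; this does not use OS.File.
const fs = require("fs");
const os = require("os");
const path = require("path");

// List the entries of `dir`, ordered by last modification date (oldest first).
function sortedEntries(dir) {
  return fs.readdirSync(dir)
    .map((name) => {
      const fullPath = path.join(dir, name);
      // One stat call per entry, as on the non-Windows branch above.
      return { name, mtimeMs: fs.statSync(fullPath).mtimeMs };
    })
    .sort((x, y) => x.mtimeMs - y.mtimeMs)
    .map((x) => x.name);
}

// Demonstration on a temporary directory with explicit modification times.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "sorted-demo-"));
for (const [name, seconds] of [["new.txt", 2000], ["old.txt", 1000]]) {
  const fullPath = path.join(dir, name);
  fs.writeFileSync(fullPath, "");
  fs.utimesSync(fullPath, seconds, seconds); // set access and modification times
}
const sorted = sortedEntries(dir);
console.log(sorted);

// Clean up the temporary directory.
fs.rmSync(dir, { recursive: true, force: true });
```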

For bonus points, let’s do the same, but only for non-directory files in the directory:

function nonDirectoryEntries(path) {
  // Get the list of all files in directory
  let iterator = new OS.File.DirectoryIterator(path);
  try {
    for (let entry in iterator) {
      if (!entry.isDir) {
        // Generators are cool, too
        yield entry;
      }
    }
  } finally {
    iterator.close();
  }
}

function sortedEntries(path) {
  // Get the list of all non-directory files in directory
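  // Collect the output of the generator into an array, so that we can sort it
  let entries = [entry for (entry in nonDirectoryEntries(path))];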
  if ("winLastModificationDate" in OS.File.DirectoryIterator.prototype) {
    // ... as above
  } else {
    // ... as above
  }
}

We could of course remove directories after sorting, but removing them first saves both computation time (we sort a shorter array) and I/O (on non-Windows platforms, we only need to call stat on a smaller set of files).
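The generator above uses the legacy SpiderMonkey syntax. In standard JavaScript, a generator is declared with function*, and the same filter-then-sort idea can be sketched over in-memory objects as follows. The sample entries and their dates are made up for illustration and stand in for the result of a real directory listing.

```javascript
// Standard (ES2015) generator: yield only the non-directory entries.
function* nonDirectoryEntries(entries) {
  for (const entry of entries) {
    if (!entry.isDir) {
      yield entry;
    }
  }
}

// Made-up sample entries standing in for the result of a directory listing.
const sampleEntries = [
  { name: "b.txt", isDir: false, lastModificationDate: new Date(2012, 6, 2) },
  { name: "docs", isDir: true, lastModificationDate: new Date(2012, 6, 1) },
  { name: "a.txt", isDir: false, lastModificationDate: new Date(2012, 5, 30) }
];

// Filter first, then sort the (shorter) array by last modification date.
const files = Array.from(nonDirectoryEntries(sampleEntries));
files.sort((x, y) => x.lastModificationDate - y.lastModificationDate);
const fileNames = files.map((entry) => entry.name);
console.log(fileNames);
```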

Homework

As a Programming Language guy, I see an opportunity to develop this API into a nice Domain Specific Language that would let developers formulate queries and would let the engine generate OS-optimized functions to execute these queries.

For instance:

OS.File.Query.SelectFromDir().
  where({isDir: false}).
  sortedBy({lastModificationDate: true})
  // returns the above function, including the optimizations

 

OS.File.Query.SelectFromDir().
  where({path: /.*\.tmp$/, isSymLink: false}).
  sortedBy({creationDate: true})

I do not have plans to implement anything like this at the moment, but this sounds like a nice student project. If you are interested, do not hesitate to drop me a line.
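To give a rough idea of what the surface of such a language could look like, here is a toy query builder in plain JavaScript. Everything in it is hypothetical: OS.File.Query does not exist, the class Query and its method run are invented for this sketch, and it merely filters and sorts an in-memory array instead of generating OS-optimized functions.

```javascript
// Hypothetical query builder; OS.File.Query does not exist.
class Query {
  constructor() {
    this.filters = {};
    this.sortKey = null;
  }
  // Keep entries whose fields match; a RegExp value is tested against the field.
  where(conditions) {
    Object.assign(this.filters, conditions);
    return this;
  }
  // Sort by the first key whose value is true.
  sortedBy(keys) {
    this.sortKey = Object.keys(keys).find((key) => keys[key]) || null;
    return this;
  }
  // Run the query against an array of entry-like objects.
  run(entries) {
    const matches = (entry) =>
      Object.keys(this.filters).every((key) => {
        const expected = this.filters[key];
        return expected instanceof RegExp
          ? expected.test(entry[key])
          : entry[key] === expected;
      });
    const result = entries.filter(matches);
    if (this.sortKey !== null) {
      const key = this.sortKey;
      result.sort((x, y) => x[key] - y[key]);
    }
    return result;
  }
}

// Made-up entries; creationDate is a plain number to keep the sketch short.
const entries = [
  { path: "/tmp/b.tmp", isDir: false, isSymLink: false, creationDate: 20 },
  { path: "/tmp/dir", isDir: true, isSymLink: false, creationDate: 5 },
  { path: "/tmp/a.tmp", isDir: false, isSymLink: false, creationDate: 10 }
];

const tmpFiles = new Query()
  .where({ path: /\.tmp$/, isSymLink: false })
  .sortedBy({ creationDate: true })
  .run(entries)
  .map((entry) => entry.path);
console.log(tmpFiles);
```

The interesting part of the student project would be everything this sketch leaves out, starting with choosing, per platform, which fields can be read without an extra call to stat.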

That’s all folks.

In the next entries of this blog, I expect to introduce, in no particular order:

  • path manipulation with OS.File;
  • reading/writing with encodings in OS.File;
  • off-main-thread async I/O for the main thread;
  • benchmarks.

Security analysis

May 25, 2008

A few weeks ago, I promised I would tell you more about ExtraPol, my current research project. Well, before doing so, here’s a short reminder about the notion of security in computer science, and the ways of enforcing that security.

While most members of the computer science community agree that safety and security are desirable properties, there is little consensus on the methods to be used for ensuring safety or security. Indeed, even the actual meaning of these properties often remains an open question.

One possibility is to define security in terms of authorizations and safety in terms of real-world hazard. In this context, a system or subsystem is therefore secure if there is no way for something forbidden to happen, while it is safe if its use may only cause acceptable risks. Both notions are very broad and their enforcement is far from trivial. Even the reduced problem of ensuring that the installation and execution of a software application will not breach simple cases of security of a desktop station is an open research issue.

In practice, techniques used or investigated in the domain of security tend to fall roughly into three groups:

  • static analysis — try and detect security holes before running the program
  • dynamic analysis — try and detect security breaches as they happen
  • trace analysis — try and detect security breaches after they have happened.
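The difference between the last two groups can be illustrated with a toy example in JavaScript. The policy, the event format and the function names (isAllowed, monitoredRun, auditTrace) are all hypothetical, and static analysis is left out because it would require inspecting the program itself rather than its events.

```javascript
// Toy policy (hypothetical): a program may not write outside of /tmp.
function isAllowed(event) {
  return event.action !== "write" || event.path.startsWith("/tmp/");
}

// Dynamic analysis: check each event as it happens, and stop at the breach.
function monitoredRun(events) {
  const performed = [];
  for (const event of events) {
    if (!isAllowed(event)) {
      return { performed, blocked: event };
    }
    performed.push(event);
  }
  return { performed, blocked: null };
}

// Trace analysis: everything has already happened; inspect the log afterwards.
function auditTrace(trace) {
  return trace.filter((event) => !isAllowed(event));
}

const events = [
  { action: "read", path: "/etc/passwd" },
  { action: "write", path: "/tmp/cache" },
  { action: "write", path: "/etc/passwd" }
];

const dynamicResult = monitoredRun(events);
const breaches = auditTrace(events);
console.log(dynamicResult.blocked, breaches.length);
```

The monitor prevents the forbidden write but has to run alongside the program, while the audit costs nothing at run time but can only report the breach once it has taken place.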


Where Am I?

You are currently browsing entries tagged with i/o at Il y a du thé renversé au bord de la table.
