RFC: We deserve better than runtime warnings

November 20, 2014 § Leave a comment

Consider the following scenario:

  1. Module A prints warnings when it’s used incorrectly;
  2. Module B uses module A correctly;
  3. Some future refactoring of module B starts using module A incorrectly, hence displaying the warnings;
  4. Nobody realises for months, because we have too many warnings;
  5. Eventually, something breaks.

How often has this happened to everyone of us?

This scenario has many variants (e.g. module A changed and nobody realized that module B is now in a situation it misuses module A), but they all boil down to the same thing: runtime warnings are designed to be lost, not fixed. To make things worse, many of our warnings are not actionable, simply because we have no way of knowing where they come from – I’m looking at you, Cu.reportError.

So how do we fix this?

We would certainly save considerable amounts of time if warnings caused immediate assertion failures, or alternatively test failures (i.e. fail, but only when running the unit tests). Unfortunately, we can do neither, as we have a number of tests that trigger the warnings either

  • by design (e.g. to check that we can recover from such misuses of A, or because we still need a now-considered-incorrect use of an API to keep working until we have ported all the clients to the better API);
  • or for historical reasons (e.g. the now incorrect use of A used to be correct, but we haven’t fixed all tests that depend on it yet).

However, I believe that causing test failures is still the solution. We just need a mechanism that supports a form of whitelisting to cope with the aforementioned cases.

Introducing RuntimeAssert

RuntimeAssert is an experiment at provoding a standard mechanism to replace warnings. I have a prototype implemented as part of bug 1080457. Feedback would be appreciated.

The key features are the following:

  • when a test suite is running, a call to `RuntimeAssert` causes the test suite to fail;
  • when a test suite is running, a call to `RuntimeAssert` contains at least the filename/line number of where it was triggered, preferably a stack wherever available;
  • individual tests can whitelist families of calls to `RuntimeAssert` and mark them either as expected;
  • individual tests can whitelist families of calls to `RuntimeAssert` and mark them as pending fix;
  • when a test suite is not running, a call to `RuntimeAssert` does nothing costly (it may default to PR_LOG or Cu.reportError).

Possible API:

  • in JS, we trigger a test failure by calling RuntimeAssert.fail(keyword, string or Error) from production code;
  • in C++, we likewise trigger a test failure by calling MOZ_RUNTIME_ASSERT(keyword, string);
  • in the testsuite, we may whitelist errors by calling Assert.whitelist.expected(keyword, regexp)  or Assert.whitelist.FIXME(keyword, regexp).

Examples:

//
// Module
//
let MyModule = {
  oldAPI: function(foo) {
    RuntimeAssert.fail(“Deprecation”, “Please use MyModule.newAPI instead of MyModule.oldAPI”);
    // ...
  },
  newAPI: function(foo) {
    // ...
  },
};

let MyModule2 = {
  api: function() {
    return somePromise().then(null, error => {
      RuntimeAssert.fail(“MyModule2.api”, error);
      // Rather than leaving this error uncaught, let’s make it actionable.
    });
  },

  api2: function(date) {
    if (typeof date == “number”) {
      RuntimeAssert.fail(“MyModule2.api2”, “Passing a number has been deprecated, please pass a Date”);
      date = new Date(date);
    }
    // ...
   }
}


//
// Whitelisting a RuntimeAssert in a test.
//

// This entire test is about MyModule.oldAPI, warnings are normal.
Assert.whitelist.expected(“Deprecation”, /Please use MyModule.newAPI/);

// We haven’t fixed all calls to MyModule2.api2, so they should still warn, but not cause an orange.
Assert.whitelist.FIXME(“MyModule2.api2”, /please pass a Date/);

Assert.whitelist.expected(“MyModule2.api”, /TypeError/, function() {
  // In this test, we will trigger a TypeError in MyModule2.api, that’s entirely expected.
  // Ignore such errors within the (async) scope of this function.
});

Applications

In the long-term, I believe that RuntimeAssert (or some other mechanism) should replace almost all our calls to Cu.reportError.

In the short-term, I plan to use this for reporting

  • uncaught Promise rejections, which currently require a bit too much hacking for my tastes;
  • errors in XPCOM.lazyModuleGetter & co;
  • failures during AsyncShutdown;
  • deprecation warnings as part of Deprecated.jsm.

Tales of Science-Fiction Bugs: The Thing That Killed Talos

November 12, 2012 § 4 Comments

Have you ever encountered one of these bugs? One in which every single line of your code is correct, in which every type-check passes, every single unit test succeeds, the specifications are fulfilled but somehow, for no reason that can be explained rationally, it just does not work? I call them Science-Fiction Bugs. I am sure that you have met some of them. For some reason, the Mozilla Performance Team seems to stumble upon such bugs rather often, perhaps because we spend so much time refactoring other team’s code long after the original authors have moved on to other features, and combining their code with undertested libraries and technologies. Truly, this is life on the Frontier.

Today, I would like to tell you the tale of one of these Science-Fiction Bugs: The Thing That Killed Talos.

« Read the rest of this entry »

Where Am I?

You are currently browsing the Science-Fiction Bug category at Il y a du thé renversé au bord de la table.

Follow

Get every new post delivered to your Inbox.

Join 33 other followers