Q2 2014 Report

July 1, 2014

Q2 2014 was a difficult quarter at Mozilla, with all the agitation around Brendan Eich, Australis, Media Extensions, etc. Still, I have the feeling that we managed to get a lot done despite the intense pressure. Here is a quick highlight of my main accomplishments for Q2 2014.

Session Restore

A considerable amount of my time was spent working on Session Restore. The main objective is to decrease the jank caused by Session Restore taking snapshots of the session and to decrease the time Session Restore takes to restore the state of Firefox. Much of the activity this quarter dealt with measuring performance, so as to best optimize it, and with improving safety.

Reworking Session Restore backups

With Firefox 33, the backups of Session Restore state have been completely redesigned. The new system should prove orders of magnitude safer, in addition to now being fully transparent.

Next steps: We are still lacking measurements to confirm that this is as successful as the mathematics suggest. If you are interested, there is a mentored bug open.

Talos tests and Telemetry on Session Restore startup

Optimizing startup is difficult, and generally impossible if you do not know what to optimize. With Firefox 32 and 33, we have new benchmarks and real world measurements to help us determine immediately the influence of patches on Session Restore startup.

Next steps: Using these benchmarks to experiment with possible optimizations. This is in progress.

Cleaning up Session Restore file

One of our objectives is to decrease the size of the Session Restore file, to reduce the amount of I/O (hence battery use and hardware wear and tear) and memory usage. As a first step, we have introduced a mechanism that progressively removes tabs and windows from the “Undo Close” feature once they have been closed for at least two weeks. Interestingly, Telemetry indicates that this clean-up has no effect on the size of the Session Restore file. Experiments run later during the quarter, using the Talos tests, also strongly suggest that the data that we could clean up but do not clean up yet has essentially no influence on startup duration.
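To make the clean-up rule concrete, here is a minimal sketch of this kind of age-based pruning. It is not the actual Session Restore code: the function name `pruneClosedEntries` and the `closedAt` field (milliseconds since the epoch) are assumptions made up for the illustration.

```javascript
// Hypothetical sketch: drop "Undo Close" entries that are too old.
// `closedAt` (milliseconds since epoch) is an assumed field name,
// not necessarily what Session Restore stores.
const TWO_WEEKS_MS = 14 * 24 * 60 * 60 * 1000;

function pruneClosedEntries(entries, now = Date.now(), maxAgeMs = TWO_WEEKS_MS) {
  // Keep only the entries that were closed less than `maxAgeMs` ago.
  return entries.filter((entry) => now - entry.closedAt < maxAgeMs);
}

// Example data, with a fixed "now" so that the result is deterministic.
const now = Date.parse("2014-07-01T00:00:00Z");
const DAY_MS = 24 * 60 * 60 * 1000;
const closedTabs = [
  { title: "recent", closedAt: now - 3 * DAY_MS },
  { title: "stale", closedAt: now - 20 * DAY_MS },
];

const kept = pruneClosedEntries(closedTabs, now);
console.log(kept.map((entry) => entry.title)); // only "recent" survives
```

A filter that returns a new array keeps the sketch free of side effects; the real mechanism is described as progressive, which this sketch does not attempt to model.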

Next steps: I believe that this strategy will therefore not be pursued during the next quarters.

Preserving compatibility with Tor Browser

While refactoring Session Restore, we have hit a number of obstacles in the form of add-ons using private or semi-private APIs that we wished to remove. We have managed to work with add-on authors and, as far as I know, we have not broken any add-on yet. In particular, we have maintained compatibility with the Tor Browser, which is a heavily customized distribution of Firefox targeted towards privacy.

Next steps: Providing a clean API for add-ons. This will require discussing with add-on authors to find out what they need.

Async tooling

I am in charge of the Async Project, which is all about giving front-end and add-on developers tools to develop asynchronous code that does not jank. As usual, this involved plenty of activity in a number of different directions.

Auto-closing Sqlite.jsm databases (mentoring Michael Brennan)

Sqlite.jsm databases can now be closed automatically during garbage-collection. On users’ computers, this will increase safety, as failing to close a database causes shutdown-time assertion failures. However, to use resources effectively, pragmatism dictates that databases should be closed manually, so failing to close a database in the Mozilla codebase will still cause test failures.

Reworking OS.File shutdown

On devices with little memory (typically Firefox Phones), one of the techniques used to save memory is to shut down the OS.File worker as early as possible, re-launching it later if necessary. As it turns out, the task is more complicated than it seems, due to possible race conditions. Unfortunately, this meant that in some extreme cases, Firefox OS applications could lock up and fail to shut down properly without being killed by the OS. This is now fixed. Somewhere along the way, this helped us make the PromiseWorker used by OS.File more resilient to low-level errors.

Next steps: Making the PromiseWorker usable by modules other than OS.File, including testing and add-ons.

OS.File for Android and Firefox OS

OS.File was initially designed for desktop devices. Now that it is used in a number of places on mobile devices, I have mercilessly hunted down all compatibility issues between OS.File and our two mobile platforms. Compatibility tests are now activated on all platforms and should avoid any regression.

AsyncShutdown Barrier mechanism

The shutdown process of Firefox has always been a dark and scary place, full of unspecified dependencies. As a result, any refactoring or addition of a new dependency could break many things in new and interesting ways. I have introduced the AsyncShutdown Barrier mechanism that lets us specify clear, explicit and extensible dependencies, handles ordering of shutdowns, as well as error reporting if a dependency is unmet. This Barrier is now used by Sqlite.jsm, OS.File, Firefox Health Report, Session Restore and Page Thumbnails, and fixes a number of major issues.
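The idea of a barrier can be sketched in a few lines of plain JavaScript. The following is a deliberately simplified illustration, not the AsyncShutdown implementation or its API: the `Barrier` class, its `addBlocker`, `pending` and `wait` members, and the blocker names are all assumptions made up for this example, and ordering between shutdown phases is left out.

```javascript
// Hypothetical, simplified barrier. This is NOT the real AsyncShutdown API.
class Barrier {
  constructor(name) {
    this.name = name;
    this.blockers = new Map(); // blocker name -> tracking promise
  }

  // Register a named condition that must settle before the barrier opens.
  addBlocker(name, promise) {
    const tracked = Promise.resolve(promise).then(
      () => {
        this.blockers.delete(name);
      },
      (error) => {
        // A failed blocker is reported but does not block forever.
        this.blockers.delete(name);
        console.error(`${this.name}: blocker "${name}" failed:`, error);
      }
    );
    this.blockers.set(name, tracked);
  }

  // Names of the blockers that have not completed yet (for error reports).
  get pending() {
    return [...this.blockers.keys()];
  }

  // Resolves once every blocker registered so far has settled.
  wait() {
    return Promise.all([...this.blockers.values()]);
  }
}

// Usage: two made-up clients register work that must finish before shutdown.
const shutdownBarrier = new Barrier("example-shutdown-phase");
shutdownBarrier.addBlocker(
  "flush session file",
  new Promise((resolve) => setTimeout(resolve, 10))
);
shutdownBarrier.addBlocker("close database", Promise.resolve());

console.log("waiting for:", shutdownBarrier.pending);
const barrierOpened = shutdownBarrier.wait().then(() => {
  console.log("all blockers done, still pending:", shutdownBarrier.pending);
});
```

Naming each blocker is what makes error reporting possible: if shutdown stalls, the list of pending names says which dependency is unmet.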

Next steps: Porting AsyncShutdown Barrier to allow native components to register with it.

Fixing Firefox 30 shutdown freezes (with Tim Taubert)

Many users of Firefox 30 encountered issues that caused Firefox to freeze during shutdown. We found out that the issue was triggered by Page Thumbnails and caused by a bug in ChromeWorkers, which did not handle an error case gracefully. I applied AsyncShutdown Barrier to ensure that Page Thumbnails always completed without triggering the error case, while Tim Taubert ensured that the Chrome Workers handled the error robustly.

Making Firefox Health Report shutdown more robust

While porting Firefox Health Report to AsyncShutdown, we encountered an elusive bug that manifested itself by causing rare shutdown crashes. After months of experimenting, instrumenting and attempting to fix the issue, we eventually traced it back to a more serious bug in shutdown, which apparently does not always send the proper notifications. Using the AsyncShutdown Barrier, we managed to work around the issue and make FHR’s shutdown both more robust and better instrumented in case of crash. This later helped us locate another issue that prevents a proper shutdown when some databases have been corrupted.

Next steps: Fix the upstream shutdown bug, make our shutdown more resilient in case of database corruption.

Async testing

The other aspect of writing asynchronous code is making sure that developers can debug it. Now that we have hit a critical mass of developers writing async code, it was high time to help them work with it.

Rewriting Task stack traces to be meaningful

Now that we know how to handle uncaught errors, the main remaining weakness of Promise-based and Task-based code is that their stack traces lose much information. Since Firefox 33, Task-based stack traces are transparently rewritten into something developer-readable. Somewhere along the way, I have also patched xpcshell and mochitests to ensure that they take advantage of this rewriting. Experience shows that this is very useful and that the runtime cost is negligible.

Next steps: Evaluate the runtime cost of doing the same thing for Promise-based code.

Making xpcshell tests fail in case of uncaught promise error

Uncaught promise errors were treated by the test suites as warnings, TBPL did not report them, and consequently they were more often than not ignored (or even unseen) by developers. I have reworked the xpcshell test harness to treat all uncaught promise errors as oranges and fixed all offenders.

Next steps: Doing the same for mochitests. Code is ready, but a few offenders remain.

Community

Dealing with political feedback around the nomination and departure of Brendan Eich

Along with many others, I did my best to engage people who voiced their negative feedback either at the nomination or at the departure of Brendan Eich. Unfortunately, this took time and effort, but I believe that staying in touch with our users is part of what makes the difference between Mozilla and other browser vendors.

Working with new contributors

I estimate that I have worked with ~30 potential new contributors during the quarter. Many have unfortunately decided to postpone or abandon their efforts towards contributing, but a few have stayed, to work either with me or with other teams. At the moment, I am following 5 promising contributors. In particular, I am quite happy to welcome Dexter (who is working on a very sophisticated patch to let code watch for file modifications) and Kushagra (who has landed several test suite bugs).

Next steps: More of it!

Working with universities

A group of students from École Centrale de Lyon successfully completed an online tool to help grassroots projects find volunteers. It was nice mentoring them.

Zedge

I was invited to deliver a presentation on performance at Zedge, in Trondheim, Norway. That was fun :)

Next steps: Publish the slides.

And now?

Let’s get started with Q3!

Chrome Workers, now with modules!

July 17, 2013 § 3 Comments

One of the main objectives of Project Async is to encourage Firefox developers and Firefox add-on developers to use Chrome Workers to ensure that whatever they do doesn’t block Firefox’s UI thread. The main obstacle, for the moment, is that Chrome Workers have access to very few features, so one of the tasks of Project Async is to add features to Chrome Workers.

Today, let me introduce the Module Loader for Chrome Workers.


Why Firefox OS matters to me

March 16, 2013 § 5 Comments

These days, everybody seems to be talking about Firefox OS. About how removing the barrier of the marketplace will make the world a better place, or about how HTML5 is so darn great, or about the fact that a gazillion manufacturers and operators are supporting Firefox OS. And that’s great, because Firefox OS is an impressively good product and deserves this attention.

However, all this craze is missing one feature that makes Firefox OS my choice of mobile operating system: I can write a playable prototype for a simple game, from scratch, in two hours.

Of course, this was a prototype, and completing the game took me a few more days of adding 8 bit graphics, optimizing, toying with the rules, adding difficulty levels, high scores, etc. But after just two hours, I could play the game on computer, tablet and cellphone, and decide where to proceed from here. This was both my first HTML5 game and my first mobile game, by the way. It is by no means an AAA game, but it is fun enough that I sometimes play it in the subway. By the way, did I mention that, once I was satisfied with this game, I could publish it in just a few seconds, simply by hosting it anywhere on the web?

Oh, and another feature: I wrote a quite usable comic book reader in the subway, while commuting from/to work. It took me a few days of commuting (three days, I seem to remember) to obtain a tool that works quite nicely. Due to screen size, I prefer using it on my Android tablet rather than on a cellphone, but that’s the wonder of HTML5 and Open Web Applications: I developed for one, and it worked for both. Did I mention that this was my first attempt at writing a web application that does file I/O or that uses the touch screen intelligently? I will try and finalize and release this application one of these days.

Now, other developers or users might not share this feeling, but this simplicity to start coding and publish and evolve a game or application is of tremendous importance to me. Because one day, I will have a child old enough to play video games. And for his birthday, I will have a chance to download a 5€ game from the Firefox Marketplace (or anywhere else), but more importantly, I will be able to build a game with his favorite characters as supporting cast and him as a hero. I hope he will love it. And I will not need to ask for permission.

If there is some application you want to develop, neither will you.

Mozilla Student Projects update

February 27, 2013 § 3 Comments

It has been quite some time since the last update. Since then, many things have happened, both with the Student Projects and with the world of Mozilla. We have had the exciting FirefoxOS AppDays, many alpha, beta and near-final versions of FirefoxOS, and the MWC launch of FirefoxOS.

Well, without further ado, let us see how the student projects have progressed.


Tales of Science-Fiction Bugs: The Thing That Killed Talos

November 12, 2012 § 4 Comments

Have you ever encountered one of these bugs? One in which every single line of your code is correct, in which every type-check passes, every single unit test succeeds, the specifications are fulfilled but somehow, for no reason that can be explained rationally, it just does not work? I call them Science-Fiction Bugs. I am sure that you have met some of them. For some reason, the Mozilla Performance Team seems to stumble upon such bugs rather often, perhaps because we spend so much time refactoring other teams’ code long after the original authors have moved on to other features, and combining their code with undertested libraries and technologies. Truly, this is life on the Frontier.

Today, I would like to tell you the tale of one of these Science-Fiction Bugs: The Thing That Killed Talos.


OS.File, step-by-step: The Schedule API

December 13, 2011 § 7 Comments

Summary

One of the key components of OS.File is the Schedule API, a tiny yet powerful JavaScript core designed to considerably simplify the development of asynchronous modules. In this post, we introduce the Schedule API.

Introduction

In a previous post, I introduced OS.File, a Mozilla Platform library designed to make the life of developers easier and to help them produce high-performance, high-responsiveness file management routines.

In this post, I would like to concentrate on one of the core items of OS.File: the Schedule API. Note that the Schedule API is not limited to OS.File and is designed to be useful for all sorts of other modules.


Introducing JavaScript native file management

December 6, 2011 § 28 Comments

Summary

The Mozilla Platform keeps improving: JavaScript native file management is an ongoing effort to provide a high-performance JavaScript-friendly API to manipulate the file system.

The Mozilla Platform, JavaScript and Files

The Mozilla Platform is the application development framework behind Firefox, Thunderbird, Instantbird, Camino, Songbird and a number of other applications.

While the performance-critical components of the Mozilla Platform are developed in C/C++, an increasing number of components and add-ons are implemented in pure JavaScript. While JavaScript cannot hope to match the speed or robustness of C++ yet (edit: at least not on all aspects), the richness and dynamism of the language permit the creation of extremely flexible and developer-friendly APIs, as well as quick prototyping and concise implementation of complex algorithms without the fear of memory errors and with features such as higher-level programming, asynchronous programming and now clean and efficient multi-threading. If you combine this with the impressive speed-ups experienced by JavaScript in recent years, it is easy to understand why the language has become a key element in the current effort to make the Mozilla Platform and its add-ons faster and more responsive at all levels.


Internships at Mozilla Paris… or elsewhere

November 19, 2011 § 1 Comment

Edit: We are full until June. We can no longer take interns in Paris whose internships start before June.

As every year, Mozilla offers internships in computer science, oriented towards Development, R&D or Research. Depending on the subject, the internship may take you to Paris, the United States, Canada, China…

About Mozilla

The Mozilla Foundation is a non-profit organization, founded to promote an open, innovative and participatory Internet. You have probably heard of Mozilla Firefox, the open-source browser that brought open standards and security back to the web, or of Mozilla Thunderbird, the cross-platform, open-source and extensible mail client. Mozilla’s activities do not stop at these two products and extend to numerous projects for the present and the future, such as:

  • Boot-to-Gecko, a fully open, community-built operating system for mobile phones, tablets and other connected devices;
  • SpiderMonkey, a family of Virtual Machines designed for the static and dynamic analysis, compilation and execution of web languages, in particular JavaScript;
  • DeHydra and JSHydra, static analysis tools for the C++ and JavaScript languages;
  • Rust, a new programming language designed for the development of safe, parallel system applications;
  • WebAPI, a set of tools that extend the capabilities of web applications beyond those of traditional applications, with security and privacy on top;
  • Gecko, the extensible and portable rendering engine for HTML, XML and graphical interfaces, which made Firefox, Thunderbird and many other applications possible;
  • BrowserID, an innovative technique that gives users and developers the cryptographic tools to ensure identification on the web, without compromising privacy, simplicity or security;
  • the Mozilla Services features for Cloud-based identity management;
  • and more…

About you

Mozilla offers several internships at its offices around the world, on many subjects.

Your profile:

  • you want to make the web a better place, where everyone can browse and contribute safely, without having to fear for their security or privacy;
  • you want to take part in a project used by more than 33% of the web’s population;
  • you want your work to be useful to all and visible to all;
  • you have strong skills in Algorithms and Computer Science;
  • you have strong skills in at least one of the following fields:
    • operating systems;
    • networks;
    • computational geometry;
    • compilation;
    • cryptography;
    • static analysis;
    • programming languages;
    • extracting relevant information from exotic data;
    • distributed algorithms;
    • the web as a platform;
    • interactions with free software communities;
    • any other skill that, in your opinion, could be useful to us.
  • on some subjects, an excellent level of English may be essential;
  • internships are generally intended for M1 or M2 (first- or second-year Master’s) students, but if you manage to impress us with your achievements or your knowledge, the degree is not essential.

If you recognize yourself, we invite you to contact us. Depending on the subject, internships may take you to Paris, Mountain View, San Francisco, Toronto, Taipei, or other places around the world.

The best interns can hope for a freelance contract, a permanent contract (CDI) and/or a doctoral scholarship.

To contact us

For any question, contact:

  • for anything concerning internships at Mozilla, Julie Deroche (at mozilla.com, jderoche) – Mozilla Mountain View, College Recruiting;
  • for internships in Paris, David Rajchenbach-Teller (at mozilla.com, dteller) – Mozilla Paris, Developer / Researcher.

First look at Google Dart

October 13, 2011 § 4 Comments

A few weeks ago, the browser and web development communities started wondering about this mysterious new web language that Google was about to unveil: Dart. Part of the questioning was technical – what would that language look like? how would a new language justify its existence? what problems would it solve? – and part was more strategic – what was Google doing preparing a web language in secret? were the leaked memos that seemed to imply a web-standards-breaking stance something that Google would indeed pursue? was Google trying to solve web-related problems, Google-related problems or Oracle-related problems?

Now, Google has unveiled the specifications of Dart, as well as library documentation. Neither will be sufficient to answer all questions, but they give us an opportunity to look at some of the technical sides of the problem. As a programming language researcher/designer and a member of the web browser community, I just had to spend some quality time with the Dart specifications.

So, how’s Dart? Well, let’s look at it.

What Dart is

Dart is a programming language and a Virtual Machine. As a programming language, Dart positions itself somewhere in the scope between scripting/web development and application development. From the world of application development, Dart brings:

  • clean concurrency primitives that would feel at home in Scala, Clojure or Erlang – including a level of concurrent error reporting;
  • a clean module mechanism, including a notion of privacy;
  • a type system offering genericity, interfaces and classes;
  • compilation and a virtual machine;
  • a library of data structures;
  • no eval();
  • data structures that do not change shape with time.

From the world of scripting/web development, Dart brings:

  • usability in any standards-compliant browser, without any plug-in (although it will work better in a plug-in and/or in Chrome);
  • DOM access;
  • emphasis on fast start-up;
  • a liberal approach to typing (i.e. types are optional and the type system is incorrect, according to the specifications);
  • dynamic errors;
  • closures (which are actually not scripting/web development related, but until Java 8 lands or until Scala, F# or Haskell gain popularity, most developers will believe that they are).

Where Dart might help

Web development has a number of big problems. I have written about some of them in previous posts, and Dart was definitely designed to help, at least a little.

Security

JavaScript is interpreted, can be written inline in HTML and supports eval(). By contrast, Dart code is compiled. Dart does not have eval() and Dart code is not written inline in HTML. Consequently, Dart itself offers a smaller attack surface for cross-site scripting. Note that Dart can still be used as a component of an XSS targeting the document itself, and that using Dart does not prevent an attacker from using JavaScript to inject XSS in the page.

Safety and Code Hygiene

Out-of-the-box, JavaScript does not offer any static or hybrid typing. Dart offers (optional, hybrid) typing. This is a very useful tool for helping developers and developer groups find errors in their code quickly.

JavaScript offers prototype-based object-oriented programming, without explicit private methods/fields. By contrast, Dart offers modules, classes (with support for private methods/fields) and interfaces. Again, very useful for providing abstractions that do not leak [too much].

For historical reasons, JavaScript offers weird and error-prone scoping and will let developers get away without realizing that they are dereferencing undefined variables. Dart does away with this. Again, this is a good way to find errors quickly.

Libraries

Out-of-the-box, JavaScript does not provide data structures, or much in the way of libraries. By contrast, Dart provides a few data structures and libraries.

Exceptions

For a long time, JavaScript exceptions were not extensible. Eventually, it became possible to define new kinds of exceptions. However, JavaScript still doesn’t support matching on the exception constructor, unlike almost all other programming languages. Dart makes no exception and allows matching upon the exception constructor. This makes exception-handling a little nicer and debugging exception traces a little more robust.
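To illustrate what the lack of constructor matching means in practice, here is a small sketch of the usual JavaScript workaround: a single catch clause that dispatches by hand with instanceof and re-throws anything it does not recognize. The error classes and the `describeFailure` helper are invented for this example, and the class syntax is from later versions of the language than the one discussed in this post.

```javascript
// Two made-up error kinds, defined by extending the built-in Error.
class ParseError extends Error {}
class NetworkError extends Error {}

// Runs `action` and turns the failures we know about into a message.
function describeFailure(action) {
  try {
    action();
    return "ok";
  } catch (error) {
    // JavaScript has a single catch clause, so we dispatch by hand.
    if (error instanceof ParseError) {
      return "parse error: " + error.message;
    }
    if (error instanceof NetworkError) {
      return "network error: " + error.message;
    }
    // Not one of ours: forgetting this re-throw would silently
    // swallow unrelated errors.
    throw error;
  }
}

console.log(describeFailure(() => { throw new ParseError("bad token"); }));
console.log(describeFailure(() => {}));
```

With matching on the exception constructor, each branch would be its own clause and the final re-throw would be implicit, which is the small but real improvement described above.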

Concurrency

For a long time, JavaScript did not provide any form of concurrency primitive. Recent versions of JavaScript do offer Workers. Similarly, Dart offers Isolates, with a paradigm very similar to Workers. Where Workers are always concurrent, Isolates can also be made non-concurrent, for better performance at the expense of reactivity. Initialization and error-reporting are also a little different, but otherwise, Isolates and Workers are quite comparable.

Speed

Dart promises better speed than JavaScript. I am in no position to judge this.

Niceties

Dart offers “string interpolation” to insert a value in a string. Nice but not life-altering. Also, out-of-the-box, JavaScript DOM access is quite verbose. By contrast, Dart provides syntactic sugar that makes it a little nicer.

Where Dart might hinder

Vendor control/adoption

The single biggest problem with Dart is, of course, its source. To get the VM in the browsers, Google will have to convince both developers and other browser vendors to either reimplement the VM by themselves or use a Google-issued VM. This is possible, but this will be difficult for Google.

The open vehicle for this is to convince developers to use Dart for server-side programming – where Dart will be competing with Java, Scala, C#, F#, Python, JavaScript, Erlang and even Google’s Go – and for client-side programming by going through JavaScript – which will severely hinder performance, safety and security.

The vendor-controlled vehicle will be to integrate the VM in Chrome and Android and encourage developers targeting the Chrome Market and Android Market to use Dart. Some speculate that this is a way for Google to get rid of the Java dependency on the Android Market. In this case, of course, there will be little competition.

Libraries and documentation

JavaScript has a host of libraries and considerable documentation. I will admit that much of the documentation one may find around the web is not good (hint: use Mozilla’s documentation, it is the only reliable source I have found), but that is still infinitely more than what Dart can provide at the moment.

In other words, for the moment, Dart cannot take advantage of the special effects, the game-building libraries, the streaming libraries, etc. that have been developed for JavaScript. This, of course, is something that Google has the resources to change relatively fast, but, from experience, I can tell that many developers are averse to relearning.

Doing it without Dart

Security

We’re not going to get rid of XSS without some effort, even with Dart. However, making sure that JavaScript offers an attack surface no larger than Dart is easy: forbid eval() and forbid any inline JavaScript. It would be quite easy to add an option to HTML documents to ensure that this is the case. Note that this option remains necessary even if all the code is written in Dart, as Dart does not prevent an attacker from injecting JavaScript.

Code Hygiene

Out-of-the-box, JavaScript does not offer any support for static/hybrid typing. However, Google has demonstrated how to add static typing with the Google Closure Compiler and Mozilla has demonstrated how to add hybrid typing with Dynamic Type Inference. Both projects indicate that we can obtain something at least as good as Dart in JavaScript, without altering/reinventing the language.

Out-of-the-box, JavaScript does not offer modules. However, Mozilla has been offering a module system for JavaScript for years and new versions of the language are in the process of standardizing this.

Also, while classes and private fields are probably the least surprising techniques for application developers coming to the web, developers used to dynamic or functional languages know that closures and prototypes are essentially equivalent. So, this is essentially a matter of taste.
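As a quick illustration of the closure side of that equivalence, here is a minimal sketch in which a closure plays the role of a private field. The `makeCounter` function is invented for this example.

```javascript
// A counter whose state is private because it lives in a closure.
function makeCounter() {
  let count = 0; // only reachable through the functions returned below

  return {
    increment() {
      count += 1;
      return count;
    },
    get value() {
      return count;
    },
  };
}

const counter = makeCounter();
counter.increment();
counter.increment();

console.log(counter.value); // 2
console.log(counter.count); // undefined: the field is not exposed
```

Nothing outside `makeCounter` can read or write `count` directly, which is the same guarantee a private field in a class would give.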

Finally, clean, lexical scoping will be welcomed by all developers who know what these words mean and quite a few others. Fortunately, it is also coming to JavaScript with recent versions of the language.

Concurrency

Isolates are nice. Workers are nice. Isolates are a little easier to set up, so I would like to see an Isolate-like API for Workers. Other than that, they are essentially equivalent.

Speed

So far, Google has always managed to deliver on speed promises with V8, so I would tend to believe them. However, recent improvements in JavaScript analysis also promise to analyze away all the cases that can make JavaScript slower than Java, and I also tend to believe these promises. Consequently, I will venture no guess about it.

Libraries

It is a shame that JavaScript does not come with more libraries. However, many frameworks are available that implement standard data structures and more.

Exceptions

Dart exceptions are a little nicer than JavaScript exceptions, there is no doubt about that. However, making JavaScript exceptions as good as Dart exceptions would be quite simple. The only difficulty is getting this improvement into the standard.

Niceties

String interpolations are nice to have, but not really life-altering. If necessary, they can trivially be implemented by a pre-processor. CoffeeScript might already do it, I’m not sure. Adding this to the JS standard might be tricky, for reasons of backwards compatibility, but there is not much to it.
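To give an idea of how little is involved, here is a minimal sketch of interpolation done as a plain library function rather than as a pre-processor. The `interpolate` helper and its `${name}` placeholder syntax are assumptions chosen for the illustration; they are not taken from Dart or from any existing library.

```javascript
// Hypothetical helper: replaces ${name} placeholders in `template`
// with the matching properties of `scope`.
function interpolate(template, scope) {
  return template.replace(/\$\{(\w+)\}/g, (match, name) =>
    // Unknown names are left untouched rather than replaced by "undefined".
    Object.prototype.hasOwnProperty.call(scope, name)
      ? String(scope[name])
      : match
  );
}

const message = interpolate(
  "Hello, ${user}! You have ${count} new messages.",
  { user: "Ada", count: 3 }
);
console.log(message); // Hello, Ada! You have 3 new messages.
```

A pre-processor would do the same substitution at build time and could accept arbitrary expressions; this sketch only looks up simple names at run time.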

Dart-style DOM access is nice, too. However, adding this to JavaScript would be quite trivial, in particular with next-generation DOM implementations such as dom.js.

The result

I admit that I am a little disappointed. When Dart was announced, I was hoping for something truly evolutionary. So far, what I have found is a nice language, certainly, but not much more. While Dart is definitely better in many aspects than today’s JavaScript, given the current evolution of JavaScript, none of these aspects is a deal-breaker. However, several aspects of Dart (in particular, typing and exceptions) indicate a good direction in which I believe JavaScript should evolve, and I hope that the presence of Dart can get JavaScript standardization moving faster.

In my opinion, there are three ways that Google can get Dart adopted on the web:

  • make it the default choice for Android & Chrome development;
  • provide a set of killer libraries for the web, that work on all browsers but are truly usable only with Dart (DirectX anyone? something Cocoa-style, perhaps?);
  • spend Google-sized budgets on adoption (PR, marketing, GSoC, open-source projects, etc.).

Nevertheless, for the moment, I will keep far away from Dart and look hopefully at Scala-GWT, WebSharper or Ocsigen.

[Rant] Web development is just broken

April 18, 2011 § 104 Comments

No, really, there’s something deeply flawed with web development. And I’m not (just) talking browser incompatibilities, or Ruby vs. Java vs. PHP vs. anything else or about Flash or HTML5, I’m talking about something deeper and fundamental.

Edit: Added TL;DR

Edit: Interesting followups on Hacker News and Reddit.

Edit: Added a comparison with PC programming.

Edit: Added a follow up introducing the Opa project.

TL;DR

The situation

  • Plenty of technologies somehow piled on top of each other.

My nightmares:

  • Dependency nightmare
  • Documentation nightmare
  • Scalability nightmare
  • Glue nightmare
  • Security nightmare

Just to clarify: None of these nightmares is about having to make choices.

My dream:

  • Single computer development had the same set of issues in the 80s-90s. They’re now essentially solved. Let’s do the same for the web.

Shop before you code

Say you want to write a web application. If it’s not a trivial web application, chances are that you’ll need to choose:

  • a programming language;
  • a web server;
  • a database management system;
  • a server-side web framework;
  • a client-side web framework.

And you do need all the above. A programming language, because you’re about to program something, so no surprise here. You also need a web server, because you’re about to write a web application and you need to deliver it, so again, no surprise.

A database management system, because you’ll want to save data and/or to share data, and it’s just too dangerous to access the file system. Strangely, though, your programming language will give you access to the file system, and it’s somewhere else, at the operating system layer, that you’ll have to restrict this. Now, depending on your application and your DBMS, your data may fit completely with the DBMS, but often, that’s not the case, because your application manipulates object-oriented data structures, while your DBMS manipulates either records and relations or keys and values. And at this stage, you have two possibilities: either you force-feed your data into your database – essentially reinventing (de)serialization and storage on top of an antagonistic technology – or you add to your stack a form of Object-Relational Mapper to do this for you.

You need a server-side framework – perhaps more than one –, too, because, let’s face it, at this stage, your (empty) application already feels so complicated that you’ll need all the help you can get to avoid having to reinvent templating, POST/GET management, sessions, etc. Oh, actually, what I wrote above, that’s not quite true: depending on your framework, you may need some access to the file system for your images, your pages, etc. and all other stuff that may or may not fit naturally in the DBMS. So back to the OS layer to configure it more finely.

And finally, you’ll certainly want to write your application for several browsers. As all developers know, there is no such thing as a pair of compatible browsers, and since you certainly don’t want to spend all of your time juggling with (largely undocumented) browser incompatibilities and/or limitations of the JavaScript standard library, it’s time to add a client-side framework, perhaps more.

So, in addition to the first list, you probably have to choose and configure:

  • an OS;
  • OS-level security layers;
  • an ORM (unless you’re reinventing yours) or an approximation thereof.

At this stage, you haven’t written “Hello, world” yet.

On the other hand, you’re about to enter dependency nightmare, because not all web servers fit with all frameworks, not all server-side frameworks with all client-side frameworks, or with all ORMs, or with all OSes, not to mention incompatibilities between OS and DBMS, etc. You also have entered documentation nightmare, because information on how to configure the security layers of your OS is marginal at best, and of course totally separate from information on how to configure your DBMS, or your ORM, or your frameworks, etc.

Note that I haven’t mentioned anything about scaling up yet, because the scaling nightmare would deserve a complete post.

Sure, you will solve all of these issues. You will handpick your tools, discard a few, and, since you’re a developer (you’re a developer, right?), you’ll eventually assemble a full development platform in which every technology somehow agrees to talk to its neighbors. Heaven forbid that you make a mistake at that stage, because once you start with actual coding, there will be no coming back, but yes, you’re now ready to code.

At this stage, a few questions cross my mind:

  • You have reached that stage, because you have the time and skills to do this, but what about Joe beginner? Do they deserve this?
  • Remember that you haven’t written “Hello, world” yet. These hours of your life you have spent to get to this stage, do you have a feeling that they were well-spent?
  • What if you made a mistake, i.e. what if something is subtly incompatible but you haven’t noticed yet, or if one of the technologies you’re using is deprecated, or doesn’t match your security policy, how much time will you spend rooting out all the stuff that’s hardwired with this technology?

So, yes, for all these reasons, I decree that web development is broken. But that’s not all there is to it.

So you have started coding. Good for you.

Now, you have a set of tools that should be sufficient to develop your application – again, possibly not for scaling it up, but that’s a different story. So, you can start coding.

Welcome to the third nightmare: the glue nightmare. What’s the glue? Well, it’s that sticky stuff that you put between two technologies that don’t know about each other, that don’t really fit with each other, but that you need to get working together.

You have data on the client and you want to send it to the server. Time to encode them as JSON or XML, send them with Ajax, open an Ajax entry point on the server (temporary? permanent?), parse the received data on the server (what do you do if it doesn’t parse?), decode the parsed data to your usual data structures, validate the values in these data structures (or should you have done that before decoding?), and then use them (I really hope that you have validated everything carefully). That was the easy part. Now, say you have data on the server and you want to send it to the client. Time to encode them as JSON or XML, send them with Comet – oops, there’s no such thing as “sending with Comet”, so you should open an Ajax entry point on the server (same one? temporary? permanent?) and let the client come and fetch the data (how do you ensure it’s the right client?). Plus the usual parsing and decoding. Except the server code you wrote for parsing and decoding doesn’t work in your browser. Plus, be careful with parsing, because you can get some nasty injections at that stage, or you can just crash a number of JS engines accidentally. Add a little debugging, some more work on garbage-collection and you can send “Hello” from the client to the server or from the server to the client.
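To give an idea of how much glue even the client-to-server half involves, here is a minimal sketch in Python of the server side only: parse, decode, validate. The `Greeting` type, its fields, the `BadRequest` exception and the 200-character limit are all hypothetical, chosen purely for illustration.

```python
import json
from dataclasses import dataclass


@dataclass
class Greeting:
    # Hypothetical message type, for illustration only.
    sender: str
    text: str


class BadRequest(Exception):
    """Raised when the client sent something we cannot use."""


def decode_greeting(raw: str) -> Greeting:
    # Step 1: parse. The payload may not be JSON at all.
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as err:
        raise BadRequest("payload is not valid JSON") from err

    # Step 2: decode. Valid JSON may still have the wrong shape.
    if not isinstance(payload, dict):
        raise BadRequest("expected a JSON object")
    sender = payload.get("sender")
    text = payload.get("text")
    if not isinstance(sender, str) or not isinstance(text, str):
        raise BadRequest("'sender' and 'text' must be strings")

    # Step 3: validate. Well-typed values may still be unacceptable.
    if not sender or len(text) > 200:  # arbitrary limit, an assumption
        raise BadRequest("empty sender or text too long")

    return Greeting(sender=sender, text=text)
```

None of this is application logic, and the browser needs its own, separately written counterpart for the opposite direction.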

Again, the question is not: “can you get this to work?” – I’m sure that you can, many of us do this on a regular basis. The questions are more:

  • was this time well-spent?
  • are you sure that it works?
  • really, really sure?
  • even if browsers can crash?
  • even if users are malicious?
  • how can you be certain?

The client-server glue doesn’t stop here – if only we were so lucky. There’s more for handling forms or uploads, or to inject user-generated contents into pages, but let’s move to server-side glue.

Storage is full of glue, too. You have data that fits your application and you’ll want to send it to your DBMS. Now, either you’re using an ORM or you’re encoding the data manually in a manner that somehow fits your database paradigm. That’s already quite a sizable layer of glue, but that’s not all. Sending your data to your DBMS means opening a connection, somehow serializing your data (I hope it’s well-validated, too, just in case someone attempts to inject bogus database requests), somehow serializing your database request, handling random disconnections and sometimes unpredictable (de)serialization errors (by the way, I hope you made sure that the database could never be in an inconsistent state, even if the browser crashes), somehow (de)serializing database responses (can you handle the database not responding at all?) and reconnecting in case of disconnection. Oh, and since your database and your application are certainly based on distinct security paradigms, you’ll have to set up both your application security and your database security, and you’ll have to ensure that they work together. Did I mention handling distinct encodings? Ensuring that temporary bindings are never stored in the database? Performing garbage-collection in the database?
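As an illustration of the storage-side glue, here is a minimal sketch using Python’s built-in `sqlite3` module with an in-memory database. The `notes` table and the `open_store`, `save_note` and `load_notes` helpers are hypothetical. The point is only that the application’s data must be flattened on the way in, rebuilt on the way out, and passed as bound parameters so that user input is never spliced into the request.

```python
import json
import sqlite3
from typing import List


def open_store() -> sqlite3.Connection:
    # In-memory database, so that the sketch is self-contained.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE notes (author TEXT NOT NULL, body TEXT NOT NULL)")
    return conn


def save_note(conn: sqlite3.Connection, author: str, tags: List[str]) -> None:
    # Serialize the application-side structure into something a column can hold.
    body = json.dumps({"tags": tags})
    # 'with conn' commits on success and rolls back if an exception is raised.
    with conn:
        # Placeholders (?) keep user input out of the SQL text itself.
        conn.execute("INSERT INTO notes (author, body) VALUES (?, ?)", (author, body))


def load_notes(conn: sqlite3.Connection, author: str) -> List[List[str]]:
    rows = conn.execute("SELECT body FROM notes WHERE author = ?", (author,))
    # Deserialize each row back into the application's own structure.
    return [json.loads(body)["tags"] for (body,) in rows]
```

Even this toy version leaves out reconnection, encodings and access control, which is where most of the remaining glue would go.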

Glue doesn’t stop here, of course. Do I need to get started on REST or SOAP or WSDL?

Again, you’ll solve all these issues, eventually. But if you’re like me, you’ll wonder why you had to spend so much time on so little stuff. These are not features, they are not infrastructure, they are not fun, they’re just glue, to get a shamble of technologies – some of which date back to the 1970s – to work together.

Oh, and at this stage, chances are that you have plenty of security holes. Welcome to the security nightmare. Because your programming language has no idea that this piece of XML or JSON or string will end up being displayed on a user’s browser (XSS, anyone?) or stored in the database (SQLi or CouchDB injection, anyone?), or that this URI is actually a command with potentially dangerous side-effects (sounds like an XSRF), etc. By default, any web application is broken, security-wise, in dozens of ways. Most of the web application tutorials I’ve seen on the web contain major security holes. And that’s just not right. If you look at these security issues, you’ll realize that most of them are actually not in the features you coded, not even in the infrastructure you assembled, but in the glue itself. Worse than that: many of these issues are actually trivial things, that could be solved once and for all, but we are stuck using tools that don’t even attempt to solve them. So, chances are that in your application, no matter how good you are, you will forget to validate and/or escape data at least once, that you will forget to authenticate before giving access to one of your resources or that you will forget something that somehow will let a malicious user get into your app and cause all sorts of chaos.
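To make one of these trivial things concrete, here is a minimal sketch of escaping user-provided text before it is placed in a page. It uses `html.escape` from the Python standard library; the `render_comment` helper and its markup are hypothetical.

```python
from html import escape


def render_comment(author: str, text: str) -> str:
    # Escape every user-controlled value before it reaches the page.
    # html.escape replaces &, < and >, plus both quote characters by default.
    return "<p><b>{}</b>: {}</p>".format(escape(author), escape(text))
```

The escaping itself is one call. The trouble is that nothing in the language forces you to remember it at every single place where it matters.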

I rest my case. Web development is broken.

History of brokenness

But if we look at it closer, a few years ago, so was PC development. Any non-trivial application needed:

  • application-level code, of course;
  • memory-level hackery just to get past the 640 KB barrier (or, equivalently, the handle limitation under Windows – 64 KB iirc);
  • IRQ-level coding and/or Windows SDK-level coding (the first one was rather fun, the second was a complete nightmare, neither was remotely meant for anybody who was not seriously crazy);
  • (S)VGA BIOS level hackery to get anything cool to display;
  • BIOS-level fooling.

Five layers of antagonistic technologies that needed to get hacked into compliance. The next generation was represented by the NeXT frameworks and attempts to bring the same amount of comfort to Windows-land (OWL, MFC, etc.). Still fragile and complex, but a huge improvement. And the next generation was Java/C#/Python programming. Nowadays, you can install/apt-get/emerge/port install your solution, start coding right away, and be certain that you have everything you need. Nightmare solved.

On the web, we’re still stuck somewhere between the first generation and the second. Why don’t we aim for the third?

A manifesto for web development that works

Time to stop the rant and start thinking positive. All the above is web development that’s broken. Now, what would not-broken web development look like?

Let’s go for the following:

I want to start coding now

        without having to learn configuration, dependencies or deployment;

I don’t want to write no glue

        the web is one platform, time to stop forcing us to treat it as a collection of heterogeneous components;

I don’t want to repeat myself

        so don’t force me to write several validators for the same data, or several libs that do the same thing in different components of my web application;

I don’t care about browser wars

        my standard toolkit must work on all browsers, end of the story;

Give me back my agility

        I want to be able to make important refactorings, to move code around the client, the server, the database component without having to rewrite everything.

Secure by default

        all the low-level security issues must be handled automatically and transparently by the platform, not by me. I’ll concentrate on the high-level application-specific issues.

All of this is definitely possible. So please give this to me and you’ll make many coders much happier.

Disclaimer My company builds related technology. However, this blog expresses my personal views.

