It’s not about Webkit, silly. It’s about evolution.

February 20, 2013 § Leave a comment

« Webkit is a rust bucket. We can’t move away from it, because our users rely on its bugs as much as on its features, but it’s based on deprecated technologies, concepts that don’t scale anymore, and it just won’t match today’s needs or hardware. If we had any choice, we would dump the whole thing and restart from scratch. »

« Read the rest of this entry »

Beautiful Off Main Thread File I/O

October 18, 2012 § 7 Comments

Now that the main work on Off Main Thread File I/O for Firefox is complete, I have finally found some time to test-drive the combination of Task.js and OS.File. Let me tell you one thing: it rocks!

« Read the rest of this entry »

JavaScript, this static language (part 1)

October 20, 2011 § 7 Comments

tl;dr

JavaScript is a dynamic language. However, by borrowing a few pages from static languages – and a few existing tools – we can considerable improve reliability and maintainability.

« Writing one million lines of code of JavaScript is simply impossible »

(source: speaker in a recent open-source conference)

JavaScript is a dynamic language – a very dynamic one, in which programs can rewrite themselves, objects may lose or gain methods through side-effects on themselves on on their prototypes, and, more generally, nothing is fixed.

And dynamic languages are fun. They make writing code simple and fast. They are vastly more suited to prototyping than static languages. Dynamism also makes it possible to write extremely powerful tools that can perform JIT translation from other syntaxes, add missing features to existing classes and functions and more generally fully customize the experience of the developer.

Unfortunately, such dynamism comes with severe drawbacks. Safety-minded developers will tell you that, because of this dynamism, they simply cannot trust any snippet, as this snippet may behave in a manner that does not match its source code. They will conclude that you cannot write safe, or even modular, applications in JavaScript.

Many engineering-minded developers will also tell you that they simply cannot work in JavaScript, and they will not have much difficulty finding examples of situations in which the use of a dynamic language in a complex project can, effectively, kill the project. If you do not believe them, consider a large codebase, and the (rather common) case of a large transversal refactoring, for instance to replace an obsolete API by a newer one. Do this in Java (or, even better, in a more modern mostly-static language such as OCaml, Haskell, F# or Scala), and you can use the compiler to automatically and immediately spot any place where the API has not been updated, and will spot a number of errors that you may have made with the refactoring. Even better, if the API was designed to be safe-by-design, the compiler will automatically spot even complex errors that you may have done during refactoring, including calling functions/methods in the wrong order, or ownership errors. Do the same in JavaScript and, while your code will be written faster, you should expect to be hunting bugs weeks or even months later.

I know that the Python community has considerably suffered from such problems during version transitions. I am less familiar with the world of PHP, but I believe this is no accident that Facebook is progressively arming itself with PHP static analysis tools. I also believe that this is no accident that Google is now introducing a typed language as a candidate replacement for JavaScript.

That is because today is the turn of JavaScript, or if not today, surely tomorrow. I have seen applications consisting in hundreds of thousands of lines of JavaScript. And if just maintaining these applications is not difficult enough, the rapid release cycles of both  Mozilla and Chrome, mean that external and internal APIs are now changing every six weeks. This means breakage. And, more precisely, this means that we need new tools to help us predict breakages and help developers (both add-on developers and browser contributors) react before these breakages hit their users.

So let’s do something about it. Let’s make our JavaScript a strongly, statically typed language!

Or let’s do something a little smarter.

JavaScript, with discipline

At this point, I would like to ask readers to please kindly stop preparing tar and feathers for me. I realize fully that JavaScript is a dynamic language and that turning it into a static language will certainly result in something quite disagreeable to use. Something that is verbose, has lost most of the power of JavaScript, and gained no safety guarantees.

Trust me on this, there is a way to obtain the best of both worlds, without sacrificing anything. Before discussing the manner in which we can attain this, let us first set objectives that we can hope to achieve with a type-disciplined JavaScript.

Finding errors

The main benefit of strong, static typing, is that it helps find errors.

  • Even the simplest analyses can find all syntax errors, all unbound variables, all variables bound several times and consequently almost all scoping errors, which can already save considerable time for developers. Such an analysis requires no human intervention from the developer besides, of course, fixing any error that has been thus detected. As a bonus, in most cases, the analysis can suggest fixes.
  • Similarly trivial forms of analysis can also detect suspicious calls to break or continue, weird uses of switch(), suspicious calls to private fields of objects, as well as suspicious occurrences of eval – in my book, eval is always suspicious.
  • Slightly more sophisticated analyses can find most occurrences of functions or methods invoked with the wrong number of arguments. Again, this is without human intervention. With type annotations/documentation, we can move from most occurrences to all occurrences.
  • This same analysis, when applied to public APIs, can provide developers with more informations regarding how their code can be (mis)used.
  • At the same level of complexity, analysis can find most erroneous access to fields/methods, suspicious array traversals, suspicious calls to iterators/generators, etc. Again, with type annotations/documentation, we can move from most to all.
  • Going a little further in complexity, analysis can find fragile uses of this, uncaught exceptions, etc.

Types as documentation

Public APIs must be documented. This is true in any language, no matter where it stands on the static/dynamic scale. In static languages, one may observe how documentation generation tools insert type information, either from annotations provided by the user (as in Java/JavaDoc) or from type information inferred by the compiler (as in OCaml/OCamlDoc). But look at the documentation of Python, Erlang or JavaScript libraries and you will find the exact same information, either clearly labelled or hidden somewhere in the prose: every single value/function/method comes with a form of type signature, whether formal or informal.

In other words, type information is a critical piece of documentation. If JavaScript developers provide explicit type annotations along with their public APIs, they have simply advanced the documentation, not wasted time. Even better, if such type can be automatically inferred from the source code, this piece of documentation can be automatically written by the type-checker.

Types as QA metric

While disciples of type-checking tend to consider typing as something boolean, the truth is more subtle: it quite possible that one piece of code does not pass type-checking while the rest of the code does. Indeed, with advanced type systems that do not support decidable type inference, this is only to be expected.

The direct consequence is that type-checking can be seen as a spectrum of quality. A code can be seen as failing if the static checking phrase can detect evident errors, typically unbound values or out-of-scope break, continue, etc. Otherwise, every attempt to type a value that results in a type error is a hint of poor QA practice that can be reported to the developer. This yields a percentage of values that can be typed – obtain 100% and get a QA stamp of approval for this specific metric.

Typed JavaScript, in practice

Most of the previous paragraphs are already possible in practice, with existing tools. Indeed, I have personally experienced using JavaScript static type checking as a bug-finding tool and a QA metric. On the first day, this technique has helped me find both plenty of dead code and 750+ errors, with only a dozen false positives.

For this purpose, I have used Google’s Closure Compiler. This tool detects errors, supports a simple vocabulary for documentation/annotations, fails only if very clear errors are detected (typically syntax errors) and provides as metric a percentage of well-typed code. It does not accept JavaScript 1.7 yet, unfortunately, but this can certainly be added.

I also know of existing academic work to provide static type-checking for JavaScript, although I am unsure as to the maturity of such works.

Finally, Mozilla is currently working on a different type inference mechanism for JavaScript. While this mechanism is not primarily aimed at finding errors, my personal intuition is that it may be possible to repurpose it.

What’s next?

I hope that I have convinced you of the interest of investigating manners of introducing static, strong type-checking to JavaScript. In a second part, I will detail how and where I believe that this can be done in Mozilla.

First look at Google Dart

October 13, 2011 § 4 Comments

A few weeks ago, the browser and web development communities started wondering about this mysterious new web language that Google was about to unveil: Dart. Part of the interrogation was technical – what would that language look like? how would a new language justify its existence? what problems would it solve? – and part was more strategic – what was Google doing preparing a web language in secret? where the leaked memos that seemed to imply a web-standards-breaking stance something that Google would indeed pursue? was Google trying to solve web-related problems, Google-related problems or Oracle-related problems?

Now, Google has unveiled the specifications of Dart, as well as library documentation. Neither will be sufficient to answer all questions, but they give us an opportunity to look at some of the technical sides of the problem. As a programming language researcher/designer and a member of the web browser community, I just had to spend some quality time with the Dart specifications.

So, how’s Dart? Well, let’s look at it.

What Dart is

Dart is a programming language and a Virtual Machine. As a programming language, Dart positions itself somewhere in the scope between scripting/web development and application development. From the world of application development, Dart brings

  • clean concurrency primitives that would feel at home in Scala, Clojure or Erlang – including a level of concurrent error reporting;
  • a clean module mechanism, including a notion of privacy;
  • a type system offering genericity, interfaces and classes;
  • compilation and a virtual machine;
  • a library of data structures;
  • no eval();
  • data structures that do not change shape with time.

From the world of scripting/web development, Dart brings:

  • usability in any standards-compliant browser, without any plug-in (although it will work better in a plug-in and/or in Chrome);
  • DOM access;
  • emphasis on fast start-up;
  • a liberal approach to typing (i.e. types are optional and the type system is incorrect, according to the specifications);
  • dynamic errors;
  • closures (which are actually not scripting/web development related, but until Java 8 lands or until Scala, F# or Haskell gain popularity, most developers will believe that they are).

Where Dart might help

Web development has a number of big problems. I have trolled written about some of them in previous posts, and Dart was definitely designed to help, at least a little.

Security

JavaScript is interpreted, can be written inline in html and supports eval(). By opposition, Dart code is compiled. Dart does not have eval() and Dart code is not written inline in html. Consequently, Dart itself offers a smaller attack surface for cross-site scripting.  Note that Dart can still be used as a component for a XSS targeting the document itself, and that using Dart does not prevent an attacker from using JavaScript to inject XSS in the page.

Safety and Code Hygiene

Out-of-the-box, JavaScript does not offer any static or hybrid typing. Dart offers (optional, hybrid) typing. This is a very useful tool for helping developers and developer groups find errors in their code quickly.

JavaScript offers prototype-based object-oriented programming, without explicit private methods/fields. By opposition, Dart offers modules, classes (with support for private methods/fields) and interfaces. Again, very useful for providing abstractions that do not leak [too much].

For historical reasons, JavaScript offers weird and error-prone scoping and will let developers get away without realizing that they are dereferencing undefined variables. Dart does away with this. Again, this is a good way to find errors quickly.

Libraries

Out-of-the-box, JavaScript does not provide data structures, or much in the way of libraries. By opposition, Dart provides a few data structures and libraries.

Exceptions

For a long time, JavaScript exceptions were not extensible. Eventually, it became possible to define new kinds of exceptions. However, JavaScript still doesn’t support matching the exception constructor, by opposition to what almost all other programming languages do. Dart makes no exception and allows matching upon the exception constructor. This makes exception-handling a little nicer and debugging exception traces a little more robust.

Concurrency

For a long time, JavaScript did not provide any form of concurrency primitive. Recent versions of JavaScript do offer Workers. Similarly, Dart offers Isolates, with a paradigm very similar to Workers. Where Workers are always concurrent, Isolates can also be made non-concurrent, for better performance at the expense of reactivity. Initialization and error-reporting are also a little different, but otherwise, Isolates and Workers are quite comparable.

Speed

Dart promises better speed than JavaScript. I cannot judge about it.

Niceties

Dart offers “string interpolation” to insert a value in a string. Nice but not life-altering. Also, out-of-the-box, JavaScript DOM access is quite verbose. By opposition, Dart provides syntactic sugar that makes it a little nicer.

Where Dart might hinder

Vendor control/adoption

The single biggest problem with Dart is, of course, its source. To get the VM in the browsers, Google will have to convince both developers and other browser vendors to either reimplement the VM by themselves or use a Google-issued VM. This is possible, but this will be difficult for Google.

The open vehicle for this is to convince developers to us Dart for server-side programming – where Dart will be competing with Java, Scala, C#, F#, Python, JavaScript, Erlang and even Google’s Go – and for client-side programming by getting through JavaScript – which will severely hinder performance, safety and security.

The vendor controlled vehicle will be to integrate the VM in Chrome and Android and encourage developers targeting the Chrome Market and Android Market to use Dart. Some speculate that this is a manner for Google to get rid of the Java dependency on the Android Market. In this case, of course, there will be little competition.

Libraries and documentation

JavaScript has a host of libraries and considerable documentation. I will admit that much of the documentation one may find around the web is not good (hint: use Mozilla’s documentation, it is the only reliable source I have found), but that is still infinitely more than what Dart can provide at the moment.

In other words, for the moment, Dart cannot take advantage of the special effects, the game-building libraries, the streaming libraries, etc. that have been developed for JavaScript. This, of course, is something that Google has the resources to change relatively fast, but, by experience, I can tell that many developers are averse to relearning.

Doing it without Dart

Security

We’re not going to get rid of XSS without some effort, even with Dart. However, making sure that JavaScript offers an attack surface no larger than Dart is easy: forbid eval() and forbid any inline JavaScript. It would be quite easy to add an option to HTML documents to ensure that this is the case. Note that this option remains necessary even if all the code is written in Dart, as Dart does not prevent from injecting JavaScript.

Code Hygiene

Out-of-the-box, JavaScript does not offer any support for static/hybrid typing. However, Google has demonstrated how to add static typing with the Google Closure Compiler and Mozilla has demonstrated how to add hybrid typing with Dynamic Type Inference. Both projects indicate that we can obtain something at least as good as Dart in JavaScript, without altering/reinventing the language.

Out-of-the-box, JavaScript does not offer modules. However, Mozilla has been offering a module system for JavaScript for years and new versions of the language are in the process of standardizing this.

Also, while classes and private fields are probably the least surprising techniques for application developers coming to the web, developers used to dynamic or functional languages know that closures and prototypes are essentially equivalent. So, this is essentially a matter of taste.

Finally, clean, lexical scoping will be welcomed by all developers who know what these words mean and quite a few others. Fortunately, it is also coming to JavaScript with recent versions of the language.

Concurrency

Isolates are nice. Workers are nice. Isolates are a little easier to set-up, so I would like to see an Isolate-like API for Workers. Other than that, they are essentially equivalent.

Speed

So far, Google has always managed to deliver on speed promises with V8, so I would tend to believe them. However, recent improvements in JavaScript analysis also promise to analyze away all the cases that can make JavaScript slower than Java, and I also tend to believe these promises. Consequently, I will venture no guess about it.

Libraries

It is a shame that JavaScript does not come with more libraries. However, many frameworks are available that implement standard data structures and more.

Exceptions

Dart exceptions are a little nicer than JavaScript exceptions, there no doubt about that. However, making JavaScript exceptions as good as Dart exceptions would be quite simple. The only difficulty is getting this improvement into the standard.

Niceties

String interpolations are nice to have, but not really life-altering. If necessary, they can trivially be implemented by a pre-processor. CoffeeScript might already do it, I’m not sure. Adding this to the JS standard might be tricky, for reasons of backwards compatibility, but there is not much to it.

Dart-style DOM access is nice, too. However, adding this to JavaScript would be quite trivial, in particular with next-generation DOM implementations such as dom.js.

The result

I admit that I am a little disappointed. When Dart was announced, I was hoping something truly evolutionary. So far, what I have found out is a nice language, certainly, but not much more. While Dart is definitely better in many aspects than today’s JavaScript, given the current evolution of JavaScript, none of these aspects is a deal-breaker. However, several aspects of Dart (in particular, typing and exceptions) indicate a good direction in which I believe JavaScript should evolve, and I hope that the presence of Dart can get JavaScript standardization moving faster.

If we consider I my opinion, there are three ways that Google can get Dart adopted on the web:

  • make it the default choice for Android & Chrome development;
  • provide a set of killer libraries for the web, that work on all browsers but are truly usable only with Dart (DirectX anyone? something Cocoa-style, perhaps?);
  • spend Google-sized budgets on adoption (PR, marketing, GSoC, open-source projects, etc.).

Nevertheless, for the moment, I will keep far away from Dart and look hopefully at Scala-GWT, WebSharper or Ocsigen.

Good morning Mozilla!

September 21, 2011 § 3 Comments

It’s an exciting time for us Computer Scientists.

Three years ago, I was an academic researcher, studying the best manner to certify binaries uploaded to clouds and ensure their security. Less than three weeks ago, I was working at MLstate, leading the R&D team and striving to change the face of web programming languages and web security. Today, I am part of Mozilla, as a proud member of the Performance team, where we take steps, each day, to make Firefox, Thunderbird and the Mozilla platform faster and more responsive.

We, at Mozilla, have considerable work ahead of us. Because the web needs to be defended, kept open and compatible with privacy, and, on some of these fronts, only Mozilla stands.

An exciting time, indeed – and an exciting way of doing my part!

I will try and blog about ongoing developments as soon as I find a minute.

Edit It seems that I misredacted some of the announces that were sent. Just to clarify, I am not the leader of the Performance team. That person is Taras Glek.

Crowdsourcing the syntax

May 30, 2011 § 18 Comments

Feedback from Opa testers suggests that we can improve the syntax and make it easier for developers new to Opa to read and write code. We have spent some time both inside the Opa team and with the testers designing two possible revisions to the syntax. Feedback on both possible revisions, as well as alternative ideas, are welcome.

A few days ago, we announced the Opa platform, and I’m happy to announce that things are going very well. We have received numerous applications for the closed preview – we now have onboard people from Mozilla, Google and Twitter, to quote but a few, from many startups, and even from famous defense contractors – and I’d like to start this post by thanking all the applicants. It’s really great to have you guys & gals and your feedback. We are still accepting applications, by the way.

Speaking of feedback, we got plenty of it, too, on just about everything Opa, much of it on the syntax. This focus on syntax is only fair, as syntax is both the first thing a new developer sees of a language and something that they have to live with daily. And feedback on the syntax indicates rather clearly that our syntax, while being extremely concise, was perceived as too exotic by many developers.

Well, we aim to please, so we have spent some time with our testers working on possible syntax revisions, and we have converged on two possible syntaxes. In this post, I will walk you through syntax changes. Please keep in mind that we are very much interested in feedback, so do not hesitate to contact us, either by leaving comments on this blog, by IRC, or at feedback@opalang.org .

Important note: that we will continue supporting the previous syntax for some time and we will provide tools to automatically convert from the previous syntax to the revised syntax.

Let me walk you through syntax changes.

« Read the rest of this entry »

Unbreaking Scalable Web Development, One Loc at a Time

May 23, 2011 § 18 Comments

The Opa platform was created to address the problem of developing secure, scalable web applications. Opa is a commercially supported open-source programming language designed for web, concurrency, distribution, scalability and security. We have entered closed beta and the code will be released soon on http://opalang.org, as an Owasp project .

If you are a true coder, sometimes, you meet a problem so irritating, or a solution so clumsy, that challenging it is a matter of engineering pride. I assume that many of the greatest technologies we have today were born from such challenges, from OpenGL to the web itself. The pain of pure LAMP-based web development begat Ruby on Rails, Django or Node.js, as well as the current NoSQL generation. Similarly, the pains of scalable large system development with raw tools begat Erlang, Map/Reduce or Project Voldemort.

Opa was born from the pains of developing scalable, secure web applications. Because, for all the merits of existing solutions, we just knew that we could do much, much better.

Unsurprisingly, getting there was quite a challenge. Between the initial idea and an actual platform lay blood, sweat and code, many experiments and failed prototypes, but finally, we got there. After years of development and real-scale testing, we are now getting ready to release the result.

The parents are proud to finally introduce Opa. « Read the rest of this entry »

Where Am I?

You are currently browsing the Sûreté / Security category at Il y a du thé renversé au bord de la table.

Follow

Get every new post delivered to your Inbox.

Join 33 other followers