[Rant] Web development is just broken

April 18, 2011

No, really, there’s something deeply flawed with web development. And I’m not (just) talking about browser incompatibilities, or Ruby vs. Java vs. PHP vs. anything else, or Flash or HTML5. I’m talking about something deeper and more fundamental.

Edit: Added TL;DR.

Edit: Interesting followups on Hacker News and Reddit.

Edit: Added a comparison with PC programming.

Edit: Added a follow-up introducing the Opa project.

TL;DR

The situation

  • Plenty of technologies somehow piled on top of each other.

My nightmares:

  • Dependency nightmare
  • Documentation nightmare
  • Scalability nightmare
  • Glue nightmare
  • Security nightmare

Just to clarify: None of these nightmares is about having to make choices.

My dream:

  • Single computer development had the same set of issues in the 80s-90s. They’re now essentially solved. Let’s do the same for the web.

Shop before you code

Say you want to write a web application. If it’s not a trivial web application, chances are that you’ll need to choose:

  • a programming language;
  • a web server;
  • a database management system;
  • a server-side web framework;
  • a client-side web framework.

And you do need all the above. A programming language, because you’re about to program something, so no surprise here. You also need a web server, because you’re about to write a web application and you need to deliver it, so again, no surprise.

A database management system, because you’ll want to save data and/or to share data, and it’s just too dangerous to access the file system directly. Strangely, though, your programming language will give you access to the file system anyway, and it’s somewhere else, at the operating system layer, that you’ll have to restrict this. Now, depending on your application and your DBMS, your data may fit the DBMS perfectly, but often that’s not the case, because your application manipulates object-oriented data structures, while your DBMS manipulates either records and relations or keys and values. At this stage, you have two possibilities: either you force-feed your data into your database – essentially reinventing (de)serialization and storage on top of an antagonistic technology – or you add to your stack a form of Object-Relational Mapper (ORM) to do this for you.
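To make the mismatch concrete, here is a minimal sketch of the hand-written mapping you end up with when you skip the ORM. It uses Python and the standard sqlite3 module purely for illustration; the Post class, its tags and the two-table layout are hypothetical names invented for this example, not anything prescribed by a particular framework.

```python
import sqlite3
from dataclasses import dataclass, field
from typing import List


@dataclass
class Post:
    # Hypothetical application object: a post that owns a list of tags.
    title: str
    tags: List[str] = field(default_factory=list)


def create_schema(conn: sqlite3.Connection) -> None:
    # The relational side has no notion of "a list inside an object",
    # so the tags have to live in a second table.
    conn.execute("CREATE TABLE post (id INTEGER PRIMARY KEY, title TEXT NOT NULL)")
    conn.execute("CREATE TABLE tag (post_id INTEGER NOT NULL, name TEXT NOT NULL)")


def save_post(conn: sqlite3.Connection, post: Post) -> int:
    # Hand-written "serialization": one object becomes rows in two tables.
    cur = conn.execute("INSERT INTO post (title) VALUES (?)", (post.title,))
    post_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO tag (post_id, name) VALUES (?, ?)",
        [(post_id, name) for name in post.tags],
    )
    conn.commit()
    return post_id


def load_post(conn: sqlite3.Connection, post_id: int) -> Post:
    # Hand-written "deserialization": rows from two tables become one object.
    row = conn.execute("SELECT title FROM post WHERE id = ?", (post_id,)).fetchone()
    if row is None:
        raise KeyError(post_id)
    tag_rows = conn.execute(
        "SELECT name FROM tag WHERE post_id = ? ORDER BY rowid", (post_id,)
    )
    return Post(title=row[0], tags=[name for (name,) in tag_rows])
```

Two functions and two tables for one tiny object, and none of it is a feature of your application. An ORM generates roughly this for you, at the price of one more component to choose and configure.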

You need a server-side framework, too – perhaps more than one – because, let’s face it, at this stage, your (empty) application already feels so complicated that you’ll need all the help you can get to avoid having to reinvent templating, POST/GET management, sessions, etc. Oh, actually, what I wrote above is not quite true: depending on your framework, you may need some access to the file system for your images, your pages, and all the other stuff that may or may not fit naturally in the DBMS. So back to the OS layer to configure it more finely.

And finally, you’ll certainly want to write your application for several browsers. As all developers know, there is no such thing as a pair of compatible browsers, and since you certainly don’t want to spend all of your time juggling with (largely undocumented) browser incompatibilities and/or limitations of the JavaScript standard library, it’s time to add a client-side framework, perhaps more.

So, in addition to the first list, you probably have to choose and configure:

  • an OS;
  • OS-level security layers;
  • an ORM (unless you’re reinventing yours) or an approximation thereof.

At this stage, you haven’t written “Hello, world” yet.

On the other hand, you’re about to enter the dependency nightmare, because not all web servers fit with all frameworks, not all server-side frameworks fit with all client-side frameworks, or with all ORMs, or with all OSes, not to mention incompatibilities between OS and DBMS, etc. You have also entered the documentation nightmare, because information on how to configure the security layers of your OS is scarce at best, and of course totally separate from information on how to configure your DBMS, or your ORM, or your frameworks, etc.

Note that I haven’t mentioned anything about scaling up yet, because the scaling nightmare would deserve a complete post.

Sure, you will solve all of these issues. You will handpick your tools, discard a few, and, since you’re a developer (you’re a developer, right?), you’ll eventually assemble a full development platform in which every technology somehow agrees to talk to its neighbors. Heaven forbid that you make a mistake at that stage, because once you start with actual coding, there will be no coming back, but yes, you’re now ready to code.

At this stage, a few questions cross my mind:

  • You have reached that stage because you have the time and skills to do this, but what about Joe Beginner? Do they deserve this?
  • Remember that you haven’t written “Hello, world” yet. These hours of your life you have spent to get to this stage, do you have a feeling that they were well-spent?
  • What if you made a mistake, i.e. what if something is subtly incompatible but you haven’t noticed yet, or one of the technologies you’re using is deprecated, or doesn’t match your security policy? How much time will you spend rooting out all the stuff that’s hardwired to this technology?

So, yes, for all these reasons, I decree that web development is broken. But that’s not all there is to it.

So you have started coding. Good for you.

Now, you have a set of tools that should be sufficient to develop your application – again, possibly not for scaling it up, but that’s a different story. So, you can start coding.

Welcome to the third nightmare: the glue nightmare. What’s the glue? Well, it’s that sticky stuff that you put between two technologies that don’t know about each other, that don’t really fit with each other, but that you need to get working together.

You have data on the client and you want to send it to the server. Time to encode it as JSON or XML, send it with Ajax, open an Ajax entry point on the server (temporary? permanent?), parse the received data on the server (what do you do if it doesn’t parse?), decode the parsed data into your usual data structures, validate the values in these data structures (or should you have done that before decoding?), and then use them (I really hope that you have validated everything carefully). That was the easy part. Now, say you have data on the server and you want to send it to the client. Time to encode it as JSON or XML, send it with Comet – oops, there’s no such thing as “sending with Comet”, so you should open an Ajax entry point on the server (same one? temporary? permanent?) and let the client come and fetch the data (how do you ensure it’s the right client?). Plus the usual parsing and decoding. Except the server code you wrote for parsing and decoding doesn’t work in your browser. Plus, be careful with parsing, because you can get some nasty injections at that stage, or you can just crash a number of JS engines accidentally. Add a little debugging, some more work on garbage collection, and you can send “Hello” from the client to the server or from the server to the client.
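For the record, here is roughly what the server half of that “Hello” looks like once you write it down. This is a sketch in Python with only the standard library, not a reference implementation: the Greeting structure, the BadRequest exception and the 1024-byte limit are all assumptions made up for the example.

```python
import json
from dataclasses import dataclass


class BadRequest(Exception):
    """Raised when the received payload cannot be trusted."""


@dataclass
class Greeting:
    # Hypothetical message format: who says what.
    sender: str
    text: str


MAX_BYTES = 1024  # arbitrary limit, chosen only for this example


def decode_greeting(raw: bytes) -> Greeting:
    # Step 1: refuse oversized input before parsing anything.
    if len(raw) > MAX_BYTES:
        raise BadRequest("payload too large")
    # Step 2: parse. Bad UTF-8 and malformed JSON both raise ValueError
    # subclasses; absurdly deep nesting can raise RecursionError.
    try:
        data = json.loads(raw.decode("utf-8"))
    except (ValueError, RecursionError) as exc:
        raise BadRequest("payload is not valid JSON") from exc
    # Step 3: decode into our own data structure, checking the shape.
    if not isinstance(data, dict):
        raise BadRequest("expected a JSON object")
    sender = data.get("sender")
    text = data.get("text")
    if not isinstance(sender, str) or not isinstance(text, str):
        raise BadRequest("'sender' and 'text' must be strings")
    # Step 4: validate the values themselves.
    if not sender or not text:
        raise BadRequest("'sender' and 'text' must not be empty")
    return Greeting(sender=sender, text=text)
```

Calling decode_greeting(b'{"sender": "client", "text": "Hello"}') yields a Greeting; anything else should end in BadRequest. Note that this is only the receiving end, for one message type, in one direction, and that the client needs its own version of the same checks in another language.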

Again, the question is not: “can you get this to work?” – I’m sure that you can, many of us do this on a regular basis. The questions are more:

  • was this time well-spent?
  • are you sure that it works?
  • really, really sure?
  • even if browsers can crash?
  • even if users are malicious?
  • how can you be certain?

The client-server glue doesn’t stop here – if only we were so lucky. There’s more for handling forms or uploads, or for injecting user-generated content into pages, but let’s move on to server-side glue.

Storage is full of glue, too. You have data that fits your application and you’ll want to send it to your DBMS. Now, either you’re using an ORM or you’re encoding the data manually in a manner that somehow fits your database paradigm. That’s already quite a sizable layer of glue, but that’s not all. Sending your data to your DBMS means opening a connection, somehow serializing your data (I hope it’s well-validated, too, just in case someone attempts to inject bogus database requests), somehow serializing your database request, handling random disconnections and sometimes unpredictable (de)serialization errors (by the way, I hope you made sure that the database could never be in an inconsistent state, even if the browser crashes), somehow (de)serializing database responses (can you handle the database not responding at all?) and reconnecting in case of disconnection. Oh, and since your database and your application are certainly based on distinct security paradigms, you’ll have to set up both your application security and your database security, and you’ll have to ensure that they work together. Did I mention handling distinct encodings? Ensuring that temporary bindings are never stored in the database? Performing garbage collection in the database?
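The “bogus database requests” part deserves a small illustration. The sketch below uses Python’s standard sqlite3 module; the users table and both lookup functions are hypothetical. The first function builds the request by string concatenation, the second passes the value as a bound parameter, so that the driver keeps data and request apart.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    # In-memory database with a hypothetical users table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("alice",), ("bob",)])
    conn.commit()
    return conn


def find_user_unsafe(conn: sqlite3.Connection, name: str) -> list:
    # DON'T: the user-supplied string becomes part of the request itself.
    query = "SELECT id FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user(conn: sqlite3.Connection, name: str) -> list:
    # DO: the value travels separately from the request text.
    return conn.execute("SELECT id FROM users WHERE name = ?", (name,)).fetchall()


if __name__ == "__main__":
    conn = make_demo_db()
    evil = "x' OR '1'='1"
    print(find_user_unsafe(conn, evil))  # matches every row in the table
    print(find_user(conn, evil))         # matches nothing
```

The safe version is not harder to write. The trouble is that nothing in the language stops you from writing the unsafe one.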

Glue doesn’t stop here, of course. Do I need to get started on REST or SOAP or WSDL?

Again, you’ll solve all these issues, eventually. But if you’re like me, you’ll wonder why you had to spend so much time on so little. These are not features, they are not infrastructure, they are not fun, they’re just glue, to get a shambles of technologies – some of which date back to the 1970s – to work together.

Oh, and at this stage, chances are that you have plenty of security holes. Welcome to the security nightmare. Because your programming language has no idea that this piece of XML or JSON or string will end up being displayed in a user’s browser (XSS, anyone?) or stored in the database (SQLi or CouchDB injection, anyone?), or that this URI is actually a command with potentially dangerous side-effects (sounds like an XSRF), etc. By default, any web application is broken, security-wise, in dozens of ways. Most of the web application tutorials I’ve seen on the web contain major security holes. And that’s just not right. If you look at these security issues, you’ll realize that most of them are actually not in the features you coded, not even in the infrastructure you assembled, but in the glue itself. Worse than that: many of these issues are actually trivial things that could be solved once and for all, but we are stuck using tools that don’t even attempt to solve them. So, chances are that in your application, no matter how good you are, you will forget to validate and/or escape data at least once, that you will forget to authenticate before giving access to one of your resources, or that you will forget something that somehow will let a malicious user get into your app and cause all sorts of chaos.
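Here is how small the forgotten step typically is. The sketch below, in Python with the standard html module, renders a user comment into a page fragment; the two render functions and the markup are hypothetical. The only difference between the hole and the fix is one call to html.escape per value.

```python
import html


def render_comment_unsafe(author: str, body: str) -> str:
    # DON'T: user input is pasted into the page as if it were markup.
    return "<p><b>" + author + "</b>: " + body + "</p>"


def render_comment(author: str, body: str) -> str:
    # DO: escape every user-supplied value before it reaches the page.
    return "<p><b>{}</b>: {}</p>".format(html.escape(author), html.escape(body))


if __name__ == "__main__":
    payload = "<script>alert(1)</script>"
    print(render_comment_unsafe("mallory", payload))  # script tag survives intact
    print(render_comment("mallory", payload))         # script tag is neutralized
```

Both functions have the same type, take the same strings and return a string, which is exactly the problem: to the language, the broken version and the correct one look the same.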

I rest my case. Web development is broken.

History of brokenness

But if we look at it more closely, a few years ago, so was PC development. Any non-trivial application needed:

  • application-level code, of course;
  • memory-level hackery just to get past the 640 KB barrier (or, equivalently, the handle limitation under Windows – 64 KB iirc);
  • IRQ-level coding and/or Windows SDK-level coding (the first one was rather fun, the second was a complete nightmare, neither was remotely meant for anybody who was not seriously crazy);
  • (S)VGA BIOS level hackery to get anything cool to display;
  • BIOS-level fooling.

Five layers of antagonistic technologies that needed to get hacked into compliance. The next generation was represented by the NeXT frameworks and attempts to bring the same amount of comfort to Windows-land (OWL, MFC, etc.). Still fragile and complex, but a huge improvement. And the generation after that was Java/C#/Python programming. Nowadays, you can install/apt-get/emerge/port install your solution, start coding right away, and be certain that you have everything you need. Nightmare solved.

On the web, we’re still stuck somewhere between the first generation and the second. Why don’t we aim for the third?

A manifesto for web development that works

Time to stop the rant and start thinking positive. All the above is web development that’s broken. Now, what would not-broken web development look like?

Let’s go for the following:

I want to start coding now

        without having to learn configuration, dependencies or deployment;

I don’t want to write any glue

        the web is one platform, time to stop forcing us to treat it as a collection of heterogeneous components;

I don’t want to repeat myself

        so don’t force me to write several validators for the same data, or several libs that do the same thing in different components of my web application;

I don’t care about browser wars

        my standard toolkit must work on all browsers, end of the story;

Give me back my agility

        I want to be able to make major refactorings, to move code between the client, the server and the database component without having to rewrite everything;

Secure by default

        all the low-level security issues must be handled automatically and transparently by the platform, not by me. I’ll concentrate on the high-level application-specific issues.

All of this is definitely possible. So please give me this and you’ll make many coders much happier.

Disclaimer: My company builds related technology. However, this blog expresses my personal views.
