Unbreaking Scalable Web Development, One Loc at a Time

May 23, 2011 § 18 Comments

The Opa platform was created to address the problem of developing secure, scalable web applications. Opa is a commercially supported open-source programming language designed for the web, concurrency, distribution, scalability and security. We have entered closed beta and the code will be released soon on http://opalang.org, as an OWASP project.

If you are a true coder, you sometimes meet a problem so irritating, or a solution so clumsy, that challenging it becomes a matter of engineering pride. I assume that many of the greatest technologies we have today were born from such challenges, from OpenGL to the web itself. The pain of pure LAMP-based web development begat Ruby on Rails, Django or Node.js, as well as the current NoSQL generation. Similarly, the pains of scalable large system development with raw tools begat Erlang, Map/Reduce or Project Voldemort.

Opa was born from the pains of developing scalable, secure web applications. Because, for all the merits of existing solutions, we just knew that we could do much, much better.

Unsurprisingly, getting there was quite a challenge. Between the initial idea and an actual platform lay blood, sweat and code, many experiments and failed prototypes, but finally, we got there. After years of development and real-scale testing, we are now getting ready to release the result.

The parents are proud to finally introduce Opa.

Different means to different ends

Opa is a new approach to scalable, secure web development.

The core idea behind Opa is that once you use the right paradigm, scalability, security and the web model just happen naturally.

To implement our idea, we had to provide developers with a programming language that was:

  1. powerful enough to describe the complete behavior of the web application, including user interface, interactivity, concurrency, general-purpose computations and database manipulation;
  2. clean enough to support automated security analysis;
  3. high-level enough to support transparent distribution, optimization and injection of security checks;
  4. understandable by any developer.

This is not a benign idea. Most approaches to web development, to security or to scalability rely on libraries, external tiers, or reflection. For all their merits, and even when applied to the best/most modular/most extensible programming languages available, these techniques remain heavily rooted in whichever paradigm is best handled by that language. Some of the results can be impressive – including your favorite framework, whichever it may be – but in the end, they are necessarily limited by the underlying tools. Unfortunately, we could not find any existing language – whether static, dynamic or hybrid – that could meet all these criteria. So, we had to build our own.

This was also not an easy idea for us. We spent years designing, testing and fine-tuning our paradigm, as well as ensuring that the result was indeed usable by any developer.

Opa is a new programming language and its runtime environment

With Opa, write your complete application in just one language, and the compiler will transform it into a self-sufficient executable containing:

  • server-side code;
  • client-side code (cross-browser JavaScript and HTML, generated automatically from your source code);
  • database code (compiled queries for our own NoSQL, scalable database);
  • distribution code;
  • all the glue to connect everything to everything else;
  • security checks at boundaries;
  • the HTTP server itself;
  • the database engine itself;
  • the distribution layers themselves.

Launch this executable locally, or ask it to deploy itself on any number of servers, and your web application is running. Do not deploy or configure a DBMS. Do not deploy or configure a web server. Do not deploy or configure a distributed file system. It just works.

Programming with Opa

Opa may be a new language, but it is quite understandable if you have some notion of web development. Let me show you a few simple but complete applications. Should you wish to play with them, I have uploaded the source code of each application to GitHub under the AGPL.

Hello, web

First variant: 1 eloc

 server = one_page_server("Hello", -> <>Hello, web!</>)

Second variant: 2 eloc

server = Server.simple_dispatch(_ ->
  html("Hello", <>Hello, web!</>)
)

Build & launch:

$ opa hello_web.opa
$ ./hello_web.exe

That’s it. Your application is launched and you can connect to it with any (recent) browser.

A minimal (distributed, load-balanced) key-value store

Source code, in 17 eloc:

/**
 * Add a path called [/storage] to the schema of our graph database.
 *
 * This path is used to store one value with type
 * [stringmap(option(string))]. A [stringmap] is a dictionary.
 * An [option(string)] is an optional [string],
 * i.e. a value that may either be a string or omitted.
 *
 * This path therefore stores an association from [string]
 * (the key) to either a [string] (the value) or nothing
 * (no value).
 */
db /storage: stringmap(option(string))

/**
 * Handle requests.
 *
 * @param request The uri of the request. The URI is converted to
 * a key in [/storage], the method determines what should be done,
 * and in the case of [{post}] requests, the body is used to set
 * the value in the db
 *
 * @return If the request is rejected, [{method_not_allowed}].
 * If the request is a successful [{get}], a "text/plain" resource
 * with the value previously stored. If the request is a [{get}] to
 * an unknown key, a [{wrong_address}].
 * Otherwise, a [{success}].
 */
dispatch(request) =
(
  key = List.to_string(request.uri.path)
  match request.method with
   | {some = {get}}    ->
     match /storage[key] with
       | {none}        -> Resource.raw_status({wrong_address})
       | {some = value}-> Resource.raw_response(value,
               "text/plain", {success})
     end
   | {some = {post}}   ->
         do /storage[key] <- request.body
         Resource.raw_status({success})
   | {some = {delete}} ->
         do /storage[key]
         do Db.remove(@/storage[key])
         Resource.raw_status({success})
   | _ -> Resource.raw_status({method_not_allowed})
  end
)

/**
 * Main entry point: launching the server.
 */
server = Server.simple_request_dispatch(dispatch)

Build:

$ opa opa_storage.opa

Launch on one server

$ ./opa_storage.exe

Or auto-deploy and launch on several servers:

$ opa-cloud opa_storage.exe --host localhost --host me@host1 --host me@host2

Again, that’s it. Key/value pairs are replicated/distributed on the various nodes (default settings are generally OK, but the replication factor can be configured if necessary), requests are load-balanced, and it just works.

Just as importantly, note that we have not written a single line of code to ensure security with respect to database injection. By construction, Opa automatically ensures that such injections cannot happen.
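For readers who do not read Opa yet, the request handling above can be modeled in a few lines of Python. This is only a sketch of the behavior documented in the comments of dispatch, not Opa's implementation: the dictionary `storage` stands in for the /storage database path, and the numeric status codes (200, 404, 405) are assumed HTTP equivalents of {success}, {wrong_address} and {method_not_allowed}.

```python
from typing import Dict, Optional, Tuple

# Hypothetical in-memory stand-in for the [/storage] database path.
# Like the Opa schema, each key maps to an optional string.
storage: Dict[str, Optional[str]] = {}

# Assumed HTTP equivalents of the Opa status values used above.
SUCCESS = 200             # {success}
WRONG_ADDRESS = 404       # {wrong_address}
METHOD_NOT_ALLOWED = 405  # {method_not_allowed}


def dispatch(method: str, path: str, body: Optional[str] = None) -> Tuple[int, str]:
    """Model one request; return (status, response body)."""
    key = path  # the request path is used directly as the key
    if method == "GET":
        value = storage.get(key)
        if value is None:
            return WRONG_ADDRESS, ""
        return SUCCESS, value
    if method == "POST":
        storage[key] = body  # a missing body stores "no value"
        return SUCCESS, ""
    if method == "DELETE":
        storage.pop(key, None)  # removing an unknown key is not an error
        return SUCCESS, ""
    return METHOD_NOT_ALLOWED, ""
```

In this model the key is only ever used as a dictionary key; it is never pasted into a query string, which is the property that rules out injection here.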

Real-time web chat

Source code, in 20 eloc:

/**
 * {1 Network infrastructure}
 */

/**
 * The type of messages sent by a client to the chatroom
 */
type message = {author: string /**Arbitrary, untrusted, name*/
              ; text: string  /**Content entered by the user*/}

/**
 * A structure for routing and broadcasting values of type
 * [message].
 *
 * Clients can send values to be broadcast or register
 * callbacks to be informed of the broadcast. Note that
 * this routing can work cross-client and cross-server.
 *
 * For distribution purposes, this network will be
 * registered to the network as "mushroom".
 */
room = Network.cloud("mushroom"): Network.network(message)

/**
 * {1 User interface}
 */

/**
 * Update the user interface in reaction to reception of a message.
 *
 * This function is meant to be registered with [room] as a callback.
 * Its sole role is to display the new message in [#conversation].
 *
 * @param x The message received from the chatroom
 */
user_update(x) =
(
  line = <div>
     <div>{x.author}:</div>
     <div>{x.text}</div>
  </div>
  do Dom.transform([#conversation +<- line ])
  Dom.scroll_to_bottom(#conversation)
)

/**
 * Broadcast text to the [room].
 *
 * Read the contents of [#entry], clear these contents and send
 * the message to [room].
 *
 * @param author The name of the author. Will be included in the
 * broadcast message.
 */
broadcast(author) =
  do Network.broadcast({author=author text=Dom.get_value(#entry)}, room)
  Dom.clear_value(#entry)

/**
 * Build the user interface for a client.
 *
 * Pick a random author name which will be used throughout the chat.
 *
 * @return The user interface, ready to be sent by the server to the client
 * on connection.
 */
start() =
(
    author = Random.string(8)
    <div id=#conversation
         onready={_ -> Network.add_callback(user_update, room)}></div>
    <input id=#entry onnewline={_ -> broadcast(author)}/>
    <div class="button" onclick={_ -> broadcast(author)}>Send!</div>
)

/**
 * {1 Application}
 */

/**
 * Main entry point.
 *
 * Construct an application called "Chat" (users will see the name in the title bar),
 * embedding statically the contents of directory "resources", using the global
 * stylesheet "resources/css.css" and the user interface defined in [start].
 */
server = Server.one_page_bundle("Chat",
    [@static_resource_directory("resources")],
    ["resources/css.css"], start)

Fork me on github

Build and launch as above:

$ opa opa_chat.opa

$ opa-cloud opa_chat.exe --host localhost --host me@host1 --host me@host2

Users connecting to the launch server are load-balanced among servers. Users connecting to one server can chat transparently with users connected to other servers.

Just as importantly, note that we have not written a single line of code to ensure security with respect to Cross-Site Scripting. Still, you can try and inject code in this application – and you will fail. Opa has transparently ensured that this cannot happen.
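To make the claim concrete, here is roughly what a developer has to remember to do by hand in a language without this guarantee. The Python sketch below builds the same kind of line as user_update and escapes both untrusted fields with the standard `html.escape`. The function name `render_line` is hypothetical, and this illustrates the class of check only, not how Opa implements it.

```python
import html


def render_line(author: str, text: str) -> str:
    """Build the HTML for one chat line, escaping both untrusted fields."""
    safe_author = html.escape(author)
    safe_text = html.escape(text)
    return f"<div><div>{safe_author}:</div><div>{safe_text}</div></div>"


if __name__ == "__main__":
    # An injection attempt ends up displayed as text, not executed.
    print(render_line("mallory", "<script>alert(1)</script>"))
```

Forgetting a single `html.escape` call reintroduces the hole, which is why it is attractive to have the compiler insert the check.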

Our experience with Opa

We have used Opa to develop a number of web applications, including CMSes, online games, high-security communication tools and e-Commerce apps.

What can I tell you? In our experience, Opa is awesome 🙂 It saves us considerable amounts of time and pain, and it has vastly extended the size of projects that we can undertake with small agile teams.

Firstly, Opa transparently handles all communications between the client and the server, and can generate JavaScript or server binary code from the same source, depending on what is required. This considerably simplifies prototyping and agile development, by letting us concentrate on getting things to work first, experimenting and showing to clients second, and freezing the design only much later. Countless times, this also made us far more flexible with respect to design changes, by letting us instantaneously move (or reuse) server code on the client, or in the database, or vice-versa, without having to port from one language to another, or to reimplement communication protocols, or validation, or to redesign for asynchronicity. The added benefit of automated XSS protection also considerably improved our confidence in such agile code.

Secondly, Opa’s paradigm is a natural match for scalability concerns. It favors stateless services, makes sure that state can be easily marked as local (e.g. caches) or shared (e.g. accounts), and it also makes it quite easy to place local caches in front of anything shared. Most of our applications written on one server worked even better on several servers, out-of-the-box. To push scalability even further, marking data as local/shared/cached is extremely simple, which has always helped us experiment quickly, before deciding whether to push such optimizations into production.
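As an illustration of the local/shared distinction, here is a minimal Python sketch of a local read-through cache placed in front of a shared store. The class names `SharedStore` and `LocalCache` are hypothetical and the sketch ignores invalidation. Opa's own mechanism for marking data is not shown here, so read this as a picture of the idea rather than of Opa's API.

```python
from typing import Dict, Optional


class SharedStore:
    """Stand-in for state shared between servers (e.g. accounts)."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        self.reads = 0  # counts how often the shared store is actually hit

    def get(self, key: str) -> Optional[str]:
        self.reads += 1
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        self._data[key] = value


class LocalCache:
    """Per-server read-through cache in front of a SharedStore."""

    def __init__(self, backend: SharedStore) -> None:
        self._backend = backend
        self._cache: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        if key in self._cache:
            return self._cache[key]      # served locally
        value = self._backend.get(key)   # one trip to shared state
        if value is not None:
            self._cache[key] = value
        return value
```

Repeated reads of the same key then reach the shared store only once per server, at the price of possibly stale values until some invalidation policy is added.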

On the security side, I’m not sure exactly how many man·months Opa saved us by guaranteeing that we were automatically safe against injections (including XSS and SQL/SQL-like), and I’m not quite sure how to measure it, but this definitely relieved us of plenty of work, stress and emergency calls.

How does this work?

We make it work 🙂

More seriously, last time we counted, including testing, around 100 man·years of R&D had been spent on Opa. We took advantage of that time to make Opa the best solution we could imagine. I’ll try and explain some of the key techniques progressively, in a series of blog entries.

Limitations

We are extremely proud of everything that is possible with Opa, but, like any product, Opa has limitations.

Firstly, while the Opa compiler and runtime can perform very aggressive optimizations on distribution and database requests for instance, for the moment, some of these optimizations cannot be performed automatically. In such cases, a developer needs to annotate the code here and there, to mark code chunks as safe for such optimizations. We have a number of plans to push forward the automation of these optimizations, but we haven’t had a chance to implement them yet.

Other limitations are related to our objectives. Opa is designed for security on the web. Consequently, a number of primitives that are just too dangerous are not accessible to Opa developers, so don’t expect to encounter innerHTML, eval(), document.write() or execvp(), for instance. These primitives are available as part of the platform, should you wish to work on extending the runtime, but not as part of the language/library.

Also, as we dedicated Opa to the web, do not look too hard for Gtk or DirectX bindings – nothing prevents such system bindings, but we have no plans to introduce them. Similarly, Opa is designed for scalability, so the language favors stateless programming, or when state is required, as in our web chat, states that can be shared between several instances of a server. So, while Opa will let you write an application with messy state, the design of the language will try and guide you another way.

We also have a few other limitations that may be considered anecdotal in this day and age. For instance, the client side of Opa applications that have a client (i.e. non-pure web services) requires JavaScript and will not work with IE6 or Lynx.

Show me the code!

Soon, but not quite yet.

We’re working full-time on the open-source release. If you are interested, I suggest you visit opalang.org to find some information and documentation or to request invitations to the closed beta. You can also follow our updates on Twitter or come and chat with us on IRC.


§ 18 Responses to Unbreaking Scalable Web Development, One Loc at a Time

  • Lucian says:

    Simple, almost trivial, examples as AGPL?
    What do you think beginners will use to create applications?
    Why *force* them to not use these examples to build something from there?

    • yoric says:

      I’m not sure I understand the question. Is it that you have an issue with AGPL?

    • foo says:

      What’s wrong with AGPL?
      Even for a 20-lines app, you have another way to organize your code and name your values — should you wish.
      And the AGPL can lead to more free web apps.
      /me likes

  • If I may. As in the documentation, you really should try to have shorter lines.
    On my browser, I can’t easily see the right end of the long lines, because they are not on the page. I have to scroll in an inner window, which makes little sense.

    And I guess that it is usually considered to be “best practice” and more readable to keep lines short. Except, of course, if OPA doesn’t permit it.

    • yoric says:

      Thanks for the feedback, is this any better?

      Note 1: Just finished a reformatting pass on the manual. It should now be easier to read.
      Note 2: That’s one of the reasons why I uploaded the examples to github. WordPress is definitely not the most source code-friendly tool around.

  • Dan says:

    No, he’s right. AGPL on hello world is pointless, ignorant of the effect of the license and frankly, enough to stop my interest right here.

    You can BSD it – or you can just do public domain. Otherwise this language is DOA.

    • yoric says:

      Would you care to elaborate? Are you telling me that you are deciding to choose a language based on the license of examples on a blog post?

      • Dan says:

        No, I’d avoid a language in which incredibly basic code is now subject to a viral license.

        If this code is subject to the AGPL:

        server = one_page_server("Hello", -> <>Hello, web!</>)

        It seems hard to write a program that would not be derivative (please read section 5 of your license).

        Foo’s suggestion of changing the names doesn’t hold up to the level of scrutiny that we’ve seen when these issues come to trial.

        Would I be in compliance with Apache’s license if the changes I make are just to rename all of the variables or change the name to A_Patchy?

        Of course not.

      • yoric says:

        >No, I’d avoid a language in which incredibly basic code is now subject to a viral license.

        What if I offered you dual-licensing on the tutorial code ? 🙂

        More seriously, I believe that I now understand your point, but I’m afraid that we have a core philosophical disagreement here. AGPL was chosen on purpose.

        >Foo’s suggestion of changing the names doesn’t hold up to the level of scrutiny that we’ve seen when these issues come to trial.

        You’re right, definitely.

      • Np237 says:

        Trivial code is not subject to copyright. Therefore the license for it is irrelevant; you can ignore it.

      • Dan says:

        >Trivial code is not subject to copyright. Therefore the license for
        >it is irrelevant; you can ignore it.

        I don’t know… “Trivial” seems subject to interpretation. While its length and simplicity make it seem “trivial”… It also describes a complete functional program – which kind of throws a wrench into “fair use” too, IMO.

        Is a haiku less subject to copyright than a book?

  • Dan says:

    So, it’s a trap. 🙂

    Every program written in OPA will ultimately be subject to the AGPL.

    That seems limiting.

    • yoric says:

      >So, it’s a trap. 🙂

      Gasp, I’m unmasked 🙂

      More seriously, it’s just one of the many topics that I haven’t covered in this blog entry to avoid making it really too long and unreadable.

      >Every program written in OPA will ultimately be subject to the AGPL.

      >That seems limiting.

      We will provide developers with a choice: either contribute to the community by releasing the source code of their application – or contribute to the community by funding us to develop Opa further.

      We believe that this is only fair.

      • Dan says:

        >We believe that this is only fair.

        Whatever you want to do is “fair” – It is yours after all.

        It seems a bit odd that OWASP is on-board with these terms.

        Oh – and scratch that “ignorant of the effect of the license” bit. I totally thought you were going a different direction than this – like toward wide-spread adoption. 🙂

      • yoric says:

        >It seems a bit odd that OWASP is on-board with these terms.

        They’ve been aware of this since day one and I haven’t heard any complaint. On the other hand, I’ve heard some major FOSS groups cheering us on 🙂

        >Oh – and scratch that “ignorant of the effect of the license” bit. I totally thought you were going a different direction than this – like toward wide-spread adoption. 🙂

        You evil person 🙂

    • foo says:

      According to clause 13 of the AGPL, even though every ‘OPApp’ should be released as an open source project (which is unclear, and would probably need to be properly stated anyway), it seems the apps could be released under the GPL.

      And I do believe it is a good thing: We all love the free software we have today but the web is becoming more and more closed. When people use free software to build proprietary applications, users interacting with these services end up in the exact same situation that frustrated RMS and made him create free software. The AGPL is the right answer to free the web. But you might have different goals 😉

      • AP² says:

        “the web is becoming more and more closed”

        No, it’s not. There are more and better OSS web stacks than ever. There is at least one OSS framework for each of: Ruby, Python, Perl, PHP, Javascript (NodeJS), C# (Mono), Lua and more. There are multiple OSS databases, both relational and NoSQL. There are multiple OSS servers (Apache, Lighttpd, Nginx, Tomcat, etc), free OSs (Linux, *BSD) and more people and companies every day using completely Free stacks. The browsers are implementing open standards that can replace much of what Flash was needed for until now.

        The web was never freer.

  • p4bl0 says:

    AP²: you totally miss the point. The technologies are open, but the software developed using those technologies is not, and this is very important. Using a service developed on a free software stack but which is itself closed has the same problem as using closed-source software on your desktop: you don’t know what it does, and you can’t know either.

    With your definition of “free” then Windows is free since it’s mostly developed using C and C++ which are open standards and since there are free compilers for those languages.

You are currently reading Unbreaking Scalable Web Development, One Loc at a Time at Il y a du thé renversé au bord de la table.
