Fighting the good fight for fun and credits, with Mozilla Education

October 30, 2012 § Leave a comment

Are you a student?

Do you want to fight the good fight, for the Future of the Web, and earn credits along the way?

Mozilla Education maintains a tracker of student project topics. Each project is followed by one (or more) mentor from the Mozilla Community.

Then what are you waiting for? Come and pick or join a project or get in touch to suggest new ideas!

Are you an educator?

The tracker is also open to you. Do not hesitate to pick projects for your students, send students to us or contact us with project ideas.

We offer/accept both Development-oriented, Research-oriented projects and not-CS-oriented-at-all projects.

Are you an open-source developer/community?

If things work out smoothly, we intend to progressively open this tracker to other (non-Mozilla) projects related to the future of the web. Stay tuned – or contact us!

Crowdsourcing the syntax

May 30, 2011 § 18 Comments

Feedback from Opa testers suggests that we can improve the syntax and make it easier for developers new to Opa to read and write code. We have spent some time both inside the Opa team and with the testers designing two possible revisions to the syntax. Feedback on both possible revisions, as well as alternative ideas, are welcome.

A few days ago, we announced the Opa platform, and I’m happy to announce that things are going very well. We have received numerous applications for the closed preview – we now have onboard people from Mozilla, Google and Twitter, to quote but a few, from many startups, and even from famous defense contractors – and I’d like to start this post by thanking all the applicants. It’s really great to have you guys & gals and your feedback. We are still accepting applications, by the way.

Speaking of feedback, we got plenty of it, too, on just about everything Opa, much of it on the syntax. This focus on syntax is only fair, as syntax is both the first thing a new developer sees of a language and something that they have to live with daily. And feedback on the syntax indicates rather clearly that our syntax, while being extremely concise, was perceived as too exotic by many developers.

Well, we aim to please, so we have spent some time with our testers working on possible syntax revisions, and we have converged on two possible syntaxes. In this post, I will walk you through syntax changes. Please keep in mind that we are very much interested in feedback, so do not hesitate to contact us, either by leaving comments on this blog, by IRC, or at feedback@opalang.org .

Important note: that we will continue supporting the previous syntax for some time and we will provide tools to automatically convert from the previous syntax to the revised syntax.

Let me walk you through syntax changes.

« Read the rest of this entry »

Let’s do it with Batteries

January 27, 2009 § 6 Comments

Or, OCaml is a scripting language, too.

Note: These extracts use the latest version of Batteries, currently available from the git. Barring any accident, this version should be made public within the next few days.

A few days ago, when writing some code for OCaml Batteries Included, I realized that, to properly embed Camomile’s Unicode transcoding module, I would need to manually write 500+ boring lines, all of them looking like:

| `ascii -> Encoding.of_name "ASCII"

The idea behind that pattern matching was to define a type-safe phantom type for text encodings. Upon installation, Camomile generates a directory containing about 540 files, one per text encoding, and it seemed like a good idea to rely upon something less fragile than a string name.

Of course, writing this pattern-matching manually was out of the question: it was boring, error-prone, and while Batteries deserves sacrifices, it doesn’t quite deserve that level of mind-numbing activities. The alternative was to generate both the list of constructors and the pattern-matching code from the contents of the directory. I could have done it with some scripting language but that sounded like a good chance to test-drive the numerous new functions of the String module of Batteries (73 for 28 in the Base Library).

The main program

The structure of the program is easy: read the contents of a directory. For each file, do some treatment on the file name and print the result:

open Shell
foreach (files_of argv.(1)) do_something

Here, foreach is the same function as iter but with its arguments reversed. It’s sometimes much more readable. Instead of reading the contents of a directory with Shell.files_of, we could just as well have traversed the command-line arguments with args, or read the lines of standard input using IO.lines_of stdin.

Actually, we could just as well generalize to a (possibly empty) set of directories. For this purpose, we just need to map our function files_of to the enumeration of command-line arguments. This yields an enumeration of enumerations, which we turn into a flat enumeration with flatten. In my mind, that’s somewhat nicer and more readable than nested loops.

Our main program now looks like:

open Shell, Enum
foreach (flatten (map files_of (args ()))) do_something

Or, for those of us who prefer operators to parenthesis:

open Shell, Enum
(foreach **> flatten **> map files_of **> args ()) do_something

String manipulation

It’s now time to take a file name and turn it into

  1. a nice constructor name
  2. a file name without extension,

That second point is the easiest, so let’s start with it. We have a function Filename.chop_extension just for this purpose. So, if we were interested only in printing the list of files without their extension, we could define

let do_something x = print_endline (Filename.chop_extension x)

The first point is slightly trickier, as we need to

  1. remove the extension from the file name (done)
  2. prepend character ` (trivial)
  3. replace any illicit character with _ (slightly more annoying, I know that the list of illicit characters which may actually appear in my list of files contains :, -, (, ) and whitespaces but I’d rather not go and check manually  which other characters may turn out problematic)
  4. prepend something before names which start with a digit, as digits cannot appear as the first character of an OCaml constructor (a tad annoying, too)
  5. make everything lowercase, just because it’s nicer (trivial).

Let’s deal with the third item, it’s bound to be central. Let’s see, replacing characters could be done with regular expressions, something I dislike, or with function String.map. It’s nicer, type-safer, and it has a counterpart Rope.map for Unicode, if we ever need one. Now, functions Char.is_letter and Char.is_digit will help us determine which names are safe. Using them together, we obtain the following function:

open Char
let replace s = String.map (fun c -> if is_letter c || is_digit c then c else '_') s

Let’s solve the fourth item on our list. We need to check the first character of a string and to determine whether it’s a digit. Well, we already know how to do this. Let’s call our prefix p:

let clean_digit p s = if is_digit s.[0] then p^s else s

If we chain up everything, we obtain

let constructor p s = "`" ^ (if is_digit r.[0] then p^r else r)
    where         r = lowercase (String.map (fun c -> if is_letter c || is_digit c then c else '_') s)

I like this where syntax.

Format

Now that we have both our strings, we just need to be able to combine and print them. For this purpose, Printf is probably the most concise tool. Here, we can just write

let print s1 s2 = Printf.printf " | %s -> %S\n" s1 s2

We could parameterize upon the format used by printf and we’re bound to do this sooner or later, but let’s keep it simple for now.

The complete program

open Shell, Enum

foreach (flatten **> map files_of **> args ()) do_something
  where do_something s =
   let name = Filename.chop_extension s in Printf.printf " | %s -> %S\n" c name
     where c = "`" ^ (if Char.is_digit r.[0] then "codemap_"^r else r)
     where r = lowercase (String.map (fun c -> if Char.is_letter c || Char.is_digit c then c else '_') name)

I don’t know about you but I find this pretty nice, for a type-safe language. I’m sure it would have been possible to make something shorter in Perl or awk, and suggestions are welcome regarding how to improve this but I’m rather happy. And, once again, we’re not trying to beat Python, Perl or awk in concision, just to do something comparably good, because we already beat them by far in speed and safety.

So, what do you think?

OCaml Batteries Included, version alpha

October 11, 2008 § 4 Comments

It has landed

The first alpha version of OCaml Batteries Included has landed. It is now available as source code, from the Forge or as a GODI package. You may find the API documentation here and the complete documentation there.

Remember, this is alpha-level code, use at your own risk. Also note that documentation generation is very slow (10+ minute on my laptop), so don’t worry if installation seems to last forever, it’s nearly true but not quite.

Reviews, comments, suggestions and bug reports are particularly welcome. We have a few trackers for these, as well as a forum, so don’t hesitate to use them. We’re also looking for volunteers to give us a hand, so please consider stepping forward.

What is OCaml Batteries Included?

Twenty years ago, a language was just a compiler. You could measure the quality of a language from the beauty of its semantics, the clarity of its syntax, the speed of generated code. That was then. In the meantime, the Java and .Net nuclear plants have been built, while the Python and Ruby communities have gradually developed their own powerhouses. All these platforms have amply demonstrated that a language can only be as beautiful, clear and fast as the libraries which developers are actually going to use for their work. In other words, it’s not about the language anymore, it’s about the platform.

At this point in time, out-of-the-box, OCaml isn’t a usable platform. There is no Unicode, there are no modern user interface toolkits, no distributed programming infrastructures, no network services, no type-safe communications, no analysis of other languages, no interfacing with industrial platforms, no XML, non modern two- or three-dimensional drawing engine, etc. This would not be too much of a problem if OCaml provided an easy way of installing new libraries and of using these libraries once they are installed. This would also not be too much of a problem if OCaml could somehow guarantee that trivial data structures didn’t need to be reinvented by most projects and that communication between libraries happened lawlessly.

So no, out of the box, OCaml isn’t a usable platform. However, no matter what you’re trying to do, chances are that the community has already developed or adapted a tool to make your life easier. Easy installation of OCaml packages is quite possible, if you are using Debian or Ubuntu (apt-get), Fedora or Red Hat (yum) or actually, any Unix-compatible platform, including MacOS X and Windows (GODI). Simple and reliable usage of installed libraries can be done with Findlib. A comprehensive Unicode library is available (Camomile), as well as a modern user interface toolkit or two (LablGtk, OCamlRT), a number of distributed programming infrastructures (OCamlMPI, BSML, Opis, Camlp3l, OCamlNAE), libraries to interface with industrial platforms
(OCamlJava, SpiderCaml …), to analyse other languages (CIL, Dalton/FlowCaml, …), to read or write XML (PXP, Expat, …), to draw in two or three dimensions (Cairo, OpenGL), to test your programs (OUnit), etc. OCaml even offers a built-in tool to customize the language itself (Camlp4).

What’s missing? A few things. For now, not all important libraries are available as simple-to-install packages. That problem is being addressed by the devoted packagers of Debian, Fedora and GODI, while the possibility exists that the recently announced Symbiosis will also help address the issue. The other missing part is a standard set of libraries, language extensions and data structures which developers could be assured to find on every OCaml installation, and which would let them write programs without having to endlessly reinvent the same basic wheels, and without spending their time writing adapters between libraries which should work well together but don’t use the same conventions. OCaml Batteries Included is one possible answer to this problem.

OCaml Batteries Included consists in

  • A core set of libraries, designed to define the basic standard data structures. This set, largely based on both the Base library of OCaml and ExtLib, extends the basic strings, arrays, lists… provided with OCaml and introduces numerous data structures, including enumerations, lazy lists, extendable inputs
    and outputs, dynamic arrays, unicode ropes…
  • A uniformization layer, the glue, on top of chosen existing libraries. Note that we are not forking any of these libraries, only providing an additional layer on top of them.  The libraries may be installed manually by the user or, preferably, by automatic dependency resolution thanks to apt-get, yum, GODI or some other packaging tool. The uniformization layer serves to guarantee that libraries play together nicely, that only one manner of reading from a file or writing to the output is necessary, that every data provided by a library may be decoded by another, etc.
  • Additional documentation on whichever library for which we provide glue, including both low-level documentation (“what the heck does this function do?”), mid-level documentation (“why should I use that function?”) and high-level documentation (“ok,I’m new here, where do I start?”)
  • A handful of language extensions, provided as Camlp4 modules, to solve common issues, automatically generate boilerplate code, improve readability, …
  • In the future, possibly a set of external tools.
  • Build tools to make all of this transparent.
  • And, of course, a logo.
OCaml Batteries Included

(Batteries courtesy of Benjamin Pavie, Camel courtesy of John Olsen, no animals injured)

OCaml Batteries Included is a project maintained by the community, which means that it depends on you. If you have ideas, suggestions, complaints, bug reports, if you want to participate, to write code, documentation, tutorials, build tools, review code, have a word in policies, if you want a package to be included in the Batteries, or just to contact us, please visit our website and take advantage of our bug trackers, request for features trackers, forums, etc. (courtesy of OCaml Core). We can also often be seen on irc, on server freenode, channel #ocaml and, of course on the OCaml Mailing-List.

OCaml Batteries Included is also a work-in-progress. The version you have in front of your eyes does not contain everything we want to put in it, nor even all the code we have written for it. But we’ll get there. We intend to integrate additional libraries progressively, from minor release to minor release, with major milestones approximately twice per year. This policy is not written in stone and is largely subject to debate, so don’t hesitate to comment on the subject.

For more informations on the contents of OCaml Batteries Included, follow us towards the manual.

Relations to other libraries

Project Gallium’s Base Library

First, a word on vocabulary. We call “Base library” the library provided by INRIA with the default distribution of OCaml. We don’t call it “standard library” for the simple reason that there are several libraries vying for the status of standard, including Batteries Included.

The relation between Batteries Included and the Base Library is simple: the Base Library is one of the libraries for which Batteries Included provides a uniformization layer. We are not forking the library, merely providing additional functions, additional documentation, boilerplate code…

The complete Base Library is available in Batteries Included as module Legacy. Most modules are also available inside the Batteries module hierarchy, sometimes under different names, and usually completed by numerous new functions. A few modules are considered obsolete and appear only in Legacy.

Jane Street’s Core

Jane Street’s Core is another library vying for the status of standard, this time produced by Jane Street Capital. This library is comparable in purpose and design to the core of Batteries Included (actually, we draw some inspiration from them and we hope that they are going to draw some inspiration from us, too) but there is no code shared between Batteries Included and Jane Street’s Core for the moment.

In the future, Batteries Included may depend on Jane Street’s Core. This is not the case yet as, according to one of Jane Street’s Core’s authors, this library may change quite a lot, and in what we understand as possibly incompatible ways, before reaching version 1.0.

We already depend on two components of Jane Street’s Core: Type-Conv (a general infrastructure which may be used to generate boilerplate code) and Sexplib (an instance of Type-Conv used to generate code for serializing to/deserializing from human-readable S-Expressions). We intend to extend this to a third component of Jane Street’s Core: Bin-Prot, another instance of Type-Conv, used this time to generate serialization to/from binary protocols.

ExtLib

ExtLib is another extension of the OCaml Base Library. ExtLib was designed as a relatively small addition, with the idea of fitting nicely within the OCaml Base Library. Since 2005, ExtLib has essentially stopped growing, following a conscious choice from the maintainers.

OCaml Batteries Included is largely based on ExtLib. Indeed, the core of Batteries is essentially a fork of ExtLib. We intend to maintain  ascending compatibility with ExtLib (with the exception of changes in module names), although we have added a great number of features,  fixed bugs, split code, etc. For most purposes, OCaml Batteries Included contains ExtLib and, in our mind, superseeds this library. We hope that the maintainers and developers of ExtLib will consider migrating to OCaml Batteries Included and helping us with their skills.

Community OCaml

Community OCaml is an attempt to turn OCaml into an out-of-the-box complete development platform, including numerous libraries.  We have joined forces with the developers of Community OCaml and we share most of our code. Although the design of Community OCaml is not completely decided yet, it is quite possible that this will end up as distribution consisting in OCaml + Batteries Included + some package manager.

OCamlNet

OCamlNet is an implementation of numerous Internet protocols for OCaml, along with so many utilities that it almost constitues a complete general-purpose development library for OCaml.

OCaml Batteries Included will depend on OCamlNet and incorporate OCamlNet into its hierarchy. In particular, for the moment, OCaml Batteries Included provides little in the way of libraries for interacting with the operating system, as we intend to use OCamlNet for that purpose.

Caml Development Kit

The Caml Development Kit, or CDK, was another attempt to build a development platform around OCaml by bundling together OCaml itself and a number of interesting libraries and tools. By opposition to Batteries Included, this was a monolithic distribution, with a custom compiler and very little documentation. Development on CDK seems to have died around 2001.

There is no relation between CDK and Batteries Included other than the common goal. Any feedback from former CDK users or developers will be welcome, though.

GODI, Apt, Yum

GODI, Apt and Yum are the three main package managers available for OCaml users. The first one is OCaml-specific, the second one lives in Debian/Ubuntu/Knoppix world and the third one in the Red Hat/Fedora world.

OCaml Batteries Included has no direct relation with either package manager, nor does it replicate the work of any of these managers. However, as Batteries relies on numerous external libraries, use of a package manager is strongly recommended.

Findlib

Findlib is a compile-time package manager for OCaml. It lets developers easily specify which libraries are required and manages compile-time dependencies between installed libraries.

OCaml Batteries Included uses Findlib extensively.

Comparable projects, somewhere else

Python Batteries Included

To the best of our knowledge, the term “Batteries Included” was first coined by the Python community to describe the standard distribution of Python, which in addition to usual data structures, contains modules for handling databases, compression, a number of file format (de)coders including JSON, (X)HTML, XML, CSV, cryptography, logging, text interfaces, threading, inter-process communication, implementation of client and server protocols from SMTP to HTTP, communication with external webbrowsers, image and sound manipulation, internationalization, lexing and parsing, graphical interfaces, unit testing, sandboxing, reflexivity, OS-specific services, …

This massive number of out-of-the-box features, along with the conciseness of the language, are probably the two main reasons of the success of Python: simple tasks, even those which require complex libraries, may often be programmed with only a few lines and in a few minutes.

Well, obviously, we’re trying to provide something as useful with OCaml Batteries Included, but with the added safety and speed of OCaml.

Haskell Batteries Included

The Haskell Platform (although known as “Haskell Batteries Included”) is a recent project undertaken by the Haskell community with objectives comparable to OCaml Batteries Included: “provides a comprehensive,
stable and quality tested base for Haskell projects to work from.” At the time of this writing, the Haskell Platform has not released any software, although a first release is expected within a few weeks.

As OCaml Batteries, the Haskell Platform is community-led and relies on a number of decentralized libraries. As OCaml Batteries, the Haskell Platform requires package management.There are a number of differences in the methods, though.

Some differences are probably trivial: where we strive to work with any of the major package management systems, the Haskell Platform is based solely on Cabal, which may allow them a better integration. Where OCaml Batteries attempts to reclassify existing libraries into one uniform hierarchy of modules, the Haskell Platform keeps the original module names, as decided by their original authors, pre-Platform.

Others are quite far-reaching: OCaml Batteries provides an extended core of libraries to serve as support for standardization and uniformization, while the Haskell Platform doesn’t. Similarly, the Haskell Platform doesn’t add any uniformization layer, which means that the task of getting libraries to work together lies upon the end-user. Some Haskell libraries may be patched into compliance and uniformization, but this is not always possible, short of creating cyclic dependencies between libraries which should work together but don’t. Finally, the Haskell Platform has much stricter guidelines regarding libraries which can or can’t make it into the Platform. This is a good thing if library authors are willing to fix whichever problems prevent the inclusion of their work — something which we don’t assume for OCaml Batteries, at least not yet.

Despite these differences, both projects seem based on solid foundations. And hopefully, both will achieve large success.

Where Am I?

You are currently browsing entries tagged with python at Il y a du thé renversé au bord de la table.