OCaml Batteries Included Beta 1

April 6, 2009 § 9 Comments

Note: This post is written on the 64th day of university strikes in France. During these 64 days, the government has rejected any negotiation on the core reasons for the strike, has attempted to discredit the protest (if I understand correctly, I am both « unproductive » and a « mask-wearing commando ») and has used police intimidation and repression. The strike continues. Quite possibly, there will be no university exams this year and no baccalauréat. If repression keeps increasing, no one can tell for sure what will happen. Nothing good, for sure.

After days and nights of coding, debugging and fighting over naming conventions, the OCaml Batteries Included team is proud to announce OCaml Batteries Included Beta 1. You can find the binaries here and read the API documentation, the platform documentation, the release notes and the ChangeLog or the list of individual commits. A GODI package and a Debian/testing package are also available.

« Read the rest of this entry »

The Danger of Being Enlightened

February 1, 2009 § 2 Comments

It is official: starting February 2, French universities are on strike against the reforms. The strike will not take the same form everywhere: administrative strikes, withholding of grades, publication strikes, or teaching strikes.

Rather than analyze yet again the reforms, or the speeches and interviews of Valérie Pécresse and Nicolas Sarkozy, let me summarize one of the many worrying aspects of the reforms, one that will remind you of other reforms in broadcasting and in the justice system.

In this new university, the political machine controls:

  • directly, the funding of each research topic (LRU law, funding agencies)
  • directly, the career of each teacher-researcher (reform of the teacher-researcher status / erosion of the job security attached to that status)
  • indirectly, the salary of each teacher-researcher (reform of the teacher-researcher status)
  • indirectly, the number of teaching hours of each teacher-researcher, and therefore directly their ability to do research and/or to teach (reform of the teacher-researcher status).

In other words, the government is equipping itself with the tools needed to choose, at any time, what must be studied and what must be forgotten. This degree of control, which is simply absurd in the hard sciences, becomes worrying when it comes to history, literature, sociology or, more generally, the humanities. A troublemaking teacher-researcher working on sensitive topics could thus see their career struck down at once by a phone call to their supervising university.

Let me stress this last point:

It is becoming dangerous to do research or make discoveries that upset people.

I leave it to you to imagine scenarios in every field. Personally, I can already see some in history, literature, sociology, ethnology, computer security and, more generally, any field perceived as capable of undermining the government’s official line, the secrets of its friends, or those of any large French industrial group.

Let’s do it with Batteries

January 27, 2009 § 6 Comments

Or, OCaml is a scripting language, too.

Note: These extracts use the latest version of Batteries, currently available from the git repository. Barring any accident, this version should be made public within the next few days.

A few days ago, when writing some code for OCaml Batteries Included, I realized that, to properly embed Camomile’s Unicode transcoding module, I would need to manually write 500+ boring lines, all of them looking like:

| `ascii -> Encoding.of_name "ASCII"

The idea behind that pattern matching was to define a type-safe phantom type for text encodings. Upon installation, Camomile generates a directory containing about 540 files, one per text encoding, and it seemed like a good idea to rely upon something less fragile than a string name.
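To make the goal concrete, here is a minimal sketch of the kind of code these generated lines belong to. The type name `encoding`, the function name `name_of_encoding` and the three encodings listed are chosen for illustration only. The real code would pass the string to Camomile’s Encoding.of_name rather than return it, which is left out here so that the sketch runs without Camomile.

```ocaml
(* A closed polymorphic variant type: one constructor per known encoding.
   Only three encodings are listed here, for illustration. *)
type encoding = [ `ascii | `utf_8 | `iso_8859_1 ]

(* Map each constructor to the name of the corresponding encoding.
   A misspelled constructor is rejected at compile time, unlike a
   misspelled string. *)
let name_of_encoding : encoding -> string = function
  | `ascii -> "ASCII"
  | `utf_8 -> "UTF-8"
  | `iso_8859_1 -> "ISO-8859-1"

let () = print_endline (name_of_encoding `utf_8)
```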

Of course, writing this pattern-matching manually was out of the question: it was boring, error-prone, and while Batteries deserves sacrifices, it doesn’t quite deserve that level of mind-numbing activity. The alternative was to generate both the list of constructors and the pattern-matching code from the contents of the directory. I could have done it with some scripting language but that sounded like a good chance to test-drive the numerous new functions of the String module of Batteries (73 functions, versus 28 in the Base Library).

The main program

The structure of the program is easy: read the contents of a directory. For each file, do some treatment on the file name and print the result:

open Shell
foreach (files_of argv.(1)) do_something

Here, foreach is the same function as iter but with its arguments reversed. It’s sometimes much more readable. Instead of reading the contents of a directory with Shell.files_of, we could just as well have traversed the command-line arguments with args, or read the lines of standard input using IO.lines_of stdin.
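For readers without Batteries at hand, the relationship between foreach and iter can be sketched on plain lists. This is only an approximation: the `foreach` and `files_of` below are hand-written stand-ins working on lists (Batteries works on enumerations), and `files_of` is assumed to return bare entry names, sorted here for reproducibility.

```ocaml
(* foreach is iter with its arguments flipped: the collection comes first. *)
let foreach xs f = List.iter f xs

(* Stand-in for Shell.files_of: the entries of a directory, as a sorted list. *)
let files_of dir =
  let entries = Array.to_list (Sys.readdir dir) in
  List.sort compare entries

(* Print the entries of the current directory, one per line. *)
let () = foreach (files_of Filename.current_dir_name) print_endline
```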

Actually, we could just as well generalize to a (possibly empty) set of directories. For this purpose, we just need to map our function files_of to the enumeration of command-line arguments. This yields an enumeration of enumerations, which we turn into a flat enumeration with flatten. In my mind, that’s somewhat nicer and more readable than nested loops.

Our main program now looks like:

open Shell, Enum
foreach (flatten (map files_of (args ()))) do_something

Or, for those of us who prefer operators to parentheses:

open Shell, Enum
(foreach **> flatten **> map files_of **> args ()) do_something
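The operator version relies on `**>` being a right-associative application operator. Here is a small sketch of how such an operator can be defined and used in plain OCaml, on lists instead of enumerations. The definition of `**>` below is my own (I assume it matches what Batteries provides), and `files_of` is a hypothetical stand-in that invents two file names per directory so that the example runs anywhere.

```ocaml
(* Right-associative application: f **> x is f x.
   In OCaml, operators starting with ** associate to the right. *)
let ( **> ) f x = f x

let foreach xs f = List.iter f xs

(* Hypothetical stand-in for files_of: pretend each directory holds two files. *)
let files_of dir = [ dir ^ "/a.txt"; dir ^ "/b.txt" ]

let dirs = [ "d1"; "d2" ]

(* Same pipeline as above, on lists: map, then flatten, then iterate. *)
let () =
  (foreach **> List.flatten **> List.map files_of **> dirs) print_endline
```

Because function application binds tighter than `**>`, `List.map files_of **> dirs` reads as `(List.map files_of) dirs`, and the whole chain nests to the right without any extra parentheses.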

String manipulation

It’s now time to take a file name and turn it into

  1. a nice constructor name
  2. a file name without extension.

That second point is the easiest, so let’s start with it. We have a function Filename.chop_extension just for this purpose. So, if we were interested only in printing the list of files without their extension, we could define

let do_something x = print_endline (Filename.chop_extension x)
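One detail worth keeping in mind: in the standard library, Filename.chop_extension raises Invalid_argument when the name has no extension. The sketch below wraps it in a helper, `safe_chop`, which is my own name and not part of any library. The file name "ISO-8859-1.mar" is only an example input. Recent versions of OCaml (4.04 and later) also provide Filename.remove_extension, which returns the name unchanged in that case.

```ocaml
(* Filename.chop_extension raises Invalid_argument when there is no
   extension; fall back to the unchanged name in that case. *)
let safe_chop name =
  try Filename.chop_extension name with Invalid_argument _ -> name

let () =
  List.iter
    (fun n -> print_endline (safe_chop n))
    [ "ISO-8859-1.mar"; "README" ]
```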

The first point is slightly trickier, as we need to

  1. remove the extension from the file name (done)
  2. prepend character ` (trivial)
  3. replace any illicit character with _ (slightly more annoying: I know that the illicit characters which actually appear in my list of files include :, -, (, ) and whitespace, but I’d rather not check manually which other characters may turn out to be problematic)
  4. prepend something before names which start with a digit, as digits cannot appear as the first character of an OCaml constructor (a tad annoying, too)
  5. make everything lowercase, just because it’s nicer (trivial).

Let’s deal with the third item first, as it’s bound to be central. Replacing characters could be done with regular expressions, something I dislike, or with the function String.map. The latter is nicer, type-safer, and it has a counterpart Rope.map for Unicode, if we ever need one. Now, functions Char.is_letter and Char.is_digit will help us determine which characters are safe. Using them together, we obtain the following function:

open Char
let replace s = String.map (fun c -> if is_letter c || is_digit c then c else '_') s
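The same function can be tried without Batteries. In this sketch, `is_letter` and `is_digit` are hand-written, ASCII-only stand-ins for the Batteries functions (I am assuming that the Batteries versions behave the same way on ASCII input), and the input string is made up to exercise every illicit character mentioned above.

```ocaml
(* ASCII-only stand-ins for Batteries' Char.is_letter and Char.is_digit. *)
let is_letter c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
let is_digit c = c >= '0' && c <= '9'

(* Keep letters and digits, turn everything else into an underscore. *)
let replace s =
  String.map (fun c -> if is_letter c || is_digit c then c else '_') s

let () = print_endline (replace "ISO-8859-1:1987 (Latin 1)")
(* prints ISO_8859_1_1987__Latin_1_ *)
```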

Let’s solve the fourth item on our list. We need to check the first character of a string and to determine whether it’s a digit. Well, we already know how to do this. Let’s call our prefix p:

let clean_digit p s = if is_digit s.[0] then p^s else s
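As written, clean_digit raises Invalid_argument on the empty string, since s.[0] does not exist. That cannot happen with real file names, but a slightly more defensive variant is easy to sketch. The emptiness test is my addition, and `is_digit` is again a hand-written stand-in for the Batteries function.

```ocaml
let is_digit c = c >= '0' && c <= '9'

(* Same as clean_digit, except that the empty string is returned
   unchanged instead of raising on s.[0]. *)
let clean_digit p s =
  if s <> "" && is_digit s.[0] then p ^ s else s

let () = print_endline (clean_digit "codemap_" "8859_1")
```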

If we chain up everything, we obtain

let constructor p s = "`" ^ (if is_digit r.[0] then p^r else r)
    where         r = lowercase (String.map (fun c -> if is_letter c || is_digit c then c else '_') s)

I like this where syntax.
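The where syntax comes with the Batteries syntax extension. For comparison, here is a sketch of the same function in plain OCaml, with let ... in doing the job of where. It assumes hand-written `is_letter` and `is_digit`, and uses String.lowercase_ascii from the current standard library in place of lowercase.

```ocaml
let is_letter c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
let is_digit c = c >= '0' && c <= '9'

(* Clean the name, lowercase it, prefix it with p if it starts with a
   digit, then prepend the backquote of polymorphic variants. *)
let constructor p s =
  let r =
    String.lowercase_ascii
      (String.map (fun c -> if is_letter c || is_digit c then c else '_') s)
  in
  "`" ^ (if is_digit r.[0] then p ^ r else r)

let () = print_endline (constructor "codemap_" "ISO-8859-1")
```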


Now that we have both our strings, we just need to be able to combine and print them. For this purpose, Printf is probably the most concise tool. Here, we can just write

let print s1 s2 = Printf.printf " | %s -> %S\n" s1 s2

We could parameterize upon the format used by printf and we’re bound to do this sooner or later, but let’s keep it simple for now.
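For the curious, here is one way the parameterization could look. This is only a sketch: the names `render`, `plain` and `with_call` are mine, and I use Printf.sprintf instead of Printf.printf so that the result can be inspected as a string. The second format shows how the generated line could include the call to Encoding.of_name seen at the start of this post.

```ocaml
(* The format becomes an argument. The type annotation tells the compiler
   to read the string literals below as formats taking two strings. *)
let render (fmt : (string -> string -> string, unit, string) format) s1 s2 =
  Printf.sprintf fmt s1 s2

(* The format used in this post. *)
let plain = render " | %s -> %S\n"

(* A variant that wraps the name in a call to Encoding.of_name. *)
let with_call = render " | %s -> Encoding.of_name %S\n"

let () = print_string (with_call "`ascii" "ASCII")
```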

The complete program

open Shell, Enum

foreach (flatten **> map files_of **> args ()) do_something
  where do_something s =
   let name = Filename.chop_extension s in Printf.printf " | %s -> %S\n" c name
     where c = "`" ^ (if Char.is_digit r.[0] then "codemap_"^r else r)
     where r = lowercase (String.map (fun c -> if Char.is_letter c || Char.is_digit c then c else '_') name)
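If you want to try this without Batteries, the following is a rough equivalent using only the standard library of a recent OCaml (4.10 or later, for List.concat_map). It is a sketch and differs from the program above in a few labeled ways: `is_letter` and `is_digit` are hand-written, Filename.remove_extension replaces Filename.chop_extension so that names without an extension do not raise, entries are sorted, and command-line arguments that are not directories are skipped. The helper name `line_of_file` is mine.

```ocaml
let is_letter c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
let is_digit c = c >= '0' && c <= '9'

(* Build one line of the pattern matching from a file name. *)
let line_of_file file =
  let name = Filename.remove_extension file in
  let r =
    String.lowercase_ascii
      (String.map (fun c -> if is_letter c || is_digit c then c else '_') name)
  in
  let c = "`" ^ (if r <> "" && is_digit r.[0] then "codemap_" ^ r else r) in
  Printf.sprintf " | %s -> %S" c name

(* Print one line per entry of each directory named on the command line. *)
let () =
  let dirs = match Array.to_list Sys.argv with [] -> [] | _ :: rest -> rest in
  dirs
  |> List.filter (fun d -> Sys.file_exists d && Sys.is_directory d)
  |> List.concat_map (fun d -> List.sort compare (Array.to_list (Sys.readdir d)))
  |> List.iter (fun f -> print_endline (line_of_file f))
```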

I don’t know about you but I find this pretty nice, for a type-safe language. I’m sure it would have been possible to make something shorter in Perl or awk, and suggestions on how to improve this are welcome, but I’m rather happy. And, once again, we’re not trying to beat Python, Perl or awk in concision, just to do something comparably good, because we already beat them by far in speed and safety.

So, what do you think?

OCaml Batteries Included: Alpha 2 has landed

November 10, 2008 § 10 Comments

Note: There seems to have been a WordPress bug. For some reason, the extended release notes on OCaml Batteries Included were replaced by something quite unrelated. My apologies for this.

Dear programmers, I am happy to inform you that the second alpha release of OCaml Batteries Included has landed. You may now download it from the Forge. A GODI package is also available, and a Debian package should follow soon (you should be able to find the old one here). You can also read the documentation on-line.

So, what’s new in this release?

« Read the rest of this entry »

A quick update on Batteries

September 27, 2008 § Leave a comment

Just a quick word for people who may be curious about the development of OCaml Batteries Included. Work is proceeding nicely and we’re getting close to a first official release. We’ve moved things around quite a lot recently, worked on the documentation and added a few nice features (read-only strings and arrays, uniform numeric modules with type-class-style dictionaries). We’re about to add Unicode support for inputs and outputs (based on Camomile) and an improved Scanf module and that should be it for a first release.

A new preview tarball has just been uploaded on the Forge, as well as a new preview documentation.

As a side-note, the Haskell community seems to be engaged in much the same process as Batteries Included, with the Haskell Platform, aka Haskell Batteries Included. Both their schedule and their list of packages seem a little more precise than ours but the overall objective remains the same: take a great programming language used mostly by academics and turn it into a complete development platform able to compete with the best the industrial world is able to offer. The main difference, it seems, is that the Haskell Platform doesn’t have a glue layer designed to make APIs uniform. The other main difference, I’m afraid, is that the Haskell community seems much larger these days than the OCaml community, or perhaps just more active or more vocal. It is my hope that a larger and more convenient standard library will help draw (back?) both academics and developers to the OCaml world. A little more academic support wouldn’t hurt, of course.

Back to OCaml Batteries Included, I hope we’ll be able to release by October 10th. At that point, we’ll need beta-testing and it will be time to decide what should get into Batteries Included 0.2. I’m sure everyone has ideas and suggestions, and it will soon be time to share them.

Internship in Virtual Machine Design

August 31, 2008 § Leave a comment

Start-up MLState and team SDS (Security of Distributed Systems, part of Laboratoire d’Informatique Fondamentale d’Orléans) offer a research or engineering internship in the domain of Programming Language Design, under the supervision of David Teller (SDS) and Henri Binsztok (MLState).

« Read the rest of this entry »

PhD position in either Applied Security or Foundations of Security

August 16, 2008 § Leave a comment

A PhD position in Security is to be filled in CEA (Commissariat à l’Énergie Atomique) and team SDS (Security of Distributed Systems, part of Laboratoire d’Informatique Fondamentale d’Orléans), in France, on the topic of Mandatory Access Control for Distributed Systems, under the administrative supervision of Mathieu Blanc (CEA) and Christian Toinard (SDS).

The studentship is for three years (renewable) and includes a salary rising from €1990.25/month (during years 1 and 2) to €2049.75/month (during year 3). The earliest start date is October 1st 2008.

Profile and skills

The ideal candidate should have an excellent undergraduate degree/Master 2 in Computer Science and an interest in either System Security or Formal Methods. Candidates should have a background in one or more of the following areas:

  • system security
  • operating systems
  • distributed systems
  • system programming
  • clusters
  • static analysis
  • graph theory
  • theory of concurrency
  • logics
  • denotational semantics
  • operational semantics
  • foundations of trust.

Candidates should be eligible to work in France and should expect to work on-campus in ENSIB (École Nationale Supérieure d’Ingénieurs de Bourges). They will work on the theory and/or implementation of effective and manageable enforcement mechanisms for security policies in distributed systems such as clusters and grids. The main objective of this PhD is to build upon existing local enforcement mechanisms for security policies to design (and, if possible, implement) techniques which may be applied to large distributed systems, as used for data analysis or numerical analysis.

Application procedure

To apply, please send your resume and a cover letter, either by e-mail or by paper mail, to both Christian Toinard and Mathieu Blanc. If possible, attach a sample of your academic work. The application process entails a background check by the French Department of Defense.
