11.28.09
We call it OPA
Web applications are nice. They’re useful, they’re cross-platform, and users need no installation, no upgrades, no maintenance, not even the computing or storage power they are used to. As weird as it may sound, I’ve even seen announcements for web applications that are supposed to run your games on distant high-end computers so that you can actually play on low-end computers. Go web application!
Of course, there are a few downsides to web applications. Firstly, they require a web connection. Secondly, they are largely composed of plumbing. Finally, ensuring their security is a constant fight.
04.22.09
Last lecture
Note: This post is written on the 79th day of the university strike. Despite the overwhelming consensus against these bills, the government has just passed the application decrees that make it possible to arbitrarily increase the teaching load of researchers, without any need for justification. The government obviously fails to see how much this will hurt Research. Simultaneously, the government has announced that, since the reform of primary and secondary schools cannot proceed in compliance with the government’s own decrees, it will simply proceed illegally. Once the shock is gone, expect increased strike actions. Expect Research strikes on publications, on patents, on contracts with the government or French companies. Expect difficulties with baccalauréat, exams and degrees.
Where headhunters had failed, the government has finally succeeded. Today was my last lecture.
As the government obviously doesn’t want Researchers to have the means and time to undertake Research, I have accepted a position in the private sector, where I should be able to pursue my work on semantics, security and functional concurrent/distributed programming languages.
While I’m glad to start in a position where I will have more leeway, as well as both students and engineers to work with me, I am saddened that the situation had to reach the point where I felt I had no choice.
Barring any accident, starting September 1st, you will be able to find me at MLState.
04.06.09
OCaml Batteries Included Beta 1
Note: This post is written on the 64th day of University strikes in France. During these 64 days, the government has rejected any negotiation on the core reasons for the strike, has attempted to discredit the protest (if I understand correctly, I am both « unproductive » and a « mask-wearing commando ») and has used police intimidation and repression. The strike continues. Quite possibly, there will be no university exams this year and no baccalauréat. If repression keeps increasing, no one can tell for sure what will happen. Nothing good, for sure.
After days and nights of coding, debugging and fighting over naming conventions, the OCaml Batteries Included team is proud to announce OCaml Batteries Included Beta 1. You can find the binaries here, read the API documentation, the platform documentation, the release notes and the ChangeLog or the list of individual commits. A GODI package and a Debian/testing package are also available.
03.25.09
On-line interpreter for Batteries
Note: This entry is written on the 51st day of the nationwide strike in French Universities and Research laboratories. Still no sign of negotiation from the government. Liquidation of the system continues.
I am happy to announce that the repository version of OCaml Batteries Included now has a full-featured (and working) interpreter.
02.06.09
OCaml Batteries Included, alpha 3
Dear programmers.
I am happy to inform you that, despite the in-progress liquidation of French Universities, OCaml Batteries Included Alpha 3 has landed. Barring any accident, this should be the final Alpha version, with a mostly stable API and module structure. You may now download it from the Forge. A GODI package is also available and a Debian package should follow soon. You may also read the documentation on-line.
So, what’s new with Alpha 3? Plenty of things, as you’ll see.
01.27.09
Let’s do it with Batteries
Or, OCaml is a scripting language, too.
Note: These extracts use the latest version of Batteries, currently available from the git repository. Barring any accident, this version should be made public within the next few days.
A few days ago, when writing some code for OCaml Batteries Included, I realized that, to properly embed Camomile’s Unicode transcoding module, I would need to manually write 500+ boring lines, all of them looking like:
| `ascii -> Encoding.of_name "ASCII"
The idea behind that pattern matching was to define a type-safe phantom type for text encodings. Upon installation, Camomile generates a directory containing about 540 files, one per text encoding, and it seemed like a good idea to rely upon something less fragile than a string name.
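To make the target concrete, here is a minimal sketch of the kind of code the generator is meant to produce, written in plain OCaml. The three tags and the function name name_of_encoding are assumptions chosen for illustration; the real list has about 540 entries, and the real right-hand sides call Encoding.of_name rather than returning the name itself.

```ocaml
(* Hypothetical three-encoding subset; the real list has about 540 entries. *)
type encoding = [ `ascii | `iso_8859_1 | `utf_8 ]

(* Stand-in for the generated match: maps each tag to an encoding name.
   The post's version wraps each name in Encoding.of_name instead. *)
let name_of_encoding : encoding -> string = function
  | `ascii -> "ASCII"
  | `iso_8859_1 -> "ISO-8859-1"
  | `utf_8 -> "UTF-8"

let () = print_endline (name_of_encoding `utf_8)
```

With the closed type annotation, a misspelled tag is rejected at compile time, which is what makes the tags less fragile than a string name.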
Of course, writing this pattern-matching manually was out of the question: it was boring, error-prone, and while Batteries deserves sacrifices, it doesn’t quite deserve that level of mind-numbing activity. The alternative was to generate both the list of constructors and the pattern-matching code from the contents of the directory. I could have done it with some scripting language but that sounded like a good chance to test-drive the numerous new functions of the String module of Batteries (73, versus 28 in the Base Library).
The main program
The structure of the program is easy: read the contents of a directory. For each file, do some treatment on the file name and print the result:
open Shell
foreach (files_of argv.(1)) do_something
Here, foreach is the same function as iter but with its arguments reversed. It’s sometimes much more readable. Instead of reading the contents of a directory with Shell.files_of, we could just as well have traversed the command-line arguments with args, or read the lines of standard input using IO.lines_of stdin.
Actually, we could just as well generalize to a (possibly empty) set of directories. For this purpose, we just need to map our function files_of to the enumeration of command-line arguments. This yields an enumeration of enumerations, which we turn into a flat enumeration with flatten. In my mind, that’s somewhat nicer and more readable than nested loops.
Our main program now looks like:
open Shell, Enum
foreach (flatten (map files_of (args ()))) do_something
Or, for those of us who prefer operators to parentheses:
open Shell, Enum
(foreach **> flatten **> map files_of **> args ()) do_something
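For readers without Batteries at hand, the same map-then-flatten idea can be sketched with the standard library alone. This is an approximation, not the Batteries code: it uses eager lists instead of enumerations, and the names files_of and all_files are defined here for illustration on top of Sys.readdir.

```ocaml
(* List the entries of one directory (names only, no path prefix). *)
let files_of dir = Array.to_list (Sys.readdir dir)

(* Map files_of over every directory, then flatten the list of lists. *)
let all_files dirs = List.concat (List.map files_of dirs)

let () =
  (* Sys.argv.(0) is the program name, so skip it. *)
  let dirs = List.tl (Array.to_list Sys.argv) in
  List.iter print_endline (all_files dirs)
```

Called with no directory at all, the program prints nothing, which is the "possibly empty" case mentioned above.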
String manipulation
It’s now time to take a file name and turn it into:
- a nice constructor name,
- a file name without extension.
That second point is the easiest, so let’s start with it. We have a function Filename.chop_extension just for this purpose. So, if we were interested only in printing the list of files without their extension, we could define
let do_something x = print_endline (Filename.chop_extension x)
The first point is slightly trickier, as we need to
- remove the extension from the file name (done)
- prepend character ` (trivial)
- replace any illicit character with _ (slightly more annoying: I know that the list of illicit characters which may actually appear in my list of files contains :, -, (, ) and whitespace, but I’d rather not go and check manually which other characters may turn out problematic)
- prepend something before names which start with a digit, as digits cannot appear as the first character of an OCaml constructor (a tad annoying, too)
- make everything lowercase, just because it’s nicer (trivial).
Let’s deal with the third item, it’s bound to be central. Let’s see, replacing characters could be done with regular expressions, something I dislike, or with function String.map. It’s nicer, type-safer, and it has a counterpart Rope.map for Unicode, if we ever need one. Now, functions Char.is_letter and Char.is_digit will help us determine which names are safe. Using them together, we obtain the following function:
open Char
let replace s = String.map (fun c -> if is_letter c || is_digit c then c else '_') s
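The same function can be tried out with nothing but the standard library. In this sketch, is_letter and is_digit are hand-written ASCII-only stand-ins, which is an assumption: the Batteries versions may accept a wider range of characters.

```ocaml
(* Hand-written ASCII tests; assumed stand-ins for the Batteries
   functions Char.is_letter and Char.is_digit. *)
let is_letter c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
let is_digit c = c >= '0' && c <= '9'

(* Keep letters and digits, turn every other character into '_'. *)
let replace s =
  String.map (fun c -> if is_letter c || is_digit c then c else '_') s

let () = print_endline (replace "ISO-8859-1")
```

Because the test is a whitelist rather than a list of known bad characters, there is no need to check which other characters may show up in the file names.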
Let’s solve the fourth item on our list. We need to check the first character of a string and to determine whether it’s a digit. Well, we already know how to do this. Let’s call our prefix p:
let clean_digit p s = if is_digit s.[0] then p^s else s
If we chain up everything, we obtain
let constructor p s = "`" ^ (if is_digit r.[0] then p^r else r)
where r = lowercase (String.map (fun c -> if is_letter c || is_digit c then c else '_') s)
I like this where syntax.
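The where keyword comes from the Batteries syntax extension. For comparison, here is a sketch of the same chain in plain OCaml with let ... in, again with ASCII-only stand-ins for is_letter and is_digit. The guard against the empty string is an addition of this sketch: indexing r.[0] on an empty name would raise Invalid_argument.

```ocaml
(* ASCII-only stand-ins for the Batteries character tests (assumption). *)
let is_letter c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
let is_digit c = c >= '0' && c <= '9'

(* Build a polymorphic-variant tag from a file name; p is the prefix
   used when the cleaned name starts with a digit. *)
let constructor p s =
  let r =
    String.lowercase_ascii
      (String.map (fun c -> if is_letter c || is_digit c then c else '_') s)
  in
  (* Guard the empty string: r.[0] would raise Invalid_argument on it. *)
  let body = if r <> "" && is_digit r.[0] then p ^ r else r in
  "`" ^ body

let () =
  print_endline (constructor "codemap_" "ISO-8859-1");
  print_endline (constructor "codemap_" "8859-1")
```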
Format
Now that we have both our strings, we just need to be able to combine and print them. For this purpose, Printf is probably the most concise tool. Here, we can just write
let print s1 s2 = Printf.printf " | %s -> %S\n" s1 s2
We could parameterize upon the format used by printf and we’re bound to do this sooner or later, but let’s keep it simple for now.
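As a rough idea of what that parameterization could look like, here is a sketch in which the format is passed as an argument. The name render_case and the second format are assumptions made up for this example, and the function returns a string (through Printf.sprintf) rather than printing directly, which makes it easier to test.

```ocaml
(* Render one line with a caller-supplied format. The format must take
   two arguments, here the constructor and the encoding name. *)
let render_case fmt constructor name = Printf.sprintf fmt constructor name

let () =
  (* The format used in this post: one case of a pattern matching. *)
  print_string (render_case " | %s -> %S\n" "`ascii" "ASCII");
  (* A hypothetical alternative: one entry of an association list. *)
  print_string (render_case "  (%s, %S);\n" "`ascii" "ASCII")
```

The type checker still verifies each literal format against the two string arguments, so a format with the wrong number of holes is rejected at compile time.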
The complete program
open Shell, Enum
foreach (flatten **> map files_of **> args ()) do_something
where do_something s =
let name = Filename.chop_extension s in
Printf.printf " | %s -> %S\n" c name
where c = "`" ^ (if Char.is_digit r.[0] then "codemap_"^r else r)
where r = lowercase (String.map (fun c -> if Char.is_letter c || Char.is_digit c then c else '_') name)
I don’t know about you but I find this pretty nice, for a type-safe language. I’m sure it would have been possible to make something shorter in Perl or awk, and suggestions are welcome regarding how to improve this but I’m rather happy. And, once again, we’re not trying to beat Python, Perl or awk in concision, just to do something comparably good, because we already beat them by far in speed and safety.
So, what do you think?
11.20.08
OCaml Batteries Included: The Hierarchy, reloaded
Well, my previous post on the Hierarchy of OCaml Batteries Included certainly triggered reactions. Essentially, judging from these, the OCaml community doesn’t seem to want a module hierarchy. So here’s a reworked version of the library layout, without hierarchy. Again, feedback is appreciated and should go to the OCaml mailing-list.
Batteries
- Standard (automatically opened)
- Legacy
- Arg
- Array
- …
- Future
- Lexers
- C
- OCaml
I. Control
- Lexers
- Exceptions
- Return
- Monad Interfaces for monadic operations
I.1. Concurrency
- Concurrency Interfaces for concurrency operations
I.1.i. Built-in threads
- Condition
- Event
- Mutex
- RMutex
- Thread
- Threads A module containing aliases to Condition, Event…
I.1.ii. coThreads
- CoCondition
- CoEvent
- CoMutex
- CoRMutex
- CoThread
- CoThreads as Threads but with implementations coming from coThreads
I.1.iii. Shared memory
- Shm_* Placeholders
II. IO
- IO
- BigEndian
- GZip
- Bz2
- Zip
- Transcode
III. Mutable containers
- Array
- Cap
- ExceptionLess
- Labels
- ExceptionLess
- Labels
- Cap
- Bigarray
- Array1
- Array2
- Array3
- Dllist
- Dynarray
- Enum
- ExceptionLess
- Labels
- Global
- Hashtbl
- Make
- ExceptionLess
- Labels
IV. Persistent containers
- Make
- Lazy
- List
- ExceptionLess
- Labels
- Map
- Make
- ExceptionLess
- Labels
- Make
- Option
- Labels
- PMap
- PSet
- RefList
- Index
- Queue
- Ref
- Set
- Make
- ExceptionLess
- Labels
- Make
- Stack
- Stream
V. Data
- Unit
V.1. Logical
- Bool
- BitSet
V.2. Numeric
- Numeric Interfaces for number-related stuff
- Big_int
- Common
- Complex
- Float
- Int
- Int32
- Int64
- Native_int
- Num
- Safe_float placeholder
- Safe_int
V.3. Textual
- Text Definition of text-related interfaces
- Buffer
- Char
- UTF8
- Rope
- UChar
- String
- StringText A module containing aliases to String and Char [1]
- RopeText As StringText but with implementations from Rope and UChar
- UTF8Text As StringText but with implementations from UTF8 and UChar
- Labels
VI. Distribution-related stuff
- Packages
- Compilers
VII. Internals
- Gc
- Modules
- Oo
- Private
- Weak
- Make
VIII. Network (placeholders)
- URL
- Netencoding
- Base64
- QuotedPrintable
- Q
- URL
- Html
VIII.1. Http
- Http
- Http_client
- Cgi_*
- Httpd_*
- MIME
VIII.2. Ftp
- Ftp_client
VIII.3. Mail
- Netmail
- Pop
- Sendmail
- Smtp
VIII.4. Generic server
- Netplex_*
VIII.5. RPC
- Rpc_*
VIII.6. Languages
- Genlex
- Lexing
- CharParser
- UCharParser
- ParserCo
- Source
- Parsing
- Format
- Printf
- Str
- PCRE place-holder
- Scanf
- Scanning
- SExpr
IX. System
- Arg
- File
- OptParse
- Opt
- OptParser
- StdOpt
- Path
- Shell
- Unix
- Labels
- Equeue
X. Unclassified
- Digest
- Random
- State
- Date placeholder
[1] Actually a slightly modified version of Char to match signatures for Latin-1 and Unicode
11.18.08
Batteries: reworking the hierarchy (feedback wanted)
As readers interested in OCaml may know, we’ve been working for several months on OCaml Batteries Included. Early in the development, it appeared to us that, with the large number of modules involved, we would need a hierarchy of modules.
For instance, for the moment, we have a module System containing among other submodules IO (definition of i/o operations), File (definition of operations on files), Sys (the usual OCaml Sys module, soon to be expanded), etc. Therefore, before one may open and manipulate files, one has to do
open System.IO;;
open System.File;;
or, with the syntax extension we developed to alleviate this,
open System, IO, File;;
(the syntax extension does a few other things which we’re not going to detail here; for one thing, it allows local opening of modules).
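To illustrate what local opening buys, here is a small sketch using the local-open forms that current standard OCaml provides, which are not necessarily the syntax of the Batteries extension. The miniature System.File module and its extension function are hypothetical, defined here only so that the example is self-contained.

```ocaml
(* Hypothetical miniature of the System hierarchy, for illustration only. *)
module System = struct
  module File = struct
    let extension name = Filename.extension name
  end
end

(* Local opens: the module contents are visible only inside the
   delimited expression, not in the rest of the file. *)
let ext1 = let open System.File in extension "batteries.ml"
let ext2 = System.File.(extension "batteries.mli")

let () = Printf.printf "%s %s\n" ext1 ext2
```

Compared with a toplevel open, this keeps the names of a deeply nested module from leaking into the rest of the file, which matters more as the hierarchy grows.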
Now that we’ve reached Alpha 2 and are well on our way towards Alpha 3, we’ve decided that it’s time to rework our current hierarchy and make it shorter and more consistent. Before we proceed, we’d like some feedback from the community. Discussion will take place mostly on the OCaml mailing-list and on irc (server: Freenode, channel: #ocaml) but I preferred posting the hierarchy here for easier reading.
At the moment, our hierarchy looks roughly as follows:
Batteries (automatically opened)
- Control
- Concurrency
- Common
- Threads
- Condition
- Event
- Mutex
- Thread
- Exceptions
- Return
- Monad
- Concurrency
- Data
- Containers (common interfaces)
- Mutable
- Array
- Cap
- Labels
- ExceptionLess
- Labels
- ExceptionLess
- Cap
- Bigarray
- Array1
- Array2
- Array3
- Genarray
- Dllist
- Dynarray
- Enum
- ExceptionLess
- Global
- Hashtbl
- Make
- Labels
- ExceptionLess
- Make
- RefList
- Index
- Queue
- Ref
- RefList
- Stack
- Stream
- Array
- Logical
- Bool
- BitSet
- Numeric
- Big_int
- Common
- Complex
- Float
- Int
- Int32
- Int64
- Native_int
- Num
- Safe_int
- Unit
- Common
- Persistent
- Lazy
- List
- ExceptionLess
- Labels
- Map
- Make
- ExceptionLess
- Labels
- Make
- Option
- Labels
- PMap
- PSet
- Set
- Make
- ExceptionLess
- Labels
- Make
- Text
- Buffer
- Char
- UTF8
- Rope
- UChar
- String
- ExceptionLess
- Labels
- Languages
- CharParser
- Genlex
- Languages
- Library
- C
- OCaml
- Library
- Languages
- Format
- Lexing
- ParserCo
- Source
- Parsing
- Printf
- Scanf
- Scanning
- SExpr
- Str
- Meta
- Callback
- CamlinternalMod
- CamlinternalOo
- Gc
- Obj
- Oo
- Weak
- Make
- Standard (automatically opened)
- System
- Arg
- Compress
- GZip
- File
- Filename
- IO
- BigEndian
- Network (placeholder)
- OptParse
- Opt
- OptParser
- StdOpt
- Unix
- Labels
- Sys
- Toolchain
- Execute
- Findlib
- Util
- Digest
- Random
- State
- Legacy
One possible replacement has been drafted:
- Control
- Concurrency
- Common
- Threads
- Condition
- Event
- Mutex
- Thread
- Exceptions
- Return
- Monad
- Concurrency
- Data
- Containers
- Common
- Array
- Cap
- ExceptionLess
- Labels
- ExceptionLess
- Labels
- Cap
- Bigarray
- Array1
- Array2
- Array3
- Genarray
- Dllist
- Dynarray
- Enum
- ExceptionLess
- Labels
- Global
- Hashtbl
- Make
- ExceptionLess
- Labels
- Make
- Lazy
- List
- ExceptionLess
- Labels
- Map
- Make
- ExceptionLess
- Labels
- Option
- Labels
- PMap
- PSet
- RefList
- Index
- Queue
- Ref
- RefList
- Set
- Make
- ExceptionLess
- Labels
- Stack
- Stream
- Logical
- Bool
- BitSet
- Unit
- Numeric
- Interfaces (formerly Common)
- Big_int
- Common
- Complex
- Float
- Int
- Int32
- Int64
- Native_int
- Num
- Safe_int
- Text
- Buffer
- Char
- UTF8
- Rope
- UChar
- String
- Labels
- Containers
- Distro
- Packages (formerly Findlib)
- Tools (formerly Execute)
- Internals
- Callback
- Gc
- Mod (formerly CamlinternalMod)
- Obj
- Oo
- Private (formerly CamlinternalOo)
- Weak
- Make
- IO
- BigEndian
- Gzip
- Bz2
- Zip
- Transcode (placeholder)
- Network (placeholder)
- Text (formerly Languages)
- Lex
- Genlex
- Languages
- C
- OCaml
- Languages
- Lexing
- Genlex
- Parse
- CharParser
- UCharParser
- ParserCo
- Source
- Parsing
- Pretty
- Format
- Printf
- Regexp
- Str
- PCRE (in the future)
- Scanf
- Scanning
- SExpr
- XML (placeholder)
- Lex
- Standard (automatically opened)
- Sys
- Arg
- File
- OptParse
- Opt
- OptParser
- StdOpt
- Path (formerly Filename)
- Shell (formerly Sys)
- Unix
- Labels
- Util
- Digest
- Random
- State
- Legacy
The overall goal, with this new version, is to make for
- a shallower hierarchy
- shorter module names
- more consistent groups of modules.
What do you think of this replacement? Is it any better?