Better exceptions for OCaml, v0.2

Short version

Catch me if you can is a small library for OCaml 3.10. The latest release is version 0.2, which you may find here. This library improves management of errors in OCaml. It is released under the LGPL licence. It has been written by David Teller, Arnaud Spiwack, Till Varoquaux and Gabriel Scherer.


Long version

As all languages of the ML family — and most modern languages indeed — OCaml permits the management of exceptional situations using exceptions. This mechanism lets programmer register protected sections of code, as well as exception handlers to handle any exception which may be raised during the execution of a protected section. Whenever an exception is raised, the protected section of code is immediately stopped and the corresponding exception handler is executed instead. In addition, exceptions may convey some information regarding the nature of the exceptional circumstance.

In OCaml, the mechanism is fast, it’s convenient and it’s type-safe, much like the rest of the language (barring any type-unsafe interaction with C). However, a few things are missing. If we consider the rest of the language, exceptions are both heavyweight and clumsy: each exception must be declared before being used and there’s no way to introduce a polymorphic type parameter in the exception. In addition, languages such as Java offer to important features missing in OCaml: automatic case coverage and exception hierarchies. While a nice tool exists  to provide case coverage for exceptions in OCaml, this tool is complex and  unfortunately unmaintained.

Catch me if you can offers an alternative mechanism, comparable to ML exceptions, to handle errors. In comparison with OCaml’s native exception mechanism, this library adds:

  • automatic inference of exceptions (i.e. no need to declare your exceptions, unless you want to)
  • more flexible exceptions (i.e. exceptions may have polymorphic type parameters, constraints, etc.)
  • hierarchies (i.e. an IOException is a sub-case of Exception and a super-case of NetworkException)
  • case coverage (i.e. the compiler can tell you if you forgot a case or sometimes if you wrote useless ones)
  • conditional success handlers (i.e. do something with the result in case of success)
  • conditional success-and-failure handlers (i.e. “finally”).

To attain this, we replace the mechanism of exceptions by an error monad, we replace exception constructors with polymorphic variants and we introduce a dose of syntactic sugar.


Examples

Expression evaluator

Let’s write a simple expression evaluator for the following set of expressions:


type expr =
  | Value of float
  | Div     of expr * expr
  | Add    of expr * expr
  | Mult   of expr * expr
  | Subs  of expr * expr

These may be evaluated using the following function:


let rec eval = function
 | Value x    -> x
 | Add (x,y) -> eval x +. eval y
 | Mult(x,y)  -> eval x *. eval y
 | Div(x,y)   -> eval x /. eval y
 | Subs(x,y) -> eval x -. eval y

Of course, this function is bound to fail in case of division by zero. While this is expected, there is nothing in the source code — much less in the type of the function — to let us know which exception will be raised in case of division by zero.

An alternative would be to add manual error checking, as follows:


type ('a, 'b) result =
 | Ok of 'a
 | Error of 'b

let rec eval = function
 | Value x    -> OK x
 | Add (x,y) -> (match eval x with
                      | Error e -> Error e
                      | Ok x' ->  match eval y with
                               | Error e -> Error e
                               | Ok y'    -> Ok (x' +. y'))
 | Mult (x,y) -> (match eval x with
                      | Error e -> Error e
                      | Ok x' ->  match eval y with
                               | Error e -> Error e
                               | Ok y'    -> Ok (x' *. y'))
 | Div (x,y) -> (match eval x with
                      | Error e -> Error e
                      | Ok x' ->  match eval y with
                               | Error e -> Error e
                               | Ok y'    -> if y' = 0. then Error "Division by zero"
                                                else              Ok (x' /. y'))
 | Subs (x,y) -> (match eval x with
                      | Error e -> Error e
                      | Ok x' ->  match eval y with
                               | Error e -> Error e
                               | Ok y'    -> Ok (x' -. y'))
(*eval : expr -> (float, string) result*)

After this transformation, the type of the exception appears in the type of eval — here, we used strings, but anything else would have been fine. Of course, the downside is that this is unreadable. Well, what about the following ?


let rec eval = function
 | Value x    -> return x
 | Add (x,y) -> perform with module Error
                        x' <-- eval x;
                        y' <-- eval y;
                        return (x' +. y' ;)
 | Mult (x,y) -> perform with module Error
                        x' <-- eval x;
                        y' <-- eval y;
                        return (x' *. y' ;)
 | Div (x,y) -> perform with module Error
                        x' <-- eval x;
                        y' <-- eval y;
                        if y'=0. then throw "Division by zero"
                        else            return (x' /. y' ;)
 | Subs (x,y) -> perform with module Error
                        x' <-- eval x;
                        y' <-- eval y;
                        return (x' -. y' ;)
(*eval : expr -> (float, string) result*)

This extract uses [our customized version of] Pa_monad (included in the package), which brings syntactic support for monads. While this is more verbose than the original version, it’s also safer, insofar as we can guarantee that exceptions won’t remain uncaught.

Still too long? Then what about using the appropriate operators?


open Error.Operators
let rec eval = function
 | Value x    -> x
 | Add (x,y) -> eval x +. eval y
 | Mult(x,y) -> eval x *. eval y
 | Div(x,y)   -> eval x /. eval y
 | Subs(x,y) -> eval x -. eval y

Except for the module opening, that’s the same thing as our first listing. Just with the added safety.

Throwing, catching and hierarchies

By the way, the type of the result is


(*eval : expr -> (float, [> `Arithmetic of (unit, [> `Div_by_zero of (unit, _) exc ]) exc ]) result*)

That is, eval may either succeed and return a float or fail and return an arithmetic exception, which also turns out to be a division by zero. That’s classes of exceptions.

With our syntactic sugar, raising such an exception is done by


throw (exception Arithmetic (); Div_by_zero ())

Note that we could have put some content instead of (). Note that exceptions are typed as they appear in the code and don’t need to be declared (if you wonder, polymorphic variants are involved in this).

Of course, various kinds of exceptions may be combined, as in the following extract:


match ... with
| 1 -> throw (exception Arithmetic (); Div_by_zero ())
| 2 -> throw (exception Arithmetic (); Overflow "by gosh !&quot ;)
| 3 -> throw (exception IO file_descr)
| ...
(*
('a, [> `Arithmetic of (unit, [> `Div_by_zero of (unit, _) exc
                                     |   `Overflow of (string, _)   exc]) exc
     |   `IO of (int, [`> ]) exc ]) result*)
*)

While the type of the expression is difficult to read, catching is easy


attempt ... with
 | val s -> (*success*)
 | Arithmetic (); Div_by_zero () -> (*Division by zero*)
 | Arithmetic (); _                   -> (*Other arithmetic*)
 | IO _                                  -> (*Some IO stuff *)
 | finally _                             -> (*Don't forget to close the door*)

This extract introduces three keywords:

  • attempt is our replacement for try
  • val is used to pattern-match against the result of a successful evaluation
  • finally is used to pattern-match against the final result, whether this result was obtained after a successful evaluation or after an exception was raised and handled.

Unbreaking tail-recursion

The following extract is wrong:


let line_count filename =
  let rec loop file count =
  try
    ignore (input_line file);
    loop file (count + 1)
  with
    End_of_file -> count
  in
    loop (open_file filename) 0

Don’t get me wrong, it will compile and run. The problem is that it’s not tail-recursive. In other words, it will be much slower and much more memory-consuming than if exceptions had been ignored. Why ? Because exception End_of_file may have been raised from the next call to loop, so the recursive call cannot be optimized into a non-recursive call. Of course, exceptions can’t be ignored in this extract, as they are required to determine when to stop reading the file. Now, a simple transformation would make the problem go away :


let line_count filename =
  let rec loop file count =
    let should_continue =
    try
      ignore (input_line file);
      true
    with End_of_file -> false
  in
    if should_continue then loop file (count + 1)
    else                    count
  in
    loop (open_file filename) 0

Well, the transformation is simple, but it’s annoying and hard to read. What’s even more annoying is that it’s quite common. With Catch me if you can, we would rather write the following:


let input_line2 x = Error.legacy input_line x

let line_count filename =
  let rec loop file count =
    attempt input_line2 file with
      | val _ -> loop file (count + 1)
      | _     -> count
  in
  loop (open_file filename) 0

In this extract, legacy is a simple manner of wrapping an existing, one-argument, function and convert it to our new exception style. It’s not quite as good as wrapping the function manually and giving it an actual exception, but it’s better than nothing.

All in all, the resulting function line_count is shorter, easier to read, takes less memory and is also faster than the original.

What about performance?

Now, that’s a complex question. Short answer: there’s an acceptable performance loss. False short answer: we wrote a paper on that subject.

5 commentaires »

  1. Exception Monad for OCaml « Il y a du thé renversé au bord de la table a dit,

    janvier 31, 2008 à 10:52

    [...] I have just released a first version of a library introducing an exception monad in OCaml. More informations here. [...]

  2. Paolo Donadeo a dit,

    février 1, 2008 à 3:27

    Very interesting library, thanks.

    Two questions:
    1) Why the monadify function has signature:
    value monadify : Lazy.t ‘a -> can_fail ‘a exn
    instead of:
    value monadify : ‘a -> can_fail ‘a exn
    I mean: can monadify create the lazy value, instead of demanding it to the client?

    2) May I have a version of this library under the BSD (or MIT, or something similar)? I’d like to use it in my web framework, which is under this license:

    http://www.ex-nunc.org/gitweb/?p=ex-nunc.git;a=blob;f=LICENSE;h=e49b3950c6f65eaece1f289d244a038a54e728db;hb=ff82d688f21d75167484f49f6d49b19910acc8f2

  3. yoric a dit,

    février 1, 2008 à 10:50

    1) Unfortunately, in the current state, monadify can’t do that. I’ll try and write a small Camlp4 extension to avoid this problem (and remove the passage through lazy in the first place).

    2) As far as I’m concerned, no problem. I’ll ask Arnaud Spiwack what he thinks about it.

  4. Paolo Donadeo a dit,

    février 2, 2008 à 10:57

    Ah, I was missing the evaluation order! Not a problem at all, the library is very interesting (at least to me) event without any Camlp4 extension.

  5. yoric a dit,

    février 5, 2008 à 9:26

    The official answer about the licence is: no problem.

Apporter un Commentaire