Introduction to the OCaml Module System

2024-09-10

I got a request from the OCaml group at the Recurse Center to go over modules and functors this week. Here are the notes I prepared. We'll be using Jane Street's Core standard library replacement throughout.

open Core

Signatures as Interfaces

A module is a collection of definitions. A signature (alternatively "interface") is a specification of a module. Signatures are to modules as types are to values -- they are static specifications of behaviour.

The essence of signatures and modules is information hiding. Signatures allow the programmer to constrain the interface that must be agreed upon by two bodies of code (the module providing the interface and the module consuming the interface). The word "constrain" may sound negative, but this is a good thing! It gives us freedom to modify a module with confidence that our changes will not affect the rest of the program in ways we don't intend them to.

The relationship between modules and signatures is many-to-many: (infinitely) many modules may satisfy a given signature, and many signatures may be satisfied by a given module.

Suppose we want to write a program that makes use of the mathematical notion of a finite set. We might write the following signature.

module type SET = sig

  type 'a set

  (* create a singleton set *)
  val singleton : 'a -> 'a set

  (* create the set of all elements in a list *)
  val of_list : 'a list -> 'a set

  (* take the union of two sets *)
  val union : 'a set -> 'a set -> 'a set

  (* take the difference of two sets *)
  val difference : 'a set -> 'a set -> 'a set

  (* further operations omitted for the sake of brevity *)

end

The type set is abstract. You can read type 'a set as a claim that "the module will define a type (parametric in one variable) named set", without actually saying what that type is.

Here is a possible implementation of our SET interface.

module SetModule = struct

  type 'a set = 'a list

  (* Yes, polymorphic equality is controversial! We will get to that later. *)
  let eq = Stdlib.(=)

  let rec union left = function
    | [] -> left
    | x :: xs ->
        if List.exists left ~f:(eq x)
        then union left xs
        else union (x :: left) xs

  let of_list l =
    union [] l

  let singleton a = [ a ]

  let difference left right =
    let filter a = not (List.exists right ~f:(eq a)) in
    List.filter left ~f:filter

end

We may refer to names bound in the current module without an explicit qualifier. For example, in the definition of union, we reference eq without a prefix.

Because we didn't specify a signature for SetModule, all top level definitions are exposed by default. If we were to instead give it a signature...

module Set : SET = SetModule

then only those values in the signature will be exposed. In this case, for example, eq would be hidden.

Functors

OCaml functors are essentially functions between modules, evaluated at compile time.

As an example, let's consider our earlier SET interface. Polymorphic equality is problematic for reasons discussed elsewhere, so let's suppose we want to be explicit about our equality test. We could introduce an extra function parameter to union and difference to pass in an equality predicate. That may be a fine solution in some cases, but functors give us a way of specifying the equality predicate at compile time and not having to pass it around.

We'll define a signature of types that have decidable equality:

module type EQUALITY = sig
  type t
  val eq : t -> t -> bool
end

Then we can define our set implementation as a functor from EQUALITY.

module SetFunctor = functor (E : EQUALITY) -> struct

  type element = E.t
  type set = element list

  let rec union left = function
    | [] -> left
    | x :: xs ->
        if List.exists left ~f:(E.eq x)
        then union left xs
        else union (x :: left) xs

  let of_list l =
    union [] l

  let singleton a = [ a ]

  let difference left right =
    let filter a = not (List.exists right ~f:(E.eq a)) in
    List.filter left ~f:filter

end

Now, to instantiate SetFunctor, we'll first have to define an input module.

module IntEquality : EQUALITY = struct
  type t = int
  let eq = Int.equal
end

Applying the functor looks like this:

module IntSet = SetFunctor(IntEquality)

This is a rare case in OCaml where the syntax for application requires parentheses. We can give the functor an explicit signature, if we want:

module type SET_FUNCTOR = functor (E : EQUALITY) -> sig

  type element = E.t
  type set

  (* create a singleton set *)
  val singleton : element -> set

  (* create the set of all elements in a list *)
  val of_list : element list -> set

  (* take the union of two sets *)
  val union : set -> set -> set

  (* take the difference of two sets *)
  val difference : set -> set -> set

  (* further operations omitted for the sake of brevity *)

end

module SetFunctor' : SET_FUNCTOR = SetFunctor

The type declaration type element = E.t requires that a module implementing this signature will define a type element that is equal to E.t from its input module.

Notice that set is no longer parametric. The set type we get when we apply the functor represents a set of elements specifically of type E.t. We've essentially lifted the parameter to the level of the module system.

Type Classes as Signatures

Haskell's type classes may be viewed as a particular kind of signature. For example, the Functor and Applicative classes are described by the following signatures:

module type FUNCTOR = sig
  type 'a t
  val map : ('a -> 'b) -> 'a t -> 'b t
end

(* an example instance *)
module OptionFunctor : FUNCTOR = struct
  type 'a t = 'a option
  let map f xs = Option.map xs ~f:f
end

module type APPLICATIVE = sig
  type 'a t
  val pure : 'a -> 'a t
  val apply : ('a -> 'b) t -> 'a t -> 'b t
end

(* an example instance *)
module OptionApplicative : APPLICATIVE = struct
  type 'a t = 'a option
  let pure a = Some a
  let apply fs xs = match (fs, xs) with
  | (Some f, Some x) -> Some (f x)
  | _ -> None
end

It is conventional in OCaml to name the principal type of a module/signature t, if one exists.

In fact, OCaml modules are strictly more powerful than Haskell type classes. Type classes are limited to having at most one instance per type, while signatures may be inhabited by several modules that package a given type. (For example, Haskell is forced to ordain one ordering of the natural numbers as the canonical Ord instance, even though there are many possible orderings.) The cost is in verbosity -- we have to name which instance of a signature we're talking about. There is research being done on modular type classes, which are intended to combine the best of both worlds.

Resources

The source code in this document is available here.

Yaron Minsky and Anil Madhavapeddy have written an excellent book on OCaml, which includes sections on modules and functors.

Robert Harper, one of the designers of Standard ML, has written about the importance of modularity on his blog.

recurse center webring