2024-09-10
I got a request from the OCaml group at the Recurse Center to go over modules and functors this week. Here are the notes I prepared. We'll be using Jane Street's Core standard library replacement throughout.
open Core
A module is a collection of definitions. A signature (alternatively "interface") is a specification of a module. Signatures are to modules as types are to values -- they are static specifications of behaviour.
The essence of signatures and modules is information hiding. Signatures allow the programmer to constrain the interface that must be agreed upon by two bodies of code (the module providing the interface and the module consuming the interface). The word "constrain" may sound negative, but this is a good thing! It gives us freedom to modify a module with confidence that our changes will not affect the rest of the program in ways we don't intend them to.
The relationship between modules and signatures is many-to-many: (infinitely) many modules may satisfy a given signature, and many signatures may be satisfied by a given module.
Suppose we want to write a program that makes use of the mathematical notion of a finite set. We might write the following signature.
module type SET = sig
type 'a set
(* create a singleton set *)
val singleton : 'a -> 'a set
(* create the set of all elements in a list *)
val of_list : 'a list -> 'a set
(* take the union of two sets *)
val union : 'a set -> 'a set -> 'a set
(* take the difference of two sets *)
val difference : 'a set -> 'a set -> 'a set
(* further operations omitted for the sake of brevity *)
end
The type set
is abstract. You can read type 'a set
as a claim that "the module will define a type (parametric in one variable) named set", without actually saying what that type is.
Here is a possible implementation of our SET
interface.
module SetModule = struct
type 'a set = 'a list
(* Yes, polymorphic equality is controversial! We will get to that later. *)
let eq = Stdlib.(=)
let rec union left = function
| [] -> left
| x :: xs ->
if List.exists left ~f:(eq x)
then union left xs
else union (x :: left) xs
let of_list l =
union [] l
let singleton a = [ a ]
let difference left right =
let filter a = not (List.exists right ~f:(eq a)) in
List.filter left ~f:filter
end
We may refer to names bound in the current module without an explicit qualifier. For example, in the definition of union
, we reference eq
without a prefix.
Because we didn't specify a signature for SetModule
, all top level definitions are exposed by default. If we were to instead give it a signature...
module Set : SET = SetModule
then only those values in the signature will be exposed. In this case, for example, eq
would be hidden.
OCaml functors are essentially functions between modules, evaluated at compile time.
As an example, let's consider our earlier SET
interface. Polymorphic equality is problematic for reasons discussed elsewhere, so let's suppose we want to be explicit about our equality test. We could introduce an extra function parameter to union
and difference
to pass in an equality predicate. That may be a fine solution in some cases, but functors give us a way of specifying the equality predicate at compile time and not having to pass it around.
We'll define a signature of types that have decidable equality:
module type EQUALITY = sig
type t
val eq : t -> t -> bool
end
Then we can define our set implementation as a functor from EQUALITY
.
module SetFunctor = functor (E : EQUALITY) -> struct
type element = E.t
type set = element list
let rec union left = function
| [] -> left
| x :: xs ->
if List.exists left ~f:(E.eq x)
then union left xs
else union (x :: left) xs
let of_list l =
union [] l
let singleton a = [ a ]
let difference left right =
let filter a = not (List.exists right ~f:(E.eq a)) in
List.filter left ~f:filter
end
Now, to instantiate SetFunctor
, we'll first have to define an input module.
module IntEquality : EQUALITY = struct
type t = int
let eq = Int.equal
end
Applying the functor looks like this:
module IntSet = SetFunctor(IntEquality)
This is a rare case in OCaml where the syntax for application requires parentheses. We can give the functor an explicit signature, if we want:
module type SET_FUNCTOR = functor (E : EQUALITY) -> sig
type element = E.t
type set
(* create a singleton set *)
val singleton : element -> set
(* create the set of all elements in a list *)
val of_list : element list -> set
(* take the union of two sets *)
val union : set -> set -> set
(* take the difference of two sets *)
val difference : set -> set -> set
(* further operations omitted for the sake of brevity *)
end
module SetFunctor' : SET_FUNCTOR = SetFunctor
The type declaration type element = E.t
requires that a module implementing this signature will define a type element
that is equal to E.t
from its input module.
Notice that set
is no longer parametric. The set type we get when we apply the functor represents a set of elements specifically of type E.t
. We've essentially lifted the parameter to the level of the module system.
Haskell's type classes may be viewed as a particular kind of signature. For example, the Functor and Applicative classes are described by the following signatures:
module type FUNCTOR = sig
type 'a t
val map : ('a -> 'b) -> 'a t -> 'b t
end
(* an example instance *)
module OptionFunctor : FUNCTOR = struct
type 'a t = 'a option
let map f xs = Option.map xs ~f:f
end
module type APPLICATIVE = sig
type 'a t
val pure : 'a -> 'a t
val apply : ('a -> 'b) t -> 'a t -> 'b t
end
(* an example instance *)
module OptionApplicative : APPLICATIVE = struct
type 'a t = 'a option
let pure a = Some a
let apply fs xs = match (fs, xs) with
| (Some f, Some x) -> Some (f x)
| _ -> None
end
It is conventional in OCaml to name the principal type of a module/signature t
, if one exists.
In fact, OCaml modules are strictly more powerful than Haskell type classes. Type classes are limited to having at most one instance per type, while signatures may be inhabited by several modules that package a given type. (For example, Haskell is forced to ordain one ordering of the natural numbers as the canonical Ord
instance, even though there are many possible orderings.) The cost is in verbosity -- we have to name which instance of a signature we're talking about. There is research being done on modular type classes, which are intended to combine the best of both worlds.
The source code in this document is available here.
Yaron Minsky and Anil Madhavapeddy have written an excellent book on OCaml, which includes sections on modules and functors.
Robert Harper, one of the designers of Standard ML, has written about the importance of modularity on his blog.