Tree @develop-0.2 (Download .tar.gz)
Lariat
Version 0.2
Lariat is a project to define an abstract data type for lambda terms,
consisting of six basic operations: equal
and fresh
(on names), and
app
, abs
, var
, and destruct
(on terms).
This repository presents the definition of these operations. It also contains implementations of this abstract data type (and possibly one day variations on it) in various programming languages, including:
The version of the Lariat defined by this document is 0.2. This version number will be promoted to 1.0 once vetted sufficiently.
Table of Contents
Background
There are several approaches to representing lambda terms in software.
The naive approach is to represent them just as they are written on paper. In this approach, whether a variable, such as x, is free or bound depends on whether it is inside a lambda abstraction λ x or not. If you need to manipulate it (or the abstraction it's bound to), you might need to rename it so that it doesn't conflict with another variable also called x that is perhaps free or perhaps bound to a different lambda abstraction.
This is tiresome and error-prone. So other approaches were developed.
One such approach alternate is De Bruijn indexes, where variables are represented not by names, but by numbers. The number indicates which lambda abstraction the variable is bound to, if any; a 1 indicates the immediately enclosing lambda abstraction, a 2 indicates the lambda abstraction just above that, and so on. If the number exceeds the number of enclosing lambda abstractions, then it is a free variable.
But this, too, has some drawbacks, so people have devised a number of other approaches:
- "nominal techniques" (Gabbay and Pitts)
- "locally nameless" (various?)
- "maps" (Sato et al.)
- "bound" (Kmett)
among others.
But the point we would like to make in this article is this: At some level of abstraction it does not matter which approach is chosen as long as the approach satisfies the essential properties that we require of lambda terms.
To this end, we present an abstract data type for lambda terms, which we call Lariat, consisting of six operations. The actual, concrete data structure in which they are stored, and the actual, concrete mechanism by which names become bound to terms, are of no consequence (and may well be hidden from the programmer) so long as the implementation of the operations conforms to the stated specification.
Names
In any explication of name binding we must deal with names. In Lariat 0.1, names were left almost entirely undefined; the only operation they were required to support was comparison of two names for equality. While this extreme level of abstraction might be attractive from a theoretical perspective, it introduced complications and awkwardness for any potential practical use of the abstract data type.
In Lariat 0.2, names are treated as abstract objects much like terms, and we specify that names must support the following two operations:
equal(n: name, m: name): boolean
Given a name n and a name m, return true if they are identical names, otherwise return false.
fresh(ns: set of name): name
Given a set of names ns, return a name which does not occur in ns. We say this returned name is "fresh for ns".
The means by which the fresh name is generated is abstracted away; we only care about the guarantee that it is not a member of ns. For discussion on implementation, see Appendix A.
The operation should be deterministic in the sense that, given the same set of names, it should always return the same fresh name.
Note: It is not required that the
fresh
operation be exposed to the user; it is, rather, a structural requirement of a correct implementation ofdestruct
, below.Note: Beyond these two operations, it would be not unexpected that an implementation of Lariat would provide other operations such as constructing a new name from a textual representation, rendering a given name to a canonical textual representation, and suchlike. From the perspective of Lariat itself these are ancillary operations, and as such will not be defined in this document.
Terms
We now list the four operations available for manipulating terms.
var(n: name): term
Given a name n, return a free variable with the name n.
Note: A free variable is a term; it can be passed to any operation that expects a term.
app(t: term, u: term): term
Given a term t and a term u, return an application term which contains t as its first subterm and u as its second subterm.
Note: An application term is a term that behaves just like an ordered pair of terms.
abs(n: name, t: term): term
Given a name n and a term t, return an abstraction term containing t', where t' is a version of t where all free variables named n inside t' have been replaced with bound variables. These bound variables are bound to the returned abstraction term.
Note: we may consider a bound variable to be a term, but the user of the abstract data type cannot work with bound variables directly, so it is unlike all other kinds of terms in that respect. A bound variable is always bound to a particular abstraction term. In the case of
abs
, the abstraction term to which variables are bound is always the term returned by theabs
operation.Note: an abstraction term contains one subterm. This subterm cannot be extracted directly, as it may contain bound variables, which the user cannot work with directly.
destruct(t: term, f1: fun, f2: fun, f3: fun): X
Given a term t and three functions f1, f2, and f3 (each with a different signature, described below), choose one of the three functions based on the structure of t, and evaluate it, returning what it returns.
If t is a free variable, evaluate f1(n) where n is the name of the free variable t.
If t is an application term, evaluate f2(u, v) where u is the first subterm of t and v is the second subterm of t.
If t is an abstraction term, evaluate f3(u, n) where u is a version of t where all bound variables in u that were bound to u itself have been replaced by n, where n is a fresh name (i.e. a name that does not occur free anywhere in u).
Note: as stated above, a bound variable is always bound to an abstraction term. The bound variables that are replaced by the
destruct
of an abstraction term are always and only those that are bound to the abstraction term beingdestruct
ed.Note: the
destruct
operation's signature shown above was abbreviated to make it look less intimidating. The full signature would be something more likedestruct(t: term, f1: fun(n: name): X, f2: fun(u: term, v: term): X, f3: fun(u: term, n: name): X): X
Some Examples
We will now give some concrete examples of how these operations can be used, but first, we would like to emphasize that this is an abstract data type for lambda terms, not the lambda calculus. Naturally, one ought to be able to write a lambda calculus normalizer using these operations (and this will be one of our goals in the next section), but one is not restricted to that activity. The terms constructed using the Lariat operations may be used for any purpose for which terms-with-name-binding might be useful.
Example 1
A common task is to obtain the set of free variables present
in a lambda term. This is not difficult; we only need to
keep track of the new free variables we introduce ourselves
when we destruct
an abstraction term, and make sure not to
include any of them when we report the free variables we found.
let freevars = fun(t, ours) ->
destruct(t,
fun(n) -> if n in ours then {} else {n},
fun(u, v) -> freevars(u, ours) + freevars(v, ours),
fun(u, n) -> freevars(u, ours + {n})
)
Example 2
Given an abstraction term and a value, return a version of
the body of the abstraction term where every instance of the
variable bound to the abstraction term is replaced by the given
value. We can call this operation resolve
.
let resolve = fun(t, x) ->
destruct(t,
fun(n) -> t,
fun(u, v) -> t,
fun(u, n) -> replace_all(u, n, x)
)
where replace_all = fun(t, m, x) ->
destruct(t,
fun(n) -> if n == m then x else var(n),
fun(u, v) -> app(replace_all(u, m, x), replace_all(v, m, x))
fun(u, n) -> abs(n, replace_all(u, m, x))
)
Note that this operation was specifically not called subst
,
because the name subst
is often given to a process that
replaces free variables, while this operation replaces
bound ones. It was also specifically not named beta
because it does not require that t and x come from the same
application term.
Example 3
The next task is to write a beta-reducer. We destruct
the term twice, once to ensure it is an application term,
and again to ensure the application term's first subterm
is an abstraction term. Then we use resolve
, above, to
plug the application term's second subterm into the
abstraction term.
let beta = fun(r) ->
destruct(r,
fun(n) -> var(n),
fun(u, v) ->
destruct(u,
fun(_) -> app(u, v),
fun(_, _) -> app(u, v),
fun(u) -> resolve(u, v)
),
fun(t) -> t
)
In fact, we could merge this implementation with the
implementation of resolve
and this would save a call
to destruct
; but this would be merely an optimisation.
It is left as an exercise to any reader who may be so
motivated to undertake it.
Example 4
The next task would be to search through a lambda term, looking for a candidate application term to reduce, and reducing it.
--
-- Returns [bool, term] where bool indicates "has rewritten"
--
let reduce = fun(t) ->
if is_beta_reducible(t) then
[true, beta(t)]
else
destruct(t,
fun(n) -> [false, var(n)],
fun(u, v) ->
let
[has_rewritten, new_u] = reduce(u)
in
if has_rewritten then
[true, app(new_u, v)]
else
let
[has_rewritten, new_v] = reduce(v)
in
if has_rewritten then
[true, app(u, new_v)]
else
[false, app(new_u, new_v)],
fun(t) -> [false, t]
)
From there it ought to be just a hop, a skip, and a jump to a proper lambda term normalizer:
let normalize(t) ->
let
[has_rewritten, new_t] = reduce(t)
in
if has_rewritten then
normalize(new_t)
else
t
Discussion
var
, app
, and abs
construct terms, while destruct
takes them apart.
Constructing terms is the easy part; it's taking them apart properly that's
hard.
destruct
is a "destructorizer" in the sense described in
this article on Destructorizers.
In fact, this use case of "taking apart" lambda terms was one
of the major motivations for formulating the destructorizer
concept.
When working with lambda terms, one is often concerned with
comparing two lambda terms for equality, modulo renaming of bound
variables. We haven't introduced such an operation because it should
be possible to build such an operation using destruct
; basically,
render the two terms as text (or some other concrete representation),
then compare the texts for equality.
But of course such an operation could be provided as a native
operation for performance or convenience. Similarly, although
we have shown that we can implement freevars
using the operations
of the abstract data type, it is expected that it would already
be implemented in the implementation of the abstract data type
(to correctly implement destruct
), so could be exposed to the
user as well.
Appendices
Appendix A
On the implementation of names.
One way to implement names as defined in Lariat 0.2 is to use qualified names. A qualified name is an ordered list of name segments, where each name segment is what a name was in Lariat 0.1, i.e. the only operation we require name segments to support is comparison of two name segments for equality. (Comparing two qualified names for equality is straightforwardly derived from this.)
The client of the Lariat ADT is not, by themselves, required to qualify any names; if each qualified name they supply in their usage consists of only a single name segment, that's fine.
However, qualified names permit the definition of a simple algorithm for generating a fresh name, i.e. a name which does not appear in a given set of names. To wit,
- pick the longest qualified name from the set;
- prepend an arbitrary name segment to that qualified name.
If there are more than one longest names in the set, pick one arbitrarily.
In addition, a simple way to pick an arbitrary name segment to prepend to it, is to look at the leftmost name segment already in the qualified name.
So we can assume an algorithm like this is in use. But ultimately, any
implementation which satisfies the two operations required of names
(equal
and fresh
) is acceptable. We provide this concrete representation
and algorithm here partly because a trivial concrete representation as used
in Lariat 0.1 is not sufficient for implementing names (as there is no
derivable way to obtain a fresh name when needed, without relying on some
external fresh name supply) and we wanted to show that there was at least
some concrete representation which fulfills the requirements.
Commit History
@develop-0.2
git clone https://git.catseye.tc/Lariat/
- Several (hopefully final-for-0.2) edits for the main README. Chris Pressey 2 years ago
- Fix bug in reduceOnce. Chris Pressey 2 years ago
- Edits to README. Chris Pressey 2 years ago
- Restore the preceding examples/tests. Chris Pressey 2 years ago
- Introduce `Freshable` datatype. Chris Pressey 2 years ago
- Checkpoint. We need a typeclass here. Chris Pressey 2 years ago
- Merge branch 'develop-0.2' into abstract-names Chris Pressey 2 years ago
- Extend the example cases. Chris Pressey 2 years ago
- Make the operations on names, abstract. Chris Pressey 2 years ago
- Implement `resolve` and adjust the unit tests for it. Chris Pressey 2 years ago