CBPV and Natural Deduction - Part 1. Small steps

2023-07-22
/blog#logic#abstract machine#cbpv

I have been reading up on call-by-push-value (CBPV), a variation of $\lambda$-calculus that is more fine-grained in its treatment of argument-passing. I then came across a chapter from Frank Pfenning's 2016 lecture notes on substructural logic that characterizes CBPV as polarized natural deduction. In a small series of posts, I am exploring what this means and how it relates to compilers.

The Curry-Howard-Lambek correspondence is a nice way to connect logic and computation. I have written about this in previous posts on my old blog, see Higher-order logic and equality and Intuitionistic Propositional Logic and Natural Deduction. For the first part, those posts should be enough to tag along. In the next parts, we may look at some fine points of natural deduction.

Reduction and strategy

In $\lambda$-calculus, our elementary computation step is a $\beta$-reduction. Taking the perspective of the Curry-Howard-Lambek correspondence, we can regard a type as a logical proposition and a term of typed $\lambda$-calculus as a proof of this proposition. Here, $\beta$-reduction is a rewriting of the proof, which removes a 'detour' in the proof. In other words, computation is proof normalization.

$$ ( \lambda x : A . M)\ N \longrightarrow_β M [ x := N ] $$

Here, $[ x := N ]$ is our way of writing substitution. The full definition requires the usual careful treatment of bound names. The rewriting can take place anywhere in the term.
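To have something concrete to play with, here is a minimal sketch in Haskell (the representation and names are my own choosing); the substitution is naive and relies on bound names being suitably fresh:

```haskell
-- Plain lambda terms and substitution, as a minimal sketch.
-- Caveat: this substitution does not rename binders, so it is only
-- safe when no free variable of n would be captured.
data Term
  = Var String
  | Lam String Term
  | App Term Term
  deriving (Show)

-- subst x n m computes m[x := n].
subst :: String -> Term -> Term -> Term
subst x n (Var y)
  | x == y    = n
  | otherwise = Var y
subst x n (Lam y body)
  | x == y    = Lam y body              -- x is shadowed under this binder
  | otherwise = Lam y (subst x n body)
subst x n (App f a) = App (subst x n f) (subst x n a)
```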

Suppose we wanted to specify the order of computation steps. When there are not one, but several reducible expressions, which reduction should happen first?

A small-step semantics would specify not only reduction, but also where exactly evaluation takes place. To this end, one defines evaluation contexts: an evaluation context is a term with a hole that specifies exactly where in the term reduction is permitted to happen. Here is a sample definition, where we consider $\lambda$-abstractions as values (we do not reduce under a $\lambda$):

$$ \begin{array}{lll} E &::= &x\ |\ \lambda x. E\ |\ E\ E \\ C &::= &[~]\ |\ E\ C\ |\ C\ V \\ V &::= &\lambda x. E \end{array} $$

When we have a context, we can fill its hole with a $\lambda$-term and know exactly where evaluation takes place. The hole of a context could also be filled with another context.

The definition of evaluation contexts forces the argument (operand) in an application to be evaluated before the operator; the example above is thus a form of call-by-value. Note that evaluating the operand before the operator is a somewhat arbitrary choice in the definition of $C$. For call-by-name, passing a term that is not a value would require a context like $[~]\ E$.

Our specification of evaluation contexts seems to impose order by defining what form of argument-passing is possible; in other words, constraining (or leaving unconstrained) what can be bound to an identifier.
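Concretely, the context grammar above determines a deterministic strategy. Here is a sketch of the corresponding one-step reduction, reusing `Term` and `subst` from the previous sketch; the two guards mirror the contexts $E\ C$ and $C\ V$:

```haskell
-- One-step call-by-value reduction, following the context grammar:
-- in an application, first reduce the operand (E C), then the
-- operator (C V); beta fires only when both sides are done.
isValue :: Term -> Bool
isValue (Lam _ _) = True
isValue _         = False

step :: Term -> Maybe Term
step (App m n)
  | not (isValue n) = App m <$> step n              -- descend via E C
  | not (isValue m) = (\m' -> App m' n) <$> step m  -- descend via C V
step (App (Lam x body) v) = Just (subst x v body)   -- beta at the hole
step _ = Nothing                                    -- value, or stuck

-- Iterate until no redex is left (may diverge, like the calculus).
eval :: Term -> Term
eval t = maybe t eval (step t)
```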

Values and computations

Instead of defining evaluation contexts separately, we now look at a different way of imposing order on evaluation. In doing so, we start treating contexts as computations (filling the hole with a value is something that can produce a new value). We will see that what we are about to do is not very different from specifying an intermediate representation (IR) of a compiler.

CBPV is a calculus that encompasses both call-by-value and call-by-name. It achieves this through a fine-grained distinction between terms that are values and terms that are computations, which is enforced by a type discipline. Therefore we will have value types $A^+$ and computation types $B^-$:

$$ \begin{array}{lll} B^- &::= & A^+ \rightarrow B^-\ |\ \uparrow{A^+} \\ A^+ &::= &\mathbf{1}\ |\ \downarrow{B^-} \end{array} $$

The type operators $\uparrow{}$ and $\downarrow{}$ are described below. We add $\mathbf{1}$ (the "unit type") as a base type, whose only inhabitant $\mathtt{()}$ we pronounce "unit". Base types like $\mathtt{Int}$ or $\mathtt{String}$ would also be value types. All variables have value type.
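As an aside, here is one possible transcription of this type grammar into Haskell (the constructor names are mine); the later sketches will build on it:

```haskell
-- Polarized types as two mutually recursive datatypes, so the
-- value/computation split is visible in the host language as well.
data ValType                 -- A+
  = Unit                     -- 1
  | Down CompType            -- down-shift: a suspended computation
  deriving (Eq, Show)

data CompType                -- B-
  = Arrow ValType CompType   -- A+ -> B-
  | Up ValType               -- up-shift: a computation returning a value
  deriving (Eq, Show)
```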

$$ \begin{prooftree} \AxiomC{} \RightLabel{$(1_I)$} \UnaryInfC{$\Gamma \vdash (): \mathbf{1}$} \end{prooftree} \quad \quad \begin{prooftree} \AxiomC{$x : A^+ \in \Gamma$} \RightLabel{(hyp)} \UnaryInfC{$\Gamma \vdash x: A^+$} \end{prooftree} $$

The arrow type forces arguments to be of value type and the result to be of computation type. What matters most is the interplay of $\lambda$-abstraction and application. Here are the typing rules for these:

$$ \begin{prooftree} \AxiomC{$\Gamma, x: A^+ \vdash M: B^-$} \RightLabel{$(\to_I)$} \UnaryInfC{$\Gamma \vdash (\lambda x: A^+. M):\, A^+ \rightarrow B^-$} \end{prooftree} \quad \quad \begin{prooftree} \AxiomC{$\Gamma \vdash M:\, A^+ \rightarrow B^-$} \AxiomC{$\Gamma \vdash V:\, A^+$} \RightLabel{$(\to_E)$} \BinaryInfC{$\Gamma \vdash (M\ V):\, B^-$} \end{prooftree} $$

In words, a $\lambda$-expression has an arrow type, which is a computation (negative) type. We can chain abstractions like $\lambda x:X^+. \lambda y:Y^+. M$ for some term $M$ of computation type, but the argument types $X^+, Y^+$ are forced to be value (positive) types. Application yields a computation type. This is where it may be useful to remember that filling the hole of a context yields something that we can turn into a value (but it is not a value yet).

You may have noticed the $I$ and $E$ letters in the rule names. In natural deduction, every logical connective comes with an introduction and elimination rule. Even though we write these like sequents of sequent calculus, there are a few differences. In sequent calculus, there are left- and right-(introduction)-rules, and a computation step corresponds to the removal of detours (lemmas) via cut-elimination.

Shifts

As with any type discipline, the above rules make sure that certain programs can no longer be written. How can we return a value, though? Or write an identity computation? Or pass a function as an argument to another function? Our calculus is not yet complete.

First, we need a way to turn a value into a computation. More precisely, we want to turn a term $V: A^+$ of value (positive) type into a term of computation (negative) type.

Let us call this operation $\mathtt{return}\ V$. This is a "shift" between value types and computation types and is made explicit using a type operator $\uparrow\!{}A^+$. The notation requires some decoding work, since $\uparrow\!{}A^+$ is a computation (negative) type.

$$ \begin{prooftree} \AxiomC{$\Gamma \vdash V: A^+$} \RightLabel{$(\uparrow_I)$} \UnaryInfC{$\Gamma \vdash \mathtt{return}\ V:\, \uparrow\!{}A^+$} \end{prooftree} \quad \quad \begin{prooftree} \AxiomC{$\Gamma \vdash M:\, \uparrow\!{}A^+$} \AxiomC{$\Gamma, x: A^+ \vdash N:\, B^-$} \RightLabel{$(\uparrow_E)$} \BinaryInfC{$\Gamma \vdash \mathtt{let\ val}\ x = M\ \mathtt{in}\ N:\, B^-$} \end{prooftree} $$

The corresponding elimination takes a computation that returns a value and lets us use that value. We use a "$\mathtt{let\ val}$" declaration as source syntax. Note how the type discipline forces the right-hand side to be a computation of type $\uparrow\!{}A^+$ and how this imposes an order: we need to have the value before continuing. In evaluating the right-hand side to a value, we not only know where the actual computation happens; we also bind the result to a local name.

This amounts to all intermediate results being named.

Next, we want to "package" a computation into a value (suspend the computation). This will let us pass a $\lambda$-abstraction as an argument to another $\lambda$-abstraction. We introduce an operator $\mathtt{thunk}\ M$ that suspends a computation $M$ and an operator $\mathtt{force}\ V$ that resumes a suspended computation.

$$ \begin{prooftree} \AxiomC{$\Gamma \vdash M: B^-$} \RightLabel{$(\downarrow_I)$} \UnaryInfC{$\Gamma \vdash \mathtt{thunk}\ M:\, \downarrow\!{}B^-$} \end{prooftree} \quad \quad \begin{prooftree} \AxiomC{$\Gamma \vdash V:\, \downarrow\!{}B^-$} \RightLabel{$(\downarrow_E)$} \UnaryInfC{$\Gamma \vdash \mathtt{force}\ V:\, B^-$} \end{prooftree} $$

Again, the shift operator indicates in the type that we have a suspended computation, which is a value. Thus, $\downarrow$ shifts a negative type to a positive one, and $\downarrow\!B^-$ can be used in all places that require a value type.
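At this point, all the typing rules are on the table, and they transcribe directly into a small checker. The sketch below (one possible transcription, with constructor names of my own choosing) splits terms by polarity just as the grammar splits types; each equation corresponds to one of the rules above. It reuses `ValType` and `CompType` from the previous sketch.

```haskell
import qualified Data.Map as Map

-- Terms, split by polarity just like the types.
data Val
  = VVar String               -- x
  | VUnit                     -- ()
  | VThunk Comp               -- thunk M
  deriving (Show)

data Comp
  = CLam String ValType Comp  -- \x:A+. M
  | CApp Comp Val             -- M V
  | CReturn Val               -- return V
  | CLetVal String Comp Comp  -- let val x = M in N
  | CForce Val                -- force V
  deriving (Show)

type Ctx = Map.Map String ValType

-- Values: (hyp), (1_I) and (down_I).
checkVal :: Ctx -> Val -> Maybe ValType
checkVal ctx (VVar x)   = Map.lookup x ctx
checkVal _   VUnit      = Just Unit
checkVal ctx (VThunk m) = Down <$> checkComp ctx m

-- Computations: (->_I), (->_E), (up_I), (up_E) and (down_E).
checkComp :: Ctx -> Comp -> Maybe CompType
checkComp ctx (CLam x a m) = Arrow a <$> checkComp (Map.insert x a ctx) m
checkComp ctx (CApp m v) = do
  Arrow a b <- checkComp ctx m    -- operator must have arrow type
  a'        <- checkVal ctx v     -- operand must be a value
  if a == a' then Just b else Nothing
checkComp ctx (CReturn v) = Up <$> checkVal ctx v
checkComp ctx (CLetVal x m n) = do
  Up a <- checkComp ctx m         -- rhs must return a value of type a
  checkComp (Map.insert x a ctx) n
checkComp ctx (CForce v) = do
  Down b <- checkVal ctx v        -- only thunks can be forced
  Just b
```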

Example programs

Let's look at examples, starting with the identity combinator of plain $\lambda$-calculus. We want one of these at every type: $I_A := \lambda x: A.\ x$ of type $A \rightarrow A$. In the above polarized $\lambda$-calculus, we get something close enough:
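$$ \lambda x: A^+ .\ \mathtt{return}\ x \quad \quad \quad \lambda x: \downarrow\!{}B^- .\ \mathtt{force}\ x $$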

(Exercise: what are the types of these? What kind of "optimization" would remove these?)

Next, let us try a combinator $\mathit{twice}_A$ of type $(A \rightarrow A) \rightarrow A \rightarrow A$. It takes a function $f$ and an argument $x$ and applies $f$ to $x$ twice. In our polarized $\lambda$-calculus, we need to use the other shift operation and end up with:

$$ \begin{array}{l} \lambda f:\,\downarrow(A^+ \rightarrow\, \uparrow\!A^+). \lambda x:A^+. \\ \mathtt{let\ val}\ y\ =\ (\mathtt{force}\ f)\ x\ \mathtt{in} \\ \mathtt{let\ val}\ z\ =\ (\mathtt{force}\ f)\ y\ \mathtt{in} \\ \mathtt{return}\ z \end{array} $$

Note how the intermediate results have names $y$ and $z$. (Exercise: what is the type of this program? If we used an alternative term that didn't use $\mathtt{return}$, what changes?)
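As a sanity check, here is $\mathit{twice}$ transcribed into the datatypes of the checker sketch above, with $\mathtt{Unit}$ standing in for the value type $A^+$:

```haskell
-- twice, transcribed into the checker's datatypes.
twice :: Comp
twice =
  CLam "f" (Down (Arrow Unit (Up Unit))) $
    CLam "x" Unit $
      CLetVal "y" (CApp (CForce (VVar "f")) (VVar "x")) $
        CLetVal "z" (CApp (CForce (VVar "f")) (VVar "y")) $
          CReturn (VVar "z")

-- checkComp Map.empty twice computes
--   Just (Arrow (Down (Arrow Unit (Up Unit))) (Arrow Unit (Up Unit)))
```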

An interpreter

The logical reading of these typing rules is that we have set up a particular kind of natural deduction calculus. A term that has a type derivation is a proof, and an introduction followed by an elimination is clearly a "detour". These detours can be removed and these proof normalization steps correspond to computation steps.

At this point, we can write out local reductions:

$$\begin{array}{ll} \mathtt{let\ val}\ x\ =\ \mathtt{return}\ V\ \mathtt{in}\ M &\rightsquigarrow M[x := V] \\ \mathtt{force}\ (\mathtt{thunk}\ M) &\rightsquigarrow M \end{array} $$

This is not what we were after though. Instead, we can define an abstract machine that specifies exactly which rewriting steps to take when. What is special about an abstract machine is that unlike an interpreter that recursively traverses an expression, the machine always operates at a bounded depth from the top.

What follows are transition rules of a CK machine. Here C stands for control and K is a stack of contexts. The source-level $\mathtt{let\ val}\ x\ = \_ \ \mathtt{in}\ M$ expression is shortened to $(\_ \ \mathtt{to}\ x. M)$, and an application where we are waiting for the operator to be evaluated is written $(\_\ V)$. This gives a simple operational semantics, although a CK machine is still a rather high-level description since we need to appeal to substitution in the definition.

$$ \begin{array}{lllll} C & K & & C' & K' \\ \mathtt{let\ val}\ x = M\ \mathtt{in}\ N & k & \rightsquigarrow & M & (\_ \ \mathtt{to}\ x. N) :: k \\ \mathtt{return}\ V & (\_\ \mathtt{to}\ x. M) :: k & \rightsquigarrow & M[x := V] & k \\ \mathtt{force}\ (\mathtt{thunk}\ M) & k & \rightsquigarrow & M & k \\ M\ V & k & \rightsquigarrow & M & (\_ \ V) :: k \\ \lambda x. M & (\_ \ V) :: k & \rightsquigarrow & M[x := V] & k \end{array} $$

There is a simple idea behind all this which is worth restating: we statically (through the type system) know that every application $(M\ V)$ comes with an operand that is already a value. So the machine never needs a transition to evaluate an operand; it only pushes and pops.

More precisely, what we push and pop is an application (evaluation context) with a hole in the operator position and a value as operand. Even though this looks like call-by-value, it subsumes call-by-name because a suspended computation can be treated as a value. The stack is a list of nested contexts. In a sense, it is dual to an expression; this can be made precise, but we won't do that now.
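To make this concrete, here is a sketch of the CK machine over the `Comp` and `Val` datatypes from the checker above, together with the substitution it appeals to (again naive about capture):

```haskell
-- The CK machine; frames are the two contexts we push.
data Frame
  = FTo String Comp   -- (_ to x. N), waiting for a returned value
  | FArg Val          -- (_ V), a pending operand
  deriving (Show)

type Stack = [Frame]

-- One transition of the machine; Nothing means halt (or a stuck state).
stepCK :: (Comp, Stack) -> Maybe (Comp, Stack)
stepCK (CLetVal x m n, k)       = Just (m, FTo x n : k)
stepCK (CReturn v, FTo x m : k) = Just (substComp x v m, k)
stepCK (CForce (VThunk m), k)   = Just (m, k)
stepCK (CApp m v, k)            = Just (m, FArg v : k)
stepCK (CLam x _ m, FArg v : k) = Just (substComp x v m, k)
stepCK _                        = Nothing

-- Iterate to a final configuration.
runCK :: Comp -> (Comp, Stack)
runCK m = go (m, [])
  where go s = maybe s go (stepCK s)

-- Naive substitution of a value for a variable (no capture avoidance).
substVal :: String -> Val -> Val -> Val
substVal x v (VVar y)   = if x == y then v else VVar y
substVal _ _ VUnit      = VUnit
substVal x v (VThunk m) = VThunk (substComp x v m)

substComp :: String -> Val -> Comp -> Comp
substComp x v (CLam y a m)
  | x == y    = CLam y a m
  | otherwise = CLam y a (substComp x v m)
substComp x v (CApp m w)  = CApp (substComp x v m) (substVal x v w)
substComp x v (CReturn w) = CReturn (substVal x v w)
substComp x v (CLetVal y m n)
  | x == y    = CLetVal y (substComp x v m) n
  | otherwise = CLetVal y (substComp x v m) (substComp x v n)
substComp x v (CForce w)  = CForce (substVal x v w)

-- e.g. with twice from above and a thunked identity:
-- runCK (CApp (CApp twice (VThunk (CLam "a" Unit (CReturn (VVar "a"))))) VUnit)
-- ends in the configuration (CReturn VUnit, []).
```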

A preview of the continuation

We started from a natural deduction calculus, which is used for formal logical reasoning, and ended up with an abstract machine. Unlike rewriting, we have made a step towards a more mechanical, low-level way of normalizing expressions. The fact that the CK machine still uses substitution makes it look like we are playing a formal game of symbol manipulation, but we could continue from here towards a CEK machine, which replaces substitution with environments.

On the logical side, since we did not discuss products and sums, we are missing conjunction and disjunction. We did not discuss polarization much. We did not explore classical reasoning, negation, or the sequent calculus.

We did not talk about effects yet. CBPV gives us a handle on computational effects, similar to but distinct from monads. It should be obvious how a lean way of specifying evaluation order helps with describing effects.

This is a good time to pause and reflect, before we go to the next round and shed light on some of these topics. In the meantime, here are some pointers to learn more: