CBPV and Natural Deduction - Part 1. Small steps

2023-07-22

I have been reading up on call-by-push-value (CBPV),
a variation of $\lambda$-calculus that is more
fine-grained in its treatment of argument-passing.
I then came across a chapter
from Frank Pfenning's 2016 lecture notes on substructual logic that
characterizes CBPV as *polarized natural deduction.* In a small series
of posts, I am exploring what this means and how this relates to
compilers.

The Curry-Howard-Lambek correspondence is a nice way to connect logic and computation. I have written about this in previous posts on my old blog, see Higher-order logic and equality and Intuitionistic Propositional Logic and Natural Deduction. For the first part, those posts should be enough to tag along. In the next parts, we may look at some fine points of natural deduction.

In $\lambda$-calculus, our elementary computation step is a $\beta$-reduction. Taking the perspective of the Curry-Howard-Lambek correspondence, we can regard a type as a logical proposition and a term of typed $\lambda$-calculus as a proof of this proposition. Here, $\beta$-reduction is a rewriting of the proof, which removes a 'detour' in the proof. In other words, computation is proof normalization.

$$ ( \lambda x : A . M)\ N \longrightarrow_β M [ x := N ] $$

Here, $[ x := N ]$ is our way of writing substitution. The full definition requires the usual careful treatment of bound names. The rewriting can take place anywhere in the term.

Suppose we wanted to specify the *order* of computation steps. When there are not one, but several reducible expressions,
which reduction should happen first?

A small-step semantics would specify not only reduction, but also where exactly evaluation takes place.
To this end, one defines *evaluation context*, a term with a hole that specifies exactly
where in the term reduction is permitted to happen. Here is a sample definition, where we consider
$\lambda$-abstractions as values (we do not reduce under a $\lambda$.)

$$ \begin{array}{lll} E &::= &x\ |\ \lambda x. E\ |\ E\ E \\ C &::= &[~]\ |\ E\ C\ |\ C\ V \\ V &::= &\lambda x. E \end{array} $$

When we have a context, we can fill its hole with a $\lambda$-expression and now exactly where evaluation takes place. The hole of a context could also be filled with another context.

The definition of evaluation contexts forces the argument (operand) in an application to be evaluated before the operator. The example above is thus a form of call-by-value. Note that the definition of $C$ comes with a somewhat arbitrary decision to evaluate the operand before the operator. For call-by-name, passing a term that is not value would require a context like $[~]\ E$.

Our specification of evaluation contexts seems to impose order by defining what form of argument-passing is possible; in other words, constraining (or leaving unconstrained) what can be bound to an identifier.

Instead of defining evaluation contexts separately, we now look at a different way of impose order on evaluation. In doing so, we start treating contexts as computation (filling the hole with a value is something that can produce a new value). We will see that what we are about to do is not very different from specifying an intermediate representation (IR) of a compiler.

CBPV is a calculus that encompasses both call-by-value and call-by-name.
It achieves this through a fine-grained distinction between terms that are
*values* vs terms that are *computations* which is enfored by a
type disciple. Therefore we will have value types
$A^+$ and computation types $B^-$:

$$ \begin{array}{lll} B^- &::= & A^+ \rightarrow B^-\ |\ \uparrow{A^+} \\ A^+ &::= &\mathbf{1}\ |\ \downarrow{B^-} \end{array} $$

The type operators $\uparrow{}$ and $\downarrow{}$ are described below. We add $\mathbf{1}$ (the "unit type") as a base type, with only inhabitant $\mathtt{()}$ that we pronounce as "unit". Base types like $\mathtt{Int}$ or $\mathtt{String}$ would also be value types. All variables have value type.

$$ \begin{prooftree} \AxiomC{} \RightLabel{$(1_I)$} \UnaryInfC{$\cdot \vdash (): \mathbf{1}$} \end{prooftree} \quad \quad \begin{prooftree} \AxiomC{$x : A^+ \in \Gamma$} \RightLabel{(hyp)} \UnaryInfC{$\Gamma \vdash x: A^+$} \end{prooftree} $$

The arrow type forces arguments to be of value type and the result to be of computation type. What matters most is the interplay of $\lambda$-abstraction and application. Here are the typing rules for these:

$$ \begin{prooftree} \AxiomC{$\Gamma, x: A^+ \vdash M: B^-$} \RightLabel{$(\to_I)$} \UnaryInfC{$\Gamma \vdash (\lambda x: A^+. M):\, A^+ \rightarrow B^-$} \end{prooftree} \quad \quad \begin{prooftree} \AxiomC{$\Gamma \vdash M:\, A^+ \rightarrow B^-$} \AxiomC{$\Gamma \vdash V:\, A^+$} \RightLabel{$(\to_E)$} \BinaryInfC{$\Gamma \vdash (M\ V):\, B^-$} \end{prooftree} $$

In words, a $\lambda$-expression has an arrow type which is a computation (negative) type. We can chain abstractions like $\lambda x:X^+. \lambda y:Y^+. M$ for some term $M$ of computation type, but the argument types $X^+, Y^+$ are forced to be be value (positive) types. Application yields a computation type. This is where it may be useful to remember that filling the hole of a context yields something that we can turn into a value (but it is not a value yet).

You may have noticed the $I$ and $E$ letters in the rule names. In natural deduction, every logical connective comes with an introduction and elimination rule. Even though we write these like sequents of sequent calculus, there are a few differences. In sequent calculus, there are left- and right-(introduction)-rules, and a computation step corresponds to the removal of detours (lemmas) via cut-elimination.

As any type discipline, the above rules make sure that certain programs cannot be written anymore. How can we return a value, though? Or write an identity computation? Or pass a function as an argument to another function? Our calculus is not yet complete.

First, we need a way to turn a value into a computation. More precisely, we want to turn a term $V: A^+$ of value (positive) type into a term of computation (negative) type.

Let us call this operation $\mathtt{return}\ p$. This is a "shift" between value types and computation types and is made explicit using a type operator $\uparrow\!{}A^+$. The notation requires some decoding work since $\uparrow\!{}A^+$ is a computation (negative) type.

$$ \begin{prooftree} \AxiomC{$\Gamma \vdash V: A^+$} \RightLabel{$(\uparrow_I)$} \UnaryInfC{$\Gamma \vdash \mathtt{return}\ V:\, \uparrow\!{}A^+$} \end{prooftree} \quad \quad \begin{prooftree} \AxiomC{$\Gamma \vdash s:\, \uparrow\!{}A^+$} \AxiomC{$\Gamma, x: A^+ \vdash M:\, B^-$} \RightLabel{$(\uparrow_E)$} \BinaryInfC{$\Gamma \vdash \mathtt{let\ val}\ x = M\ \mathtt{in}\ N:\, B^-$} \end{prooftree} $$

The corresponding elimination operation takes a suspended computation and yields a value. We use a "$\mathtt{let\ val}$" declaration as source syntax. Note how the type discipline forces the righthand-side to be a suspended computation and how this imposes an order - we need to have the value before continuing. In turning the the right-hand side into a value, we not only know where the actual computation happens; we also bind the result to a local name.

This amounts to all intermediary results being named.

Next, we want to "package" a computation
into a value (*suspend* the computation). This will let us pass a $\lambda$-abstraction
as an argument to another $\lambda$-abstraction. We introduce an operator
$\mathtt{thunk}\ t$ that suspends a computation and an operator
$\mathtt{force}\ s$ that resumes a suspended computation.

$$ \begin{prooftree} \AxiomC{$\Gamma \vdash M: B^-$} \RightLabel{$(\downarrow_I)$} \UnaryInfC{$\Gamma \vdash \mathtt{thunk}\ M:\, \downarrow\!{}B^-$} \end{prooftree} \quad \quad \begin{prooftree} \AxiomC{$\Gamma \vdash V:\, \downarrow\!{}B^-$} \RightLabel{$(\downarrow_E)$} \UnaryInfC{$\Gamma \vdash \mathtt{force}\ V:\, B^-$} \end{prooftree} $$

Again, the shift operator indicates in the type that we have a suspended computation which is a value. Thus, $\downarrow$ shifts a negative to a positive type, and $\downarrow\!A^-$ can be used in all places that require a value type.

Let's look at examples, starting with the identity combinator of plain $\lambda$-calculus. We want one of these at every type $I_A := \lambda x: A. x$ of type $A \rightarrow A$. In the above polarized $\lambda$-calculus, we get something close enough:

- for value types, there is: $\mathit{idval}_{A^+} := \lambda x: A^+. \mathtt{return}\,x$
- for computation types, we have: $\mathit{idcmp}_{B^-} := \lambda x:\,\downarrow{B^-}. \mathtt{force}\ x$

(Exercise: what are the types of these? What kind of "optimization" would remove these?)

Next, let us try a combinator $\mathit{twice}_A$ of type $(A \rightarrow A) \rightarrow A \rightarrow A$. It takes a function $f$ and and argument $x$ and applies $f$ to $x$ twice. In our polarized $\lambda$ calculus, we need to use the other shift operation and end up with:

$$ \begin{array}{l} \lambda f:\,\downarrow(A^+ \rightarrow\, \uparrow\!A^+). \lambda x:A^+. \\ \mathtt{let\ val}\ y\ =\ (\mathtt{force}\ f)\ x\ \mathtt{in} \\ \mathtt{return}\ (\mathtt{force}\ f)\ y \end{array} $$

Note how the intermediary result has a name $y$. (Exercise: what is the type of this program? If we used an alternative term which didn't use $\mathbf{return}$, what changes?)

The logical reading of these typing rules is that we have set up a particular kind of natural deduction calculus. A term that has a type derivation is a proof, and an introduction followed by an elimitation is clearly a "detour". These detours can be removed and these proof normalization steps correspond to computation steps.

At this point, we can write out local reductions:

$$\begin{array}{ll} \mathtt{let\ val}\ x\ =\ \mathtt{return}\ V\ \mathtt{in} \mathtt{M} &\rightsquigarrow M[x := V] \\ \mathtt{thunk}\ (\mathtt{force}\ M) &\rightsquigarrow M \end{array} $$

This is not what we were after though. Instead, we can define an *abstract machine*
that specifies exactly which rewriting steps to take when. What is special about
an abstract machine is that unlike an interpreter that recursively traverses
an expression, the machine always operates at a bounded depth from the top.

What follows are transitions rules of a CK machine. Here C stands for control and K is a stack of contexts. The source level $\mathbf{let\ val}\ x\ = \_ \ \mathbf{in}\ M$ expression is shortened to $(\_ \ \mathtt{to}\ x. M)$, and an application where we are waiting for the operator to be evaluated is written $(\_\ V)$. This gives a simple operational semantics, although a CK machine is still a rather high-level description since we need to appeal to substitution in the definition.

$$ \begin{array}{llll} C & K & \rightsquigarrow & C' & K' \\ \mathtt{let\ val}\ x = M\ \mathtt{in}\ N & k & & M & (\_ \ \mathtt{to}\ x. N) :: k \\ \mathtt{return}\ V & (\_\ \mathtt{to}\ x. M) :: k & & M[x := V] & k \\ \mathtt{force} (\mathtt{thunk}\ M) & k & & M & k \\ M\ V & k & & M & (\_ \ V) :: k \\ \lambda x. M & (\_ \ V) :: k & & M[x := V] & k \end{array} $$

There is a simple idea behind all this which is worth restating: we statically (through the type system) know that every application $(M\ V)$ comes with an operand that is a value. So:

- whenever we evaluate an application we start by pushing a value (the operand)
- when we are done with evaluating the operator and obtain a $\lambda$-term, we can pop a value and continue

More precisely, what we push and pop is an application (evaluation context) with hole in the operator place and an value as operand. Even though this looks like call-by-value, this subsumes call-by-name because a suspended computation can be treated as a value. The stack is a list of nested contexts. In a sense, it is dual to an expression; this can be made precise but we won't do this now.

We started from a natural deduction calculus, which is used for formal logical reasoning, and ended up with an abstract machine. Unlike rewriting, we have made a step towards a more mechanical, low-level way of normalizing expressions. The fact that the CK machine is still using substitutions makes it look like we are playing a formal game of symbol manipulation, but if we could continue from here towards a CEK machine which replaces substitutions with environments.

On the logical side, since we did not discuss products and sums, we are missing conjunction and disjunction. We did not discuss polarization much. We did not explore classical reasoning, negation, sequent calculus.

We did not talk about effects yet. CBPV gives us a handle on computational effects, similar to monads but different. It should be obvious how a lean way to specifying evaluation order helps with describing effects.

This is a good time to pause and reflect, before we go to the next round and shed light on some of these topics. In the meantime, here are some pointers to learn more:

- Frank Pfenning's 2016 lecture notes on substructual logic has a chapter on CBPV. It ends not with a CK machine, but a specification of operational semantics in an ordered logic formalism.
- CBPV is described in the book "Call-by-Push-Value: A Functional Imperative Synthesis" by Paul Blain Levy. A minor difference is that we spell out the application context with the value here while in Levy's stack only the value gets pushed.
- Matthias Felleisen's lecture notes have a discussion of CK, CEK, CESK machines. The introduction of environments lets us get rid of substitutions and replace them with environment lookups. This is not yet an efficient language implementation, but it closes the gap. There is discussion about treating the environment more like a call stack, including popping unused values. And (for the CESK) there is a discussion on allocating structures.
- Bob Harper's post on polarity in type theory
- Zena M. Ariola, Aaron Bohannon, Amr Sabry. Sequent calculi and abstract machines has a thorough discussion of natural deduction.
- Nick Benton, Gavin Bierman, Valeria de Paiva, Martin Hyland. Term Assignment for Intuitionistic Linear Logic