\documentclass{llncs}
\usepackage{amssymb}
\usepackage{url}
\usepackage{bussproofs}
%\pagestyle{plain}
\newcommand{\derl}{\mathtt{derl}}
\newcommand{\gderl}{\mathtt{gderl}}
\newcommand{\derrc}{\mathtt{derrc}}
\newcommand{\dersl}{\mathtt{dersl}}
\newcommand{\pscrel}{\mathtt{rli}}
\newcommand{\psr}{\pscrel}
\newcommand{\derrec}{\mathtt{derrec}}
\newcommand{\dersrec}{\mathtt{dersrec}}
\newcommand{\prems}{\mathtt{pms}}
\newcommand{\labvdash}[1]{\vdash_{#1}}
\newcommand{\vseq}{\labvdash{}}
\newcommand{\vdashL}{\labvdash{L}}
\newcommand{\vderl}{\labvdash{\derl}}
\newcommand{\vdersl}{\labvdash{\dersl}}
\newcommand{\vpscrel}{\labvdash{\pscrel}}
\newcommand{\ExImp}{\rightarrow}
\newcommand{\ExAnd}{\wedge}
\newcommand{\ExOr}{\vee}
\newcommand{\ExNot}{\neg}
\newcommand{\LabWhtDia}[1]{\langle#1\rangle}
\newcommand{\LabWhtBox}[1]{[#1]}
\newcommand{\WhtDia}{\Diamond}
\newcommand{\WhtBox}{\Box}
\newcommand{\ds}{\displaystyle\strut}
\newcommand{\idrule}[2]{
\mbox{#1 \ $\ds$ #2}
}
\newcommand{\urule}[4]{
\AxiomC{#2}
\LeftLabel{#1}
\RightLabel{#4}
\UnaryInfC{#3}
\DisplayProof
}
\newcommand{\brule}[4]{
\AxiomC{#2}
\AxiomC{#3}
\LeftLabel{#1}
\BinaryInfC{#4}
\DisplayProof
}
\newcommand\invrule[3]{
\AxiomC{#2}
\LeftLabel{#1}
\doubleLine
\UnaryInfC{#3}
\DisplayProof
}
\newcommand\doubleinvrule[4]{
\AxiomC{#2}
\LeftLabel{#1}
\doubleLine
\UnaryInfC{#3}
\doubleLine
\UnaryInfC{#4}
\DisplayProof
}
\newcommand{\NewRule}[4]{
\newcommand{#1}{#2}
\newcommand{#3}{#4}
}
%% \NewRule{\RuleLabelAsMacro} %%% Rule Name
%% {(RuleLabelAsString)} %%% Test of Rule Name
%% {\RuleName} %%% Rule Identifier
%% {\invrule{\RuleLabelAsMacro} %%% Rule
%% {test}
%% {test}
%% }
% Example:
%% \NewRule{\K}{$(\vdash K)$}
%% {\KRule}{
%% \urule{\K}
%% {$\vdash A, \Delta$}
%% {$\vdash \WhtBox A, \WhtDia \Delta, \Pi$}
%% }
\title{Machine Checking Proof Theory: \\
an application of logic to logic}
\author{Rajeev Gor\'e}
\date{ICLA 2009: Indian Conference on Logic and Applications}
\institute{Computer Sciences Laboratory
\\ The Australian National University
\\ Canberra
\\ \url{rajeev.gore@anu.edu.au}
}
\begin{document}
\maketitle
\begin{abstract}
Modern proof-assistants are now mature enough to formalise many
aspects of mathematics. I outline some work we have done using
the proof-assistant Isabelle to machine-check aspects of proof
theory in general, and specifically the proof theory of provability
logic GL.
\end{abstract}
\section{Motivation}
Proof theory, broadly construed, is the study of derivations as
first-class objects. Typically, we study a proof-calculus which
captures the notion that a particular formula $A$ is deducible from a
given finite collection $\Gamma$ of assumption formulae in some given
logic $L$: usually written as $\Gamma \vdash_L A$. Typical such
calculi are Gentzen's Sequent Calculi, Natural Deduction Calculi or
Hilbert Calculi.
But proof theory is error prone. There are numerous examples of
published ``proofs'' in proof theory which have turned out to be
incorrect at a later date. These errors often lie undiscovered for
years, usually until some diligent PhD student actually tries to work
through the proofs in detail and discovers a bug. I give a concrete
example later.
Part of the problem is that conferences and journals typically enforce
page limits, so that authors are forced to elide full details. Another
cause is that proof-theoretical proofs typically contain many similar
cases, and humans are notoriously bad at carrying out repetitive tasks
with precision. Thus authors often resort to words like ``the other
cases are similar''. But sometimes the errors are very subtle, and are
not just a matter of routine checking.
Proof-theoretic proofs often proceed by induction since derivations
are usually structured objects like lists, trees or graphs.
Proof assistants are computer programs which allow a user to encode
and check proofs written using a special syntax and interface. They
have a long history going back to the early 1970s, are usually based
upon an encoding of higher-order logic into some extension of Church's
typed $\lambda$-calculus, and are now an exciting and mature area of
research. Indeed, there is now a strong movement to ``formalise
mathematics'' using computers as exemplified by Tom Hales' project to
formally verify his ``proof'' of the Kepler
Conjecture~\url{http://code.google.com/p/flyspeck/}.
Most modern proof-assistants allow us to define infinite sets using
inductive definitions. Most also automatically generate powerful
induction principles for proving arbitrary properties of such sets.
Given that proof-theory is error-prone and that it typically utilises
proofs by induction, can we use modern proof-assistants to help us
machine-check proof theory?
\paragraph{An Oath:}
I have only a limited amount of time and I need to cover a lot of
background material. I also want to show you some actual code that we
have developed, but I wish to simplify it to hide unimportant details.
So here is an oath: I will tell the truth, I may not tell the whole
truth, but I won't lie. So complain immediately if you see something
blatantly incorrect!
\paragraph{Acknowledgements:}
Most of the work reported here has been done in collaboration with
Jeremy E Dawson. He is a perfect example of a serious mathematician
who checks all of his proofs in Isabelle ... just to be sure.
\section{Proof Theory: Purely Syntactic Calculi for $L$-Deduction}
To begin with the basics, I just want to talk briefly about the
proof-calculi we typically study.
We typically work with judgements of the form $\Gamma \vdash_{L}
\Delta$ where $\Gamma$ and $\Delta$ are ``collections'' of formulae.
I deliberately want to leave vague the exact definition of
``collection'' for now: think of it as some sort of data-structure for
storing information.
From these judgements, we usually define rules, and form a calculus by
assembling a finite collection of such rules.
A rule typically has a rule name, a (finite) number of premises,
a side-condition and a conclusion as shown below:
\begin{prooftree}
\AxiomC{$\Gamma_1 \vdash_{L} \Delta_1
\qquad\cdots\qquad
\Gamma_n \vdash_{L} \Delta_n$}
\LeftLabel{RuleName}
\RightLabel{Condition}
\UnaryInfC{$\Gamma_0 \vdash_{L} \Delta_0$}
\end{prooftree}
We read the rules top-down as statements of the form ``if premises
hold then conclusion holds'', again deliberately using the imprecision
of ``holds'' rather than something more exact.
A derivation of the judgement $\Gamma \vdash_{L} \Delta$ is typically
a finite tree of judgements with root $\Gamma \vdash_{L} \Delta$ where
parents are obtained from children by ``applying a rule''. From now
on, I will usually omit $L$ to reduce clutter.
\begin{figure}[t]
\centering
\begin{tabular}{llr}
Calculus
& \multicolumn{1}{c}{Example Rule}
& \multicolumn{1}{c}{Collection}
\\
\\ LK
& \brule{$(\ExImp L)$}
{$\Gamma, B \vdash \Delta$}
{$\Gamma \vdash A, \Delta$}
{$\Gamma, A \ExImp B \vdash \Delta$}
& sets of formulae
\\
\\ LJ
& \brule{$(\ExImp L)$}
{$\Gamma, A \ExImp B \vdash A $}
{$\Gamma, B \vdash C$}
{$\Gamma, A \ExImp B \vdash C$}
& multisets + SOR
\\
\\ ND
& \brule{$(\{.\}_{.} I)$}
{$\Gamma \vdash K$}
{$\Gamma, K \vdash M $}
{$\Gamma \vdash \{M\}_{K}$}
& multisets + SOR
\\
\\ NL
& \brule{$(\setminus L)$}
{$\Delta \vdash A$}
{$\Gamma[B] \vdash C $}
{$\Gamma[(\Delta, A \setminus B)] \vdash C$}
& trees with holes
\\
\\ DL
& \brule{$(\ExImp L)$}
{$X \vdash A$}
{$B \vdash Y $}
{$A \ExImp B \vdash (\ast X) \circ Y$}
& complex trees
\end{tabular}
\caption{Example of Rules of Some Existing Calculi}
\label{fig:example-calculi}
\end{figure}
Figure~\ref{fig:example-calculi} shows some typical rules from the
literature:
\begin{description}
\item[ Gentzen's LK:] in some formulations uses rules built from
multisets, but it can also be easily turned into a calculus which
uses sets. LK has a particularly pleasing property in that in all
its rules, the components of the premises like $A$ and $B$ are
subformulae of the components of the conclusion like $A \ExImp B$;
\item[Gentzen's LJ:] uses sequents built from multisets, with an added
condition that the right hand side must consist of at most one
formula. The particular formulation shown also carries its principal
formula $A \ExImp B$ from the conclusion into one of its premises,
which can be used to show that the contraction rule is redundant;
\item[ND for Security Protocols:] Natural deduction calculi have been
used to reason about security protocols. This rule captures the idea
that the ability to decode the key $K$ and the ability to decode a
message $M$ using that key, allows us to decode $M$ even when it is
encrypted with key $K$;
\item[NL:] The non-associative Lambek calculus uses sequents built
from trees containing ``holes''. The rule allows us to replace the
formula $B$ in a hole inside the tree $\Gamma$, with a more complex
subtree built from the tree $\Delta$ and the formula $A \setminus B$;
\item[Display Logic:] Belnap's display logic uses sequents built from
complex structural connectives like $\ast$ and $\circ$ so that its
sequents are akin to complex trees.
\end{description}
Thus there are many different notions of ``sequent''. Our hope is to
encode the proofs about such sequents in a modern proof-assistant,
specifically Isabelle.
\section{Applying A Rule: Example Derivation In Gentzen's LK}
In almost all cases, we build derivations in a top-down manner,
starting from leaves which are instances of $\Gamma, p \vdash p,
\Delta$ by ``applying'' a rule. Rule application typically proceeds
using pattern-matching as exemplified by the following derivation from
Gentzen's LK.
\begin{example}
Here, $ \Gamma , A$ means ``$\Gamma$ multiset-union $\{A\}$''.
\\[1em]
\begin{tabular}[c]{l@{\extracolsep{3cm}}l}
\urule{$(\ExAnd\vdash$)}
{$\Gamma, A, B \vdash \Delta$}
{$\Gamma, A \ExAnd B \vdash \Delta$}
{}
&
\brule{$(\ExImp\vdash$)}
{$\Gamma \vdash A, \Delta$}
{$\Gamma, B \vdash \Delta$}
{$\Gamma, A \ExImp B \vdash \Delta$}
{}
\end{tabular}
\begin{prooftree}
\AxiomC{$p_0 \vdash p_0, q_0$}
\AxiomC{$p_0 , q_0 \vdash q_0$}
\RightLabel{$(\ExImp\vdash)$}
\BinaryInfC{$p_0 , (p_0 \ExImp q_0) \vdash q_0$}
\RightLabel{$(\ExAnd\vdash)$}
\UnaryInfC{$p_0 \ExAnd (p_0 \ExImp q_0) \vdash q_0$}
\end{prooftree}
The first rule instance $(\ExImp\vdash)$ utilises the substitutions
$\Gamma := \{p_0\}$, $A := p_0$, $B := q_0$ and $\Delta := \{q_0\}$.
The second rule instance $(\ExAnd\vdash)$ utilises the substitutions
$\Gamma := \emptyset$, $A := p_0$, $B := p_0 \ExImp q_0$ and $\Delta
:= \{q_0\}$.
\end{example}
The example also illustrates the use of sequent calculi as ``backward
chaining'' decision procedures where we can find a derivation starting
from $p_0 \ExAnd (p_0 \ExImp q_0) \vdash q_0$ and applying the rules
in a systematic way ``backwards'' towards the leaves.
For propositional LK, decidability follows almost immediately by
observing that in every such ``backward'' rule application, at least
one formula disappears from the sequents, and is replaced by strictly
smaller formulae only. Thus, every branch of rule applications must
terminate.
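This ``backward'' search is easy to prototype. The sketch below, in
plain Python rather than any proof-assistant, and not code from the
development described here, decides sequents built from sets of
formulae using the conjunction and implication rules; the function
name \verb!provable! and the tuple encoding of formulae are my own
illustrative choices.

```python
# A toy backward-chaining decision procedure for sequents built from
# *sets* of formulae (an illustrative sketch, not code from the paper).
# Formulae are tuples: ('atom', 'p'), ('and', A, B), ('imp', A, B).

def provable(gamma, delta):
    """Decide Gamma |- Delta by applying LK rules backwards."""
    gamma, delta = frozenset(gamma), frozenset(delta)
    # Leaf: Gamma, p |- p, Delta
    if any(f in delta for f in gamma if f[0] == 'atom'):
        return True
    for f in gamma:                      # (&  |-)
        if f[0] == 'and':
            _, a, b = f
            return provable(gamma - {f} | {a, b}, delta)
    for f in delta:                      # (|-  &)
        if f[0] == 'and':
            _, a, b = f
            rest = delta - {f}
            return provable(gamma, rest | {a}) and provable(gamma, rest | {b})
    for f in gamma:                      # (->  |-)
        if f[0] == 'imp':
            _, a, b = f
            rest = gamma - {f}
            return provable(rest, delta | {a}) and provable(rest | {b}, delta)
    for f in delta:                      # (|-  ->)
        if f[0] == 'imp':
            _, a, b = f
            return provable(gamma | {a}, delta - {f} | {b})
    return False                         # only distinct atoms remain

p, q = ('atom', 'p'), ('atom', 'q')
# The earlier example derivation: p & (p -> q) |- q
print(provable([('and', p, ('imp', p, q))], [q]))  # True
```

Every backward step replaces one formula by strictly smaller ones, so
every branch terminates: this is precisely the decidability argument
just given.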
More generally, we typically find some numerical measure which
strictly decreases when ``reducing'' a conclusion to its premises.
Indeed, the International Conference on Automated Reasoning with
Analytic Tableaux and Related Methods, to be held next in Oslo, is
dedicated to research aimed at automating reasoning in various
non-classical logics using this technique. Some of us have even made a
career out of this endeavour!
Notice that the structure of ``collections'' is significant. For
example, the structural rule of contraction shown below at left using
multisets is well-defined, with a typical instance shown below at
right:
\[
\urule{(Ctr)}{$\Gamma, A, A \vdash \Delta$}
{$\Gamma, A \vdash \Delta$}
{}
\hspace{4cm}
\urule{}{$p_0, p_0 \vdash q_0$}
{$p_0 \vdash q_0$}
{(Ctr)}
\]
Similarly, the following contraction lemma is well-defined:
\begin{center}
If
$\Gamma, A, A \vdash \Delta$
is derivable
then so is
$\Gamma, A \vdash \Delta$.
\end{center}
But neither makes sense in a setting where sequents are built from
sets, since the rule instance shown below at left collapses to the
identity inference shown at right.
\[
\urule{}{$p_0, p_0 \vdash q_0$}
{$p_0 \vdash q_0$}
{(Ctr)}
\hspace{4cm}
\urule{}{$\{p_0 \} \vdash \{ q_0 \}$}
{$\{p_0\} \vdash \{q_0\}$}
{identity}
\]
Similarly, the contraction lemma is meaningless since
$\Gamma \cup \{A\} \cup \{A\} \vdash \Delta$
is the same as
$\Gamma \cup \{A\} \vdash \Delta$.
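The set/multiset distinction is easy to see concretely. In the sketch
below (plain Python, not part of any formalisation discussed here),
multisets are modelled by \verb!collections.Counter!, so contraction
genuinely removes a copy, while the set-based ``instance'' is a no-op.

```python
from collections import Counter

# Multisets distinguish copies, so (Ctr) is a genuine step:
# Gamma, A, A |- Delta  becomes  Gamma, A |- Delta.
lhs_multi = Counter({'p0': 2})              # the multiset {p0, p0}
contracted = lhs_multi - Counter({'p0': 1})
print(sorted(contracted.elements()))        # ['p0']

# With sets, one copy is all there ever was, so the premise and the
# conclusion of (Ctr) coincide and the rule collapses to identity:
lhs_set = frozenset({'p0', 'p0'})
print(lhs_set == frozenset({'p0'}))         # True
```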
Although automated reasoning is an important application of sequent
calculi, most uses of proof theory are meta-theoretic. For example,
proof theory is typically used to answer questions like the following:
\begin{description}
\item[\rm Consistency:]
$\emptyset \vdash_{L} A$ and
$\emptyset \vdash_{L} \ExNot A$
are not both derivable;
% \smallskip
% \item[\rm Decidability:] An algorithm for deciding whether
% $\Gamma \vdash_{L} A$
% holds~?
\item[\rm Disjunction Property:] If $\emptyset \vdash_{Int} A \ExOr
B$ then $\emptyset \vdash_{Int} A$ or $\emptyset \vdash_{Int} B$;
\item[\rm Craig Interpolation:] If $\Gamma \vdash_{L} \Delta$ holds
then so do $\Gamma \vdash_{L} A$ and $A \vdash_{L} \Delta$ for
some formula $A$ with $\mathrm{Vars}(A) \subseteq
\mathrm{Vars}(\Gamma) \cap \mathrm{Vars}(\Delta) $;
\item[\rm Normal Forms:] Is there a (unique) normal form for derivations~?
\item[\rm Curry-Howard:] Do normal derivations correspond to well-typed terms
of some $\lambda$-calculus~?
% \smallskip
% \item[\rm Automated Deduction:] Find derivations efficiently on a computer
\item[\rm Equality:] When are two derivations of
$\Gamma \vdash_{L} A$
equivalent~?
\item[\rm Relative Strengths:] Can every derivation in $\vdash_1$ be
simulated polynomially by a derivation in $\vdash_2$~?
\end{description}
The methods used usually involve reasoning \textbf{about} derivations
rather than finding derivations, as exemplified by the following
lemmas:
\begin{description}
\item[\rm Identity:] The judgement $A \vdash A$
is derivable for all $A$.
\item[\rm Monotonicity:] If
$\Gamma \vdash \Delta$
is derivable then so is
$\Gamma, \Sigma \vdash \Delta$.
\item[\rm Exchange:] If
$\Gamma, A, B \vdash \Delta$
is derivable then so is
$\Gamma, B, A \vdash \Delta$.
\item[\rm Contraction:] If
$\Gamma, A, A \vdash \Delta$
is derivable then so is
$\Gamma, A \vdash \Delta$.
% \item[\rm Permutability:] Any derivation of
% $\Gamma \vdash \Delta$
% with rule
% $\rho_1$
% immediately above rule
% $\rho_2$
% can be turned into another derivation of
% $\Gamma\vdash\Delta$
% with rule
% $\rho_2$
% immediately above rule
% $\rho_1$
\item[\rm Inversion:] If the conclusion of a rule instance is derivable
then so are the corresponding premise instances.
\item[\rm Cut-elimination/-admissibility:]
If $\Gamma \vdash A, \Delta$
is (cut-free) derivable
and $\Gamma, A \vdash \Delta$
is (cut-free) derivable
then so is
$\Gamma \vdash \Delta$, where the cut rule is:
\begin{prooftree}
\AxiomC{$\Gamma \vdash A, \Delta$}
\AxiomC{$\Gamma, A \vdash \Delta$}
\LeftLabel{(cut)}
\BinaryInfC{$\Gamma \vdash \Delta$}
\end{prooftree}
\item[\rm Weak/Strong Normalisation:] Is there an algorithm to transform
a derivation into a normal form by eliminating topmost/nested cuts~?
\item[\rm Cost:] How much bigger is the transformed derivation?
\end{description}
\section{Proof Theory Is Error-Prone: Provability Logic GL}
To illustrate the fact that proof-theory is error-prone, I would like
to describe the history of the cut-elimination theorem for a
propositional modal logic called GL, after G\"odel-L\"ob.
The logic GL has a Hilbert axiomatisation which extends the standard
axiomatisation for modal logic K by adding L\"ob's axiom
$\WhtBox(\WhtBox A \ExImp A) \ExImp \WhtBox A$.
%
It rose to prominence when Solovay showed that $\WhtBox A$
could be interpreted as ``A is provable in Peano
Arithmetic''~\cite{Sol76}.
%
An initial proof-theoretic account was given by Leivant in
1981 when he ``proved'' cut-elimination for a set-based sequent
calculus for GL~\cite{Lei81}.
%
But Valentini in 1983 found a simple counter-example and gave a new
cut-elimination proof~\cite{Val83}.
%
The issue seemed to have been settled, but in 2001, Moen~\cite{Moe01}
claimed that Valentini's transformations don't terminate if the
sequents $\Gamma \vdash \Delta$ are based on multisets. There is of
course no \textit{a priori} reason why a proof based on sets should
not carry over with some modification to a proof based on multisets,
so this set the cat amongst the pigeons.
In response, Negri~\cite{Neg05} in 2005 gave a new cut-elimination
proof using sequents built from labelled formulae $w: A$, which
captures that the traditional formula $A$ is true at the possible
world $w$. But this is not satisfactory as it brings the underlying
(Kripke) semantics of modal logic into the proof theory. Mints in 2005
announced a new proof using traditional methods~\cite{Min06}.
But the question of Moen versus Valentini remained unresolved. Finally,
Gor\'e and Ramanayake~\cite{GR08} in 2008 showed that Moen is
incorrect, and that Valentini's proof using multisets is mostly okay.
Many such examples exist in the literature.
\section{Interactive Proof Assistants}
Interactive proof-assistants are now a mature technology for
``formalising mathematics''. They come in many different flavours as
indicated by the names of some of the most popular ones
{\tt Mizar, HOL, Coq, LEGO, NuPrl, NqThm, Isabelle,
$\lambda$-Prolog, HOL-Lite, LF, ELF, Twelf} $\cdots$
with apologies to those whose favourite proof-assistant I have
omitted.
Most of the modern proof-assistants are implemented using a modern
functional programming language like ML, which was specifically
designed for the implementation of such proof-assistants.
The lowest levels typically implement a typed lambda-calculus with
hooks provided to allow the encoding of further logical notions like
equality of terms on top of this base implementation. The base
implementation is usually very small, comprising a few hundred
lines of code, so that this code can be scrutinised by experts to
ensure its correctness.
Almost all aspects of proof-checking eventually compile down to a
type-checking problem using this small core, so that trust rests on
strong typing and a well-scrutinised small core of (ML) code.
Most proof-assistants also allow the user to create a proof-transcript
which can be cross-checked using other proof-assistants to guarantee
correctness.
I don't want to go into details, but one type of proof-assistant,
called a logical framework, allows the user to manage a proof using
the ``backward chaining'' idea which we saw in use earlier to find
derivations using sequent calculi.
\begin{figure}[t]
\centering
\begin{tabular}[c]{lr}
$[ \beta_1 \ ; \ \beta_2 \ ; \ \cdots \ ; \ \beta_n ] \Longrightarrow \alpha$
& $\beta$
\\[2em]
$\theta = match(\beta, \alpha)$
& $\beta_1\theta \ ; \ \beta_2\theta \ ; \ \cdots
\ ; \ \beta_n\theta$
\end{tabular}
\caption{Backward Chaining in Logical Frameworks}
\label{fig:backward-chaining}
\end{figure}
Figure~\ref{fig:backward-chaining} shows how these logical
frameworks typically work. Thus given some goal $\beta$ and an
inference step which claims that $\alpha$ is implied by $\beta_1$ up
to $\beta_n$, we pattern-match $\alpha$ and $\beta$ to find their most
general unifier $\theta$, and then reduce the original goal $\beta$ to
the $n$ subgoals $\beta_1\theta \cdots \beta_n\theta$.
The pattern matching required is usually (associative-commutative)
higher order unification.
The important point is that the logical framework keeps track of
sub-goals and the current proof state.
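A first-order approximation of this match-and-reduce step can be
sketched as follows (illustrative Python; real frameworks such as
Isabelle use higher-order unification, and the function names
\verb!match!, \verb!subst! and \verb!reduce_goal! are my own
hypothetical choices):

```python
# A toy version of the subgoaling step in a logical framework: match a
# goal against a rule's conclusion to obtain a substitution theta, then
# return the rule's premises instantiated by theta.  Terms are nested
# tuples; strings starting with '?' are metavariables.

def match(pattern, term, theta=None):
    theta = dict(theta or {})
    if isinstance(pattern, str) and pattern.startswith('?'):
        if pattern in theta:
            return theta if theta[pattern] == term else None
        theta[pattern] = term
        return theta
    if isinstance(pattern, tuple) and isinstance(term, tuple) \
            and len(pattern) == len(term):
        for p, t in zip(pattern, term):
            theta = match(p, t, theta)
            if theta is None:
                return None
        return theta
    return theta if pattern == term else None

def subst(term, theta):
    if isinstance(term, str):
        return theta.get(term, term)
    return tuple(subst(t, theta) for t in term)

def reduce_goal(goal, rule):
    premises, conclusion = rule
    theta = match(conclusion, goal)
    if theta is None:
        return None                      # rule does not apply
    return [subst(p, theta) for p in premises]

# (|- &): from  G |- A  and  G |- B  conclude  G |- A & B
rule = ([('seq', '?G', '?A'), ('seq', '?G', '?B')],
        ('seq', '?G', ('and', '?A', '?B')))
goal = ('seq', 'Gamma', ('and', 'p', 'q'))
print(reduce_goal(goal, rule))
# [('seq', 'Gamma', 'p'), ('seq', 'Gamma', 'q')]
```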
The syntax of the ``basic propositions'' like $\alpha$, $\beta$ is
defined via an ``object logic'', which is a parameter. Different
``object logics'' can be invoked using the same logical-framework for
the task at hand.
The logical properties of ``$;$'' like associativity or commutativity,
and properties of the ``$\Longrightarrow$'' like classicality or
linearity are determined by the ``meta-logic'', which is usually fixed
for the logical framework in question.
For example, the meta-logic of Isabelle is higher-order intuitionistic
logic. Higher order simply means that functions can accept other
functions as arguments and can produce functions as results.
\section{Isabelle's LK Object Logic: a shallow embedding of sequent calculus}
We begin with what is called a ``shallow embedding'' of sequents. The
meaning of this term will become apparent as we proceed.
The ``propositions'' of Isabelle's sequent object logic are sequents
built from sequences of formulae as defined in the grammar below:
\begin{center}
\begin{math}
\begin{array}{lll}
prop & = & sequence \ \verb!|-! \ sequence
\\
sequence & = & elem \ (, elem )^* \ | \ empty
\\
elem & = & \mathtt{\$} id
\ | \ \mathtt{\$} var
\ | \ formula
\\
formula & = & \mathtt{\sim} \ formula
\ | \ formula \ \mathtt{\&} \ formula
\ | \ \cdots
\end{array}
\end{math}
\end{center}
Thus sequents are built from ``collections'' which are sequences of
formulae. A sequent rule built from premise sequents $\beta_1 , \cdots ,
\beta_n$ with conclusion sequent $\alpha$ is encoded directly as the
meta-logical expression:
\[
[ \beta_1 \ ; \cdots \ ; \ \beta_n ] \Longrightarrow \alpha
\]
\begin{example}
For example, the (cut) rule shown below is encoded as the
meta-logical expression shown below it:
\begin{tabular}[c]{c}
\\[1em]
\brule{(cut)}{$\Gamma \vdash \Delta, P \ \ $}
{$\ \ \ \ \Gamma, P \vdash \Delta$}
{$\Gamma \vdash \Delta$}
\\[2em]
% $\mathtt{[|\ \ \$G \ |- \ \$D, P
% \ \ ;\ \
% \$G, P \ |- \ \$D \ \ |]
% ==> \$G \ |- \ \$D}$
\verb! [| $G |- $D,P ; $G,P |- $D |] ==> $G |- $D!
\end{tabular}
\end{example}
Thus we encode the horizontal bar separating the premises from the
conclusion directly using the meta-logical implication
$\Longrightarrow$.
The advantage is that we can immediately create and check derivations
using the proof assistant to manage the backward chaining involved.
That is, we use the proof-assistant to find derivations by applying
the rules in a backward way.
There is thus a perfect match between the backward chaining involved
in finding derivations and the backward chaining involved in the
subgoaling provided by the proof-assistant.
The disadvantage is that there is no explicit encoding of a
derivation. The derivation is kept implicitly by the proof-assistant
and we cannot manipulate its structure. Nor is it possible to encode
statements like the identity lemma: the sequent $A \vdash A$ is
derivable for all formulae $A$. It is possible to show that particular
instances of this sequent, like $P \& Q \vdash P \& Q$, are derivable,
but we cannot actually encode the inductive nature of the proof, which
would require us to show that the lemma holds when $A$ is atomic, and
that an inductive step takes us from formulae of length $n$ to formulae
of length $n+1$. In particular, there is no way to state the final step
of the induction, which concludes that the lemma holds for all finite
formulae.
\section{A Deeper Embedding: Change Object Logic}
Recall that the meta-logic provides us with a method for backward
chaining via expressions of the form:
$$[\ \beta_1 \ ; \cdots \ ; \ \beta_n \ ] \Longrightarrow \alpha$$
The usual method for obtaining the power for reasoning about sequent
derivations is to use the full power of higher-order classical logic
(HOL) to build the basic propositions $\beta_i$.
Isabelle's incarnation of HOL provides the usual connectives of logic
like conjunction, disjunction, implication, negation and the higher
order quantifiers. But it also provides many powerful facilities
allowing us to define new types, define functions which accept and
return other functions as arguments, and even define infinite sets
using inductive definitions.
For example, the following HOL expressions capture the usual inductive
definition of the natural numbers by encoding the facts that ``zero is
a natural number, and if $n$ is a natural number then so is its
successor $s(n)$'':
\begin{center}
\begin{tabular}[c]{l}
$\mathtt{0 \in nat}$
\\
$\mathtt{n \in nat \Longrightarrow s(n) \in nat}$
\end{tabular}
\end{center}
Most proof-assistants will automatically generate an induction
principle from a given inductive definition. For example, Isabelle
will automatically generate the usual induction principle which states
that we can prove a property $P$ holds of all natural numbers if we
can show that $P(0)$ holds and we can show that $P(n)$ implies
$P(s(n))$. An implicit assumption which facilitates such induction
principles is that the clauses of an inductive definition are the only
ways to construct its members. Thus, if $n$ is a natural number, then
it is either $0$, or of the form $s(m)$ for some natural number $m$.
\begin{samepage}
To encode sequent calculus into HOL we first encode the grammar for
recognising formulae as below:
\begin{verbatim}
datatype fml = FC string (fml list) (* fml connective *)
| FV string (* fml variable *)
| PP string (* prim prop *)
\end{verbatim}
\end{samepage}
There are three type constructors \verb!FC!, \verb!FV! and \verb!PP!
which encode formula connectives, formula variables, and atomic
formulae (primitive propositions). Each of them takes one string
argument which is simply the string we want to use for that
construction. The formula connective constructor also accepts a list
of formulae, which constitute its subformulae.
For example, \verb!FC "&" [FV "A", PP "q"]! encodes $A\; \& \; q$.
Since we want to encode modal provability logic GL, we require only
the classical connectives, plus two unary modalities
\verb!FC "Box" [.]! for $\WhtBox .$ and \verb!FC "Dia" [.]! for
$\WhtDia .$.
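For readers more comfortable with mainstream languages, the same
datatype can be mirrored as follows (an illustrative Python sketch of
the Isabelle declaration above, not the actual development; the class
names follow the constructors):

```python
# The formula datatype mirrored in Python (an illustrative sketch of
# the Isabelle declaration above, not the actual development).
from dataclasses import dataclass

@dataclass(frozen=True)
class Fml:
    pass

@dataclass(frozen=True)
class FC(Fml):          # formula connective, with its subformulae
    name: str
    subs: tuple

@dataclass(frozen=True)
class FV(Fml):          # formula variable
    name: str

@dataclass(frozen=True)
class PP(Fml):          # primitive proposition
    name: str

conj = FC("&", (FV("A"), PP("q")))   # FC "&" [FV "A", PP "q"], i.e. A & q
boxed = FC("Box", (conj,))           # the GL modality applied to A & q
```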
Isabelle's HOL allows us to form sets and multisets of objects of an
arbitrary type, so the HOL expressions \verb!fml set! and
\verb!fml multiset! capture the types of formula sets and formula
multisets.
Using these types we can build a sequent type using an infix constructor
$\mathtt{\vdash}$ via:
$$\mathtt{datatype ~seq ~=~ ~fml ~multiset~~ \vdash ~~fml ~multiset}$$
Isabelle's HOL allows us to form lists of objects of an arbitrary
but fixed type, so we define the type of a rule as a pair with the
first component being a list of sequent premises and the second
component being the conclusion sequent:
$$\mathtt{datatype ~inf ~=~ (seq ~list ~,~ seq)}$$
Finally, we use the HOL type declaration $\mathtt{\pscrel ~::~inf~set}$ to
declare that $\pscrel$ is a set of inferences, each a pair of the form
\verb!(seq list , seq)!, and inductively define the set
$\pscrel$ by giving a finite collection of rule instances which belong
to this set. For example, the traditional rule $(\vdash\&)$ for
introducing a conjunction into the right hand side of a
sequent, shown below, is given by the encoding below it:
$$
\brule{$(\vdash\&)$}
{$\Gamma \vdash A, \Delta$}
{$\Gamma \vdash B, \Delta$}
{$\Gamma \vdash A \& B, \Delta$}
$$
$$
\mathtt{(\;[\; G \vdash \{A\} + D
\ , \
G \vdash \{B\} + D \;],
\ \ G \vdash \{A \& B\} + D \;\; ) \in \pscrel}
$$
The encoding uses HOL's notation ``\verb!+!'' for multiset union, and
the slightly inaccurate notation \verb!{A}! for the singleton multiset
containing $A$. Thus each element of $\pscrel$ is a pair whose first
component is the list of its premises, and whose second
component is its conclusion.
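To make the schematic nature of such rules concrete, the sketch below
(illustrative Python, with \verb!Counter! standing in for HOL
multisets; the helper \verb!vdash_and_instance! is hypothetical)
produces the $(\vdash\&)$ instance determined by given contexts $G$,
$D$ and formulae $A$, $B$:

```python
from collections import Counter

# One schematic instance of the (|- &) rule, with Counter playing the
# role of a HOL multiset (the helper below is hypothetical, purely for
# illustration).  Sequents are rendered as pairs of sorted tuples.
def vdash_and_instance(G, D, A, B):
    """([ G |- {A}+D , G |- {B}+D ], G |- {A & B}+D)"""
    seq = lambda l, r: (tuple(sorted(l.elements())),
                        tuple(sorted(r.elements())))
    prem1 = seq(G, Counter([A]) + D)
    prem2 = seq(G, Counter([B]) + D)
    concl = seq(G, Counter([f"({A} & {B})"]) + D)
    return ([prem1, prem2], concl)

ps, c = vdash_and_instance(Counter(["p"]), Counter(["q"]), "A", "B")
print(c)   # (('p',), ('(A & B)', 'q'))
```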
% \begin{figure}[t]
% \centering
% \begin{tabular}[c]{ll}
% $c \in \prems \Longrightarrow c \in \derrec~\pscrel~\prems$
% &
% $[\ \ ] \in \dersrec~\pscrel~\prems$
% \\[1em]
% \AxiomC{$p_1 \cdots p_k$}
% \LeftLabel{$\pscrel$}
% \UnaryInfC{$c$}
% \DisplayProof
% &
% \AxiomC{$\{ pms_1 , \cdots , pms_n \}$}
% \dottedLine
% \LeftLabel{$\dersrec~\pscrel$}
% \UnaryInfC{$[\ p_1, \cdots , p_k \ ]$}
% \DisplayProof
% \\[1em]
% \multicolumn{2}{l}{
% $(ps, c) \in \pscrel \ ; \ ps \in \dersrec~\pscrel~\prems
% \Longrightarrow c \in \derrec~\pscrel~\prems$}
% \\[1em]
% \multicolumn{2}{c}{
% \AxiomC{$\{ pms_1, \cdots , pms_n \}$}
% \LeftLabel{$\derrec~\pscrel$}
% \dottedLine
% \UnaryInfC{$c$}
% \DisplayProof}
% \\[2em]
% \AxiomC{$\{ pms_1 , \cdots , pms_n \}$}
% \LeftLabel{$\derrec~\pscrel$}
% \dottedLine
% \UnaryInfC{$c$}
% \DisplayProof
% &
% \AxiomC{$\{ pms_1 , \cdots , pms_n \}$}
% \dottedLine
% \LeftLabel{$\dersrec~\pscrel$}
% \UnaryInfC{$[\ c_1, \cdots , c_m \ ]$}
% \DisplayProof
% \\[1em]
% \multicolumn{2}{l}{
% $c \in \derrec~\pscrel~\prems \ ; \ cs \in \dersrec~\pscrel~\prems$}
% \\
% \multicolumn{2}{c}{
% $\Longrightarrow c\# cs \in \dersrec~\pscrel~\prems$}
% \\[1em]
% \multicolumn{2}{c}{
% \AxiomC{$\{ pms_1 , \cdots , pms_n \}$}
% \dottedLine
% \LeftLabel{$\dersrec~\pscrel$}
% \UnaryInfC{$[\ c, c_1, \cdots , c_m \ ]$}
% \DisplayProof}
% \end{tabular}
% \caption{Inductively Definining Derivable Sequents}
% \label{fig:derrec}
% \end{figure}
We are now in a position to encode the set $\derrec$ of ``recursively
derivable sequents'' given an initial set $\prems$ of premise sequents
and an initial set $\pscrel$ of inference rules. The set $\derrec$
is defined inductively as shown below:
\begin{tabular}[c]{l@{\extracolsep{1cm}}l}
\\
1 & $\mathtt{\derrec ~::~ (seq ~list, ~seq) ~set
~\Rightarrow ~seq ~set
~\Rightarrow ~seq ~set }$
\\
2 & $\mathtt{c \in \prems \Longrightarrow c \in \derrec~\pscrel~\prems}$
\\
3 & $[ \;\; \mathtt{(ps, c) \in \pscrel \ ; }$
\\
4 & $\mathtt{\;\;\;\; \forall ~p . ~p \in (set ~ps)
\Longrightarrow ~p ~\in ~\derrec~\pscrel~\prems \;\; ]}$
\\
5 & $\mathtt{\Longrightarrow ~c ~\in ~\derrec~\pscrel~\prems}$
\end{tabular}
\bigskip
The explanation is as follows:
\begin{description}
\item[\rm 1:] A type declaration which tells the proof-assistant that
$\derrec$ accepts a set of inference rules and a set
of sequents, and produces a set of sequents;
\item[\rm 2:] The base case of the inductive definition of $\derrec$
captures that ``each premise is itself (vacuously) derivable
from the premises using the rules''. Note that there is an implicit
outermost universal quantifier which is not shown explicitly, but
which binds free variables like $\mathtt{c, ps}, \pscrel, \prems$.
\item[\rm 3:] The first conjunct of the inductive clause
stating that $ps/c$ is a rule instance;
\item[\rm 4:] The second conjunct of the inductive clause which
captures that ``each premise $p$ in the set obtained from sequent
list $ps$ is derivable from the premises $\prems$ using the rules
$\pscrel$''. Here we use a function \verb!set! to convert a list
into the set of its members;
\item[\rm 5:] The ``then'' part of the inductive clause which
concludes that sequent ``$c$ is derivable from $\prems$ using
$\pscrel$''.
\end{description}
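For intuition, when the set of rule instances is finite, the inductive
definition of $\derrec$ can be computed as a least fixpoint. The
following Python sketch is my own illustration, not the Isabelle code:

```python
# derrec computed as a least fixpoint (an illustrative Python sketch;
# it assumes a *finite* set of rule instances, whereas the Isabelle
# definition works for arbitrary inductively defined sets).
# A rule instance is a pair (list of premise sequents, conclusion).

def derrec(rules, prems):
    derivable = set(prems)           # clause 2: every premise is derivable
    changed = True
    while changed:                   # iterate clauses 3-5 to a fixpoint
        changed = False
        for ps, c in rules:
            if c not in derivable and all(p in derivable for p in ps):
                derivable.add(c)
                changed = True
    return derivable

# Sequents as opaque strings, for brevity:
rules = [(['A |- B'], 'A |- B & B'),
         (['A |- B & B', 'X'], 'unreachable')]
print(sorted(derrec(rules, {'A |- B'})))   # ['A |- B', 'A |- B & B']
```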
\section{Inductive Proofs via Automated Inductive Principles}
Induction principles are generated automatically by Isabelle from the
inductive definition of $\derrec$. A heavily simplified
version for proving an arbitrary property $\mathtt{P}$ is shown below:
\bigskip
\begin{tabular}[c]{l@{\extracolsep{1cm}}l}
1 & $\mathtt{\forall x. \forall P.}$
\\
2 & $[ \mathtt{\;\;\; x~ \in ~\derrec ~\pscrel ~\prems ~;}$
\\
3 & $\mathtt{\;\;\; \forall c . c \in ~\prems \Longrightarrow ~P(c) ~;}$
\\
4 & $\mathtt{\;\;\; \forall c . \forall ps.
[\;\; (ps, c) \in ~\pscrel \;\; ;
~\forall y \in (set ~ps). ~P(y) \;\;]
\Longrightarrow ~P(c)}$
\\
5 & $] \;\;\; \mathtt{ \Longrightarrow ~P(x)}$
\end{tabular}
\bigskip
An explanation is:
\begin{description}
\item[\rm 1:] for all sequents $x$ and all properties $P$:
\item[\rm 2:] if $x$ is derivable from premises $\prems$ using rules
$\pscrel$, and
\item[\rm 3:] $P$ holds for every premise $c$ in $\prems$, and
\item[\rm 4:] for each rule, $P$ holding of all its premises implies
$P$ of its conclusion,
\item[\rm 5:] then $P$ holds of $x$.
\end{description}
If you look closely, you will see that this is an induction principle
which we use often in proof-theory: prove that some property holds of
the leaves of a derivation, and prove that the property is preserved
from the premises to the conclusion of each rule. For example,
consider the standard translation from sequents of LK to formulae
given by $\tau(A_1, \cdots , A_n \vdash B_1, \cdots , B_m) = A_1 \ExAnd
\cdots \ExAnd A_n \ExImp B_1 \ExOr \cdots \ExOr B_m$. We typically use
this translation to argue that all derivable sequents are valid in the
semantics of first-order logic. The proof proceeds by showing that the
translations of the leaves of a derivation are all valid, and that,
for every rule, if the translations of the premises are valid then the
translation of the conclusion is valid.
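The translation itself is easy to prototype. Below is a small Python sketch (my own encoding, not the paper's: formulas are plain strings, and \verb!conj!, \verb!disj! and \verb!tau! are invented helper names) that builds $\tau$ of a sequent given as two lists of formulas.

```python
# Sketch of tau(A1,...,An |- B1,...,Bm) = A1 /\ ... /\ An -> B1 \/ ... \/ Bm.
# Formulas are plain strings; empty sides get the usual units True and False.

def conj(forms):
    """Fold a list of formulas into a conjunction; the empty list gives True."""
    if not forms:
        return "True"
    out = forms[0]
    for f in forms[1:]:
        out = f"({out} /\\ {f})"
    return out

def disj(forms):
    """Fold a list of formulas into a disjunction; the empty list gives False."""
    if not forms:
        return "False"
    out = forms[0]
    for f in forms[1:]:
        out = f"({out} \\/ {f})"
    return out

def tau(antecedent, succedent):
    """tau(Gamma |- Delta) = /\\ Gamma -> \\/ Delta."""
    return f"{conj(antecedent)} -> {disj(succedent)}"
```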
Using these induction principles we proved the following lemma about
derivability using Isabelle, where the question marks indicate
free variables which are implicitly universally quantified:
\begin{lemma}
$$\mathtt{?ps \subseteq \derrec~?\pscrel~?\prems
\ ; \
?c \in \derrec~?\pscrel~?ps
\ \Longrightarrow \
?c \in \derrec~?\pscrel~?\prems
}
$$
\noindent
If each premise in $ps$ is derivable from
premises $\prems$ using rules $\pscrel$,
and $c$ is derivable from $ps$ using $\pscrel$,
then $c$ is derivable from $\prems$ using $\pscrel$.
\end{lemma}
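The content of the lemma can be illustrated on a toy scale. In the Python model below (invented names; derivability computed as a least fixpoint over a finite rule set), \verb!ps! is derivable from \verb!prems!, \verb!c! is derivable from \verb!ps!, and, as the lemma states, \verb!c! is also derivable from \verb!prems! directly.

```python
# Model only: derrec(rules, prems) = the set of sequents derivable from
# prems using rules, where a rule is a (premise_list, conclusion) pair.

def derrec(rules, prems):
    """Least fixpoint of closing prems under the rules."""
    out = set(prems)
    changed = True
    while changed:
        changed = False
        for ps, c in rules:
            if c not in out and all(p in out for p in ps):
                out.add(c)
                changed = True
    return out

# ps is derivable from prems, and c is derivable from ps.
rules = [(["a"], "b"), (["b"], "c")]
prems = {"a"}
ps = {"b"}
```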
\section{An Even Deeper Embedding: Derivation Trees As Objects}
\label{s-dtobj}
The main advantage of the method outlined in the previous section
was that there was no concrete representation of a derivation. That
is, we relied on the proof-assistant to perform pattern-matching and
rule instantiations in an appropriate way, so that all we needed was
to capture the idea that derivations began with premises and ended
with a single sequent.
If we are to reason about cut-elimination, then we are required to
perform transformations on explicit derivations. We therefore need a
representation of such trees inside our encoding.
In previous work~\cite{DG02}, we described such an encoding using the
following datatype:
\begin{verbatim}
datatype seq dertree = Der seq (seq dertree list)
| Unf seq
\end{verbatim}
The declaration states that a derivation tree can either be an
\verb!Unf!inished leaf sequent built using the constructor \verb!Unf!,
or it can be a pair consisting of a conclusion sequent and a list of
sub-derivation-trees bound together using the constructor \verb!Der!.
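The datatype has a direct rendering in any language with algebraic-style data. Here is an illustrative Python version (my own encoding, including the helper \verb!conclDT! which the lemmas below also use; sequents are represented as plain strings):

```python
# A derivation tree is either an Unf(inished) leaf sequent, or a conclusion
# sequent paired with a list of sub-derivation-trees (constructor Der).
from dataclasses import dataclass
from typing import List, Union

@dataclass
class Unf:
    seq: str                   # an unfinished leaf sequent

@dataclass
class Der:
    seq: str                   # the conclusion sequent
    subs: List["Dertree"]      # its sub-derivation-trees

Dertree = Union[Der, Unf]

def conclDT(dt: Dertree) -> str:
    """The root sequent of a derivation tree."""
    return dt.seq
```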
In that work, we described how we maintained substitutions as lists of
pairs of the form $(x, t)$ representing the substitution $x := t$. We
also described how we manipulated substitutions and instantiation
directly to obtain rule instances.
We required such low-level aspects to be made explicit so that we
could reason about display logic, which required us to check conditions
on rules such as ``a rule is closed under substitution of arbitrary
structures for variables''.
Our use of \verb!dertree! can be seen as an even deeper embedding of
proof-theory into Isabelle/HOL since we utilise the proof-assistant
only to maintain the current and remaining goals.
Omitting details now, suppose we define \verb!valid rli dt! to hold
when derivation tree \verb!dt! uses rules from \verb!rli! only and
has no \verb!Unf!inished leaves. We proved:
\begin{lemma}
$$\mathtt{valid~?\pscrel~?dt~\Longrightarrow (conclDT~?dt)
\in \derrec ~?\pscrel ~\{\}}$$
\noindent
If derivation tree $dt$ is valid wrt the rules $\pscrel$ then its
conclusion is derivable from the empty set of premises using $\pscrel$.
\end{lemma}
\begin{lemma}
$$\mathtt{?c \in \derrec ~?\pscrel ~\{\}
\Longrightarrow ~EX ~dt. ~valid ~?\pscrel ~dt
~\& ~conclDT ~dt ~= ~?c}$$
\noindent
If the sequent $c$ is derivable from the empty set of premises using
rules $\pscrel$ then there exists a derivation tree $dt$ which is
valid wrt $\pscrel$ and whose
conclusion is exactly $c$.
\end{lemma}
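The content of both lemmas can be checked on a toy scale. The Python sketch below (tuples instead of a datatype, an invented two-rule system; none of these definitions are the Isabelle ones) defines \verb!valid! and confirms that the conclusion of a valid tree is indeed derivable from the empty premise set.

```python
# Trees are tuples: ("Unf", seq) or ("Der", seq, subtree_list).
# Rules are (premise_list, conclusion) pairs. Illustrative names only.

def conclDT(dt):
    """The root sequent of a derivation tree."""
    return dt[1]

def valid(rules, dt):
    """dt uses rules from `rules` only and has no Unf(inished) leaves."""
    if dt[0] == "Unf":
        return False
    _, c, subs = dt
    return ([conclDT(s) for s in subs], c) in rules and \
           all(valid(rules, s) for s in subs)

def derrec(rules, prems):
    """Sequents derivable from prems using rules (least fixpoint)."""
    out = set(prems)
    changed = True
    while changed:
        changed = False
        for ps, c in rules:
            if c not in out and all(p in out for p in ps):
                out.add(c)
                changed = True
    return out

# A zero-premise axiom and one unary rule, with a valid tree for "goal".
rules = [([], "ax"), (["ax"], "goal")]
dt = ("Der", "goal", [("Der", "ax", [])])
```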
Thus we now know that our ``deep embedding'' of derivability using
$\derrec$ can be faithfully captured using the ``even deeper''
embedding using explicit derivation trees. Indeed, the lemmas allow us
to move freely between the two embeddings at will to omit or include
details as required by the lemma we wish to prove.
\section{Mix Admissibility for Provability Logic} \label{s-gls-mix}
We finally come to the crux of our work. Below is the traditional
formulation of the mix-rule for sequents built from multisets, where
$\Pi_A$ is formed from $\Pi$ by deleting all occurrences of $A$, and
similarly for $\Delta_A$:
\[
\brule{(mix)}{$\Gamma \vdash \Delta$}
{$\Pi \vdash \Sigma$}
{$\Gamma, \Pi_A \vdash \Delta_A, \Sigma$}
\;\; A \in \Delta \;\; \& \;\; A \in \Pi
\]
The rule can be expressed as a lemma, rather than a rule, using the
embeddings we have developed so far, as shown below, where we now
explicitly use the name \verb!glss! for the fixed but inductively
defined set of rule instances for provability logic GL:
\bigskip
\begin{tabular}[c]{l}
$\mathtt{(?G ~\vdash ~?D) \in \derrec ~glss ~\{\}
\;\; ; \;\;
(?P ~\vdash ~?S) \in \derrec ~glss ~\{\}}$
\\
\multicolumn{1}{c}{$\mathtt{\Longrightarrow}$}
\\
$\mathtt{((?G ~+~ (ms\_delete ~\{?A\}~ ?P)
~\vdash
~(ms\_delete ~\{?A\} ~?D) ~+~ ?S))}$
\\
\hfill $\mathtt{ \in \derrec ~glss ~\{ \}}$
\end{tabular}
Here we defined a function \verb!ms_delete! which deletes all
occurrences of the members of its first argument from its second argument.
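For concreteness, here is a Python analogue of \verb!ms_delete! and of the construction of the mix conclusion (multisets are modelled as \verb!collections.Counter!; the names and the example sequents are my own, not the Isabelle definitions):

```python
# Multisets as Counters; ms_delete removes every occurrence of each
# formula in its first argument from its second argument.
from collections import Counter

def ms_delete(forms, ms):
    """Delete all occurrences of the formulas in forms from multiset ms."""
    return Counter({f: n for f, n in ms.items() if f not in forms})

# Gamma |- Delta and Pi |- Sigma, with mix formula A:
G, D = Counter({"B": 1}), Counter({"A": 2, "C": 1})
P, S = Counter({"A": 1, "B": 1}), Counter({"C": 1})

mix_antecedent = G + ms_delete({"A"}, P)   # Gamma, Pi_A
mix_succedent = ms_delete({"A"}, D) + S    # Delta_A, Sigma
```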
Our main result, which we intend to report in a full paper, is that
this lemma can be proved using our embeddings and Isabelle.
\input{cut-not-mix}
\section{Objections and Impediments} \label{s-obj}
A frequent objection to the idea of machine-checking anything is that
the errors could also have been found by a good PhD student working
with pencil and paper. But even diligent PhD students are apt to fall
for errors which lurk behind phrases like ``clearly \ldots'' or ``the
other cases are similar''. The beauty of proof-assistants lies in
their absolutely pedantic insistence that nothing is proved until it
passes through the type-checking procedure of the proof-assistant.
Another objection is that this is not research but merely high-level
programming, since you have to have the proof first. To some extent
this is true, since the current prototypical example is usually the
verification of a given proof from a paper or a book. But many
researchers now build the proof interactively; indeed, better
user-interfaces make this increasingly feasible.
The main impediment, in my opinion, is the sheer effort required to
become familiar with proof-assistants before productive work can be
started. It takes at least three months of full-time work to learn how
to use an interactive proof assistant well. But, as I hope I have shown
you, it is worth it!
\begin{thebibliography}{99}
\bibitem{DG02}
J~E Dawson and R~Gor\'e.
\newblock Formalised Cut Admissibility for Display Logic.
\newblock In {\em Proc.\ 15th International Conference on
Theorem Proving in Higher Order Logics}, LNCS 2410, pp 131--147, 2002.
\bibitem{GR08}
R Gor\'e and R Ramanayake.
\newblock Valentini's Cut Elimination for Provability Logic Resolved.
\newblock In {\em Advances in Modal Logic}, Volume~7, pp 67--86, College
Publications, London, 2008.
\bibitem{Lei81}
D. Leivant.
\newblock On the Proof Theory of the Modal Logic for Arithmetic Provability.
\newblock{\em Journal of Symbolic Logic}, 46:531--538, 1981.
\bibitem{Min06}
G. Mints.
\newblock Cut elimination for provability logic.
\newblock In {\em Collegium Logicum}, 2005.
\bibitem{Moe01}
A. Moen.
\newblock The proposed algorithms for eliminating cuts in the provability
calculus $GLS$ do not terminate.
\newblock{\em NWPT 2001}, Norwegian Computing Center, 2001-12-10.
\url{http://publ.nr.no/3411}
\bibitem{Neg05}
S. Negri.
\newblock Proof Analysis in Modal Logic.
\newblock{\em Journal of Philosophical Logic}, 34:507--544, 2005.
\bibitem{Sol76}
R.M. Solovay.
\newblock Provability Interpretations of Modal Logic.
\newblock{\em Israel Journal of Mathematics}, 25:287--304, 1976.
\bibitem{Val83}
S. Valentini.
\newblock The Modal Logic of Provability: Cut-elimination.
\newblock{\em Journal of Philosophical Logic}, 12:471--476, 1983.
\end{thebibliography}
\end{document}