%%
\documentclass[a4paper,11pt]{article}
%\documentclass[twocolumn,a4paper,11pt]{article}

\usepackage{latexsym}
\usepackage{amssymb}
\usepackage{exptrees}
\usepackage{url}

\renewcommand{\today}{\number\day\ 
  \ifcase\month\or
  January\or February\or March\or April\or May\or June\or
  July\or August\or September\or October\or November\or December\fi
  \ \number\year}

\setcounter{secnumdepth}2
\setcounter{tocdepth}2

\newtheorem{theorem}{Theorem}[section]


\newcommand{\qed}{}

\newenvironment{proof}{
 \noindent{\bf Proof:}}{\mbox{\bf Q.E.D.}\vspace{1ex}}

\newtheorem{definition}{Definition}[section]

\newtheorem{lemma}{Lemma}[section]

\newtheorem{corollary}{Corollary}[section]
    
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Preamble

\newcommand{\mcyl}{\raisebox{0.2ex}{\bf c}}
\newcommand{\mor}{+}
\newcommand{\mand}{\circ}
\newcommand{\mnot}{\sim}
\newcommand{\mimp}{\rightarrow}
\newcommand{\mif}{\leftarrow}
\newcommand{\mtop}{\mbox{\bf 1}}
\newcommand{\mbot}{\mbox{\bf 0}}
\newcommand{\mstar}{\bigstar}

\newcommand{\mcal}[1]{\mbox{$\cal #1$}}

\newcommand{\btop}{\top}
\newcommand{\bbot}{\bot}
\newcommand{\bor}{\vee}
\newcommand{\band}{\wedge}
\newcommand{\bnot}{\neg}
\newcommand{\bimp}{\supset}

\newcommand{\conv}{\smallsmile}
\newcommand{\semic}{\mbox{ ; }}
\newcommand{\comma}{\raisebox{0.5ex}{ , }}
\newcommand{\myE}{ E }
\newcommand{\blob}{\bullet}

\newcommand\cfile{}

\newcommand\comment[1]{} 
\newcommand\cc[1]{#1} % for label names 

\newcommand\invrule[2]{ 
% for use in non-math mode, args interpreted in math mode
  \begin{tabular}{@{}c@{}}
  $ #1 $ \\ \hline \hline $ #2 $
  \end{tabular}}
\newcommand\slrule[2]{ % similar to \frac, except
% for use in non-math mode, args interpreted in math mode
  \begin{tabular}{@{}c@{}}
  $ #1 $ \\ \hline $ #2 $
  \end{tabular}}

\renewcommand\theenumi{(\alph{enumi})}
\renewcommand\labelenumi{\theenumi}
\renewcommand\theenumii{(\roman{enumii})}
\renewcommand\labelenumii{\theenumii}

\newcommand\src[1]{source file \texttt{#1}}
\newcommand\apsrc[1]{Appendix \ref{app:code}}
\newcommand\seesrc[1]{(see source file \texttt{#1})}
\newcommand\insrc[1]{in source file \texttt{#1}}
\newcommand\page[1]{page~\pageref{#1}}
\newcommand\fig[1]{Fig.~\ref{fig:#1}}
\newcommand\ch[1]{Chapter~\ref{ch:#1}}
\newcommand\tbl[1]{Table~\ref{tbl:#1}}
\newcommand\thrm[1]{Theorem~\ref{thrm:#1}}
\newcommand\lem[1]{Lemma~\ref{lem:#1}}
\newcommand\ttl[1]{\tti{#1}\label{#1}}
\newcommand\ttdl[1]{\texttt{#1}\tti{#1}\label{#1}}
\newcommand\ttdef[1]{\texttt{#1}\tti{#1}}
\newcommand\tti[1]{\index{#1@\texttt{#1}}}
\newcommand\bfi[1]{\textbf{#1}\index{#1|textbf}}
\newcommand\dRA{{\boldmath $\delta$}\textbf{RA}}
\newcommand\dKt{{\boldmath $\delta$}\textbf{Kt}}
\newcommand\NRA{\textbf{NRA}}
\newcommand\RA{\textbf{RA}}
%\newcommand\M{$\mathcal M$}
\newcommand\M{\ensuremath{\mathcal M}}
\newcommand\Mp{$\mathcal M'$}
\newcommand\up[1]{\vspace{-#1ex}} 
\newcommand\vpf{\vspace{-3ex}} % to reduce space before proof tree
\newcommand\1{\textbf 1}
\newcommand\0{\textbf 0}
\newcommand\cut{(\emph{cut})}
\newcommand\pt{\texttt{pt}}
\newcommand\ptn{\texttt{ptn}}

\newcommand{\pordered}[3]{#1 <_{\texttt{\scriptsize #2}} #3}
\newcommand{\Revpordered}[3]{#1 >_{\texttt{\scriptsize #2}} #3}
\newcommand{\LRPord}[2]{\pordered{#1}{LRP}{#2}}
\newcommand{\cutord}[2]{\pordered{#1}{cut}{#2}}
\newcommand{\snoneord}[2]{\pordered{#1}{sn1}{#2}}
\newcommand{\dtord}[2]{\pordered{#1}{dt}{#2}}
\newcommand{\Revdtord}[2]{\Revpordered{#1}{dt}{#2}}

\pagestyle{plain}

\begin{document}

\title{A New Machine-checked Proof of Strong Normalisation for Display Logic}

\author{
 Jeremy E.\ Dawson
%\thanks{Supported by an Australian Research Council Large Grant}
 \ \ and \ \
 Rajeev Gor\'e\thanks{Contact Author 
\texttt{            WordCount = delatex cutelimlics.tex | wc  = 6592 }
}
%\thanks{Supported by an Australian Research Council QEII Fellowship}
\\ Department of Computer Science and
   Automated Reasoning Group
\\ Australian National University
\\ Canberra ACT 0200, Australia 
\\ \url{jeremy@arp.anu.edu.au} \ \ \url{rpg@arp.anu.edu.au}
\\ Tel: +61-2-6125-8603 \ \ Fax: +61-2-6125-8651
}
\date{}
%% Title 
\maketitle

\begin{abstract}
  We use a deep embedding of the display calculus for relation
  algebras \dRA\ in the logical framework Isabelle/HOL to formalise a
  new, machine-checked proof of strong normalisation and
  cut-elimination for \dRA\ which does not use measures on the size of
  derivations. Our formalisation generalises easily to other display
  calculi and can serve as a basis for formalised proofs of strong
  normalisation for the classical and intuitionistic versions of a
  vast range of substructural logics like the Lambek calculus, linear
  logic, relevant logic, BCK-logic, and their modal extensions.  As
  far as we are aware, this is the first full formalisation of a
  strong normalisation result using a logical framework.
\end{abstract}

%% Here begins the main text

\section{Introduction}

Sequent calculi provide a rigorous basis for meta-theoretic studies
of various logics. The central theorem is cut-elimination which
states that detours through lemmata can be avoided: it can be used
to show many important logical properties like consistency,
interpolation, and Beth definability. In computer science, cut-free
sequent calculi can be used for automated deduction
\cite{heuerding-phd}, for nonclassical extensions of logic
programming \cite{pym-harland-uniform}, and for studying the deep
connection between cut-elimination, lambda calculi and functional
programming.  Sequent calculi, and their extensions, therefore play
an important role in theoretical computer science.

Display Logic \cite{belnap-display} is a generalised sequent framework
for non-classical logics, based on Gentzen's sequent calculus
\cite{szabo-gentzen}. Since it is not really a logic, we prefer the
term display calculi and use it from now on. Display calculi extend
Gentzen's language of sequents with extra, complex, $n$-ary structural
connectives, in addition to Gentzen's sole structural connective, the
``comma''.  
Properties such as associativity are explicitly
stated as structural rules.
%
Display calculi provide an extremely general sequent framework
encompassing a vast range of substructural logic like the
bi-intuitionistic and bi-classical versions of the Lambek calculus,
linear logic, relevant logic, BCK-logic, and their modal and tense
logical extensions, in a uniform way
\cite{wansing-displaying-modal-logic,gore-sld-igpl}. The most
remarkable property of display calculi is a generic cut-elimination
theorem, which applies whenever the rules for the display calculus
satisfy certain, easily checked, conditions \cite{belnap-display}.
Display calculi therefore provide an important generalisation of
sequent calculi.

Traditional proofs of cut-elimination in sequent calculi do not
eliminate the cut rule directly due to problems in eliminating
applications of contraction above cut. Gentzen \cite{szabo-gentzen}
himself first replaced the cut rule with the mix rule, showed that cut
was derivable from mix, and then eliminated the mix rule.
Borisavljevic \textit{et al}
\cite{borisavljevic-dosen-petric-permuting-cut-with-contraction} have
shown that the direct elimination of the cut rule is by no means
trivial, requiring a further detour through a special normal form for
derivations. The problems caused by contraction can also be avoided by
proving a lemma that contraction itself is an admissible rule, as done
by Pfenning \cite{pfenning-structural-lics94}. We cannot follow this
last strategy since explicit structural rules like contraction are
what give display calculi their generality.

In \cite{belnap-display}, Nuel Belnap gives a direct proof that the
cut rule is \textit{admissible} in display calculi.  In
\cite{wansing-displaying-modal-logic}, Heinrich Wansing gives a direct
proof of strong-normalisation for the cut-elimination procedure for a
particular display calculus.  Wansings's proof of termination uses a
complex measure on the size of derivations.  When we tried to
mechanise Wansing's proof using a logical framework, we were unable to
proceed beyond a point in the proof where Wansing states that ``This
construction generalises easily to the case where $A$ is introduced
\ldots at more than one point \ldots''.

We therefore developed and mechanised a new direct proof of strong
normalisation for the display calculus \dRA\ using four complex
orderings, which we proved to be well-founded. Our orderings do not
depend upon the size of derivations. As far as we are aware, this is
the first full formalisation of a strong normalisation result using a
logical framework. The work required approximately 9 man-months,
approximately 1600 lines of Isabelle/HOL theory files and 8500 lines
of ML proof files, and takes approximately 35 minutes to fully
type-check on a 300MHz Pentium machine.

We are aware of only three other attempts to formalise proof theory,
cut elimination or strong normalisation. 

The first is the work of Pfenning where he formalises the
admissibility of cut for classical, intuitionistic and linear logic
using the logical framework Elf \cite{pfenning-structural-lics94}. 
In previous work
\cite{dawson-gore-embedding-comparing}, we found that the Elf
implementation reports success even when certain transformations,
essential to cut-elimination, are deleted from the encoding! Our
conclusion was that the Elf implementation was not a full
formalisation of cut-elimination. Pfenning essentially says the same
thing in the conclusion of
\cite{pfenning-structural-lics94}:\textit{``Once the structural proof
  of admissibility has been found and implemented, it is natural to
  ask if it can also be encoded in stronger frameworks such as Coq
  [DFH+93] so that structural inductions are made explicit and the
  proof is fully formally verified.''}.  In separate work, we have
answered this question in the affirmative using Isabelle/HOL instead
of Coq.

The second is the work of Matthews \textit{et al}
\cite{matthews-smaill-basin-experience,
  basin-matthews-structuring,matthews-implementing-isabelle} which
implements Feferman's FS$_0$ in Isabelle, and uses it to formalise and
extend various meta-theoretical systems. It should be possible to
formalise cut-elimination in this implementation, but it does not
appear to have been done.

The third is the work of Mikhajlova and von Wright
\cite{mikhajlova-vonwright-isomorphism}, giving a formalisation of a
sequent calculus for first-order logic in the system HOL. This work
makes no attempt to prove a cut-elimination or strong normalisation
result, and as we point out later, actually contains an unfortunate
bug.

The paper is set out as follows.  In
Section~\ref{sec:deep-embedding-dra} we describe the display calculus
\dRA\ for the logic of relation algebras, the logical framework
Isabelle and its subtheory HOL, and our encoding of \dRA\ into
Isabelle/HOL. In Section~\ref{sec:reas-about-deriv} we describe
essential functions for reasoning about derivations. In
Section~\ref{subsec:funct-reas-about} we describe Isabelle/HOL
functions that allow us to reason about cuts, and discuss the problems
we had in formalising Wansing's proof of strong normalisation. In
Section~\ref{s-sn} we describe our new proof of strong normalisation
for the display calculus \dRA.  Section~\ref{sec:concl-furth-work}
presents further work.

A longer version of this paper can be found at
\url{http://arp.anu.edu.au/~rpg/CutElim/} and the actual Isabelle code
at \url{http://discus.anu.edu.au/~jeremy/deep-files/} via your
favourite web browser. Enjoy!

\section{A Deep Embedding of \dRA\ in Isabelle/HOL}
\label{sec:deep-embedding-dra}

The following grammar defines the syntax of relation algebras where
the elements of the top line are Boolean and the elements of the
bottom line are their relational analogues:
\begin{eqnarray*}
   A \ B & ::= &
           p_i 
           \mid \btop 
           \mid \bbot
           \mid \bnot A 
           \mid A \band B 
           \mid A \bor B 
           \mid A \bimp B 
\\
       &     &
           \mid \mtop
           \mid \mbot
           \mid \mnot A
           \mid \conv A 
           \mid A \mor B 
           \mid A \mand B 
           \mid A \mimp B 
\end{eqnarray*}


A display calculus for relation algebras called \dRA\ can be found in
\cite{gore-relalg}. Sequents of \dRA\ are expressions of the form $X
\vdash Y$ where $X$ and $Y$ are built from the nullary structural
constants $E$ and $I$ and formulae, using 
the structural connectives in the grammar below:
\[
   X\ Y ::= A \mid I \mid E \mid \ast X \mid \blob X \mid X \semic Y 
           \mid X \comma Y 
\]
The defining feature of display calculi is that in all logical
introduction rules, the principal formula is always ``displayed'' as
the whole of the right-hand or left-hand side.  For example, the rule
$(\mbox{\textbf{LK}-}\vdash \lor )$, shown below left, is typical of
Gentzen's sequent calculi like \textbf{LK}, while the rule
$(\mbox{\dRA-}\vdash \lor )$ shown on the right is typical of display
calculi; see \cite{gore-relalg} for details:
\[
\frac {\Gamma \vdash \Delta, P \comma Q}
{\Gamma \vdash \Delta, P \lor Q}(\mbox{\textbf{LK}-}\vdash \lor )
\qquad  \qquad
\frac {X \vdash P \comma Q}
{ X \vdash P \lor Q}(\mbox{\dRA-}\vdash \lor )
\]


Isabelle is an automated proof assistant \cite{isabelle-ref}.  Its
meta-logic is an intuitionistic typed higher-order logic, sufficient
to support the built-in inference steps of higher-order unification
and term rewriting.  Isabelle accepts inference rules of the form
``from $\alpha_1, \alpha_2, \ldots, \alpha_n, \mbox{ infer } \beta$''
where the $\alpha_i$ and $\beta$ are expressions of the Isabelle
meta-logic, or are expressions using a new syntax, defined by the
user, for some ``object logic''. Most users build on one of the
comprehensive ``object logics'' already supplied, like Isabelle/HOL
\cite{IsLogHOL}, which is an Isabelle theory based on the higher-order
logic of Church \cite{church-types} and the HOL system of Gordon
\cite{HOL-introduction}.  In \cite{dawson-gore-embedding-comparing},
we describe our initial attempts to formalise display calculi in
various logical frameworks, describe why we selected Isabelle/HOL, and
give an explanation of this deep embedding. We assume the reader is
familiar with ML and logical frameworks in general.

\begin{figure}[t]
  \centering
\begin{verbatim}
datatype formula = Btimes formula formula ("_ && _" [68,68] 67)
                 | Rtimes formula formula ("_ oo _" [68,68] 67)
                 | Bplus formula formula ("_ v _" [64,64] 63)
                 | Rplus formula formula ("_ ++ _" [64,64] 63)
                 | Bneg formula ("--_" [70] 70)
                 | Rneg formula ("_^" [75] 75)
                 | Btrue ("T")
                 | Bfalse("F")
                 | Rtrue ("r1")
                 | Rfalse("r0")
                 | FV string 
                 | PP string 
\end{verbatim}
  \caption{Isabelle/HOL Representation of Formulae of \dRA.}
  \label{fig:isabelle-formulae}
\end{figure}

Formulae of \dRA\ are represented by the datatype 
%declaration
shown in
Figure~\ref{fig:isabelle-formulae}. The constructors \texttt{FV} and
\texttt{SV} represent formula and structure variables which appear in
the statement of a rule or theorem, and which are instantiated to
actual formulae of \dRA\ when constructing derivations.  The
constructor \texttt{PP} represents a primitive proposition $p_i$,
which is not amenable to substitution.  

\begin{figure}[t]
  \centering
\begin{verbatim}
datatype structr = Comma structr structr 
                 | SemiC structr structr 
                 | Star structr  
                 | Blob structr  
                 | I 
                 | E 
                 | Structform formula 
                 | SV string   
\end{verbatim}
  \caption{Isabelle/HOL Representation of Structures of \dRA.}
  \label{fig:isabelle-structures}
\end{figure}

Structures of \dRA\ are represented by the datatype 
%declaration 
shown
in Figure~\ref{fig:isabelle-structures}.  The operator
\texttt{Structform} causes a formula to be ``cast'' into a structure.
We say that a structure expression is \emph{formula-free} if it does
not contain any formula as a sub-structure: that is, if it contains no
occurrences of \texttt{Structform}.  A formula-free sequent is defined
similarly.

The notation in parentheses in Figures~\ref{fig:isabelle-formulae}
describe an alternative infix syntax which is closer to the actual
syntax of \dRA.  For technical reasons, a different method is used to
specify the alternative infix syntax for structures and sequents:
details omitted.  Sequents of \dRA\ are represented by the
Isabelle/HOL datatype below:
\begin{verbatim}
datatype sequent = Sequent structr structr 

datatype rule = Rule (sequent list) sequent 
              | Bidi sequent sequent
              | InvBidi sequent sequent 
\end{verbatim}
The premisses of a rule are represented using a list of sequents while
the conclusion is a single sequent. Thus \texttt{Rule prems concl}
means a rule with premisses \texttt{prems} and conclusion
\texttt{concl}. Many single-premiss rules of display calculi are
defined to be usable from top to bottom as well as from bottom to top:
the two constants \texttt{Bidi} and \texttt{InvBidi} allow us to cater
for these. 
A sequent \texttt{(Sequent X Y)} can also be represented as
\texttt{\$X |- \$Y}. Thus the term \texttt{Sequent (SV ''X'')
  (Structform (FV ''A''))} is printed, and may be entered, as
\texttt{(\$''X'' |- ''A'')}. 
% Special functions \texttt{premsRule} and
% \texttt{conclRule} return the list of premisses and the conclusion of a
% rule respectively.

In contrast to the formula and structure variables modelled explicitly
by \texttt{FV} and \texttt{SV}, Isabelle has ``scheme variables'',
identified by a preceding `\texttt{?}'.  When a scheme variable
appears in an Isabelle theorem, the theorem is true when anything,
of the appropriate type, is substituted for that variable.

The constant \texttt{rls} represents the set of rules of \dRA\ and
\texttt{cutr} is the name of the cut rule, encoded using the datatypes
just described: we omit the details of this code.  We handle
substitutions for structure and formula variables explicitly via
various functions: details omitted.  We use the term ``derivation'' to
refer to a proof \emph{within} the sequent calculus, reserving the
term ``proof'' for a meta-theoretic proof of a theorem \emph{about}
the sequent calculus. We model a derivation tree 
%(type \texttt{pftree}) 
using the following datatype:
\begin{verbatim}
datatype pftree = Pr sequent rule (pftree list) 
                | Unf sequent 
\end{verbatim}

In a term \texttt{Pr seq rule pts}
the subterm \texttt{seq} is the sequent at the root (bottom) of the tree,
and \texttt{rule} is the rule used in the last (bottom) inference.
Thus, if the tree represents a real derivation, then sequent
\texttt{seq} will be an instantiation of the conclusion of \texttt{rule},
and the corresponding instantiations of the premisses of \texttt{rule}
will be the roots of the trees in the list \texttt{pts}.
The trees in \texttt{pts} are called the \emph{immediate}
subtrees of \texttt{Pr seq rule pts}.

The leaves of a derivation tree are either axioms with no premisses, or
``\texttt{Unf}inished'' sequents whose derivations are currently
unfinished. 

Display calculi use the initial sequent $p \vdash p$, using primitive
propositions only. It is then proved that the sequent $A \vdash A$ is
derivable for all formulae $A$ by induction on the size of $A$, where
$A$ \emph{stands for} a formula composed of primitive propositions and
logical connectives.  We proved this, as the theorem \texttt{idfpp}.
We also need to reason about the derivation trees of derived rules
which may use the (derived) rule $A \vdash A$, for arbitrary formula
$A$.  Therefore we let the derivation tree \texttt{Pr (''A'' |- ''A'')
  idf []} stand for a complete derivation, which uses the lemma that
$A \vdash A$ is derivable, and let the derivation tree \texttt{Unf
  (''A'' |- ''A'')} stand for an incomplete derivation with unfinished
premiss $A \vdash A$.

For example, the incomplete derivation tree shown below at left is
represented as the Isabelle/HOL term shown below at right where
\texttt{''A'' |- PP p \&\& ''A''} stands for $A \vdash p \land A$ and
\texttt{cA} and \texttt{ands} are the contraction and ($\vdash \land
$) rules, and
\texttt{idf} is the derived rule $A \vdash A$:\\[1em]
\begin{tabular}[c]{rr}
  \begin{minipage}[c]{0.3\linewidth}
    {\vpf
     \bpf
     \A "$A \vdash p$"
     \A "$A \vdash A$"
     \B [1em] "$A, A \vdash p \land A$" "($\vdash \land $)"
     \U "$A \vdash p \land A$" "(\textsl{ctr})"
     \epf
    }
  \end{minipage}
&
\begin{minipage}[c]{0.7\textwidth}
\begin{verbatim}
Pr (''A'' |- PP p && ''A'') cA
 [Pr (''A'', ''A'' |- PP p && ''A'') ands 
 [Unf (''A'' |- PP p), 
  Pr (''A'' |- ''A'') idf []]]
\end{verbatim}
\end{minipage}
\end{tabular}

\section{Reasoning About Derivations and Derivability}
\label{sec:reas-about-deriv}

We now describe various functions which allow us to reason about
derivations in \dRA\ and theorems we have proved about derivability in
\dRA.

\begin{description}
  
\item[\texttt{allPT f pt}] holds if property \texttt{f} holds for
  every sub-tree in the derivation tree \texttt{pt}.
  
\item[\texttt{allNextPTs f pt}] holds if property \texttt{f} holds for
  every proper sub-tree of \texttt{pt}.

\item[\texttt{wfb (Pr concl rule pts)}] holds if sequent rule
  \texttt{rule} has an instantiation with conclusion instance
  \texttt{concl} and premiss instances which are the conclusions of the
  derivation trees in the list \texttt{pts}. Such a node is said to be
  \emph{well-formed}.
  
\item[\texttt{allPT wfb pt}] holds if every sub-tree of the derivation
  tree \texttt{pt} is \emph{well-formed}.  That is, if every node in
  \texttt{pt} is well-formed.  Such a derivation is said to be
  \emph{well-formed}.
  
\item[\texttt{frb rules (Pr concl rule pts)}] holds when the lowest
  rule \texttt{rule} used in a derivation tree \texttt{Pr concl rule
    pts} belongs to the set \texttt{rules}.
  
\item[\texttt{allPT (frb rules) pt}] holds when every rule used in
  a derivation tree \texttt{pt} belongs to the set \texttt{rules}.
  
\item[\texttt{premsPT pt}] returns a list of all ``premisses''
  (unfinished leaves) of the derivation tree \texttt{pt}. That is, the
  sequents found in nodes of \texttt{pt} of the form \texttt{Unf seq}.
  
\item[\texttt{conclPT pt}] returns the end-sequent of the derivation
  tree \texttt{pt}. That is, the conclusion of the bottom-most rule
  instance.

\end{description}

\begin{definition}
A derivation tree \texttt{pt} is \emph{valid} if
 it is well-formed,
 uses rules in the set of rules \texttt{rules} of the calculus, and
 has no unfinished leaves.

% from HOL_PT.thy
\begin{verbatim}
valid_def = "valid ?rules ?pt ==
   allPT wfb ?pt & allPT (frb ?rules) ?pt & premsPT ?pt = []"
\end{verbatim}
\end{definition}

The question marks in front of \texttt{rules} and \texttt{pt} flag
that they are Isabelle scheme variables, even though the question
marks would be absent in the Isabelle/HOL theory file itself: we do
this throughout this paper.

\begin{definition}[\texttt{IsDerivableR}]
\texttt{IsDerivableR rules prems' concl} holds iff
there exists a derivation tree \texttt{pftr} which 
uses only rules from set \texttt{rules},
is well-formed,
has conclusion \texttt{concl},
and
has premisses from set \texttt{prems'}.

\begin{verbatim}
"IsDerivableR ?rules ?prems' ?concl == (EX pftr.
      allPT (frb ?rules) pftr & allPT wfb pftr &
      conclPT pftr = ?concl & set (premsPT pftr) <= ?prems')"
\end{verbatim}
\end{definition}

Here, \texttt{set} is a function that allows us to treat its argument
as a set rather than a list, and \texttt{<=} is the subset relation
$\subseteq$. Our main result about derivability is a recursive
characterisation of derivability.

\begin{theorem}[\texttt{IsDerivableR\_recur}]
  A conclusion \texttt{concl} is derivable from premisses
  \texttt{prems} using rules \texttt{rules} if and only if either
  \texttt{concl} is one of \texttt{prems}, or there exists an
  instantiated \texttt{rule} obtained from \texttt{rules} and the
  conclusion of \texttt{rule} is \texttt{concl} and the premisses of
  \texttt{rule} are themselves derivable from \texttt{prems} using
  \texttt{rules}.

\begin{verbatim}
IsDerivableR_recur = "IsDerivableR ?rules ?prems ?concl = (
 ?concl : ?prems | (EX rule. ruleFromSet rule ?rules & 
 conclRule rule = ?concl & (ALL p : set (premsRule rule). 
 IsDerivableR ?rules ?prems p)) )" : thm
\end{verbatim}
\end{theorem}

The appellation ``\texttt{: thm}'' indicates a statement that has been
proved in Isabelle/HOL using previous Isabelle/HOL definitions and
theorems: we follow this practice for theorems and lemmata throughout
this paper.

In the third clause for \texttt{IsDerivable} on
\cite[page~302]{mikhajlova-vonwright-isomorphism}, which deals with
the case of a result being provable using derived rules, an incorrect
use of an existential quantifier gives that $P \to Q$ could be used as
a derived rule on the grounds that one instance of it, say
$\textsl{True} \to \textsl{True}$, is provable.

\section{Reasoning About Cuts and Wansing's Proof}
\label{subsec:funct-reas-about}

We assume familiarity with notions like ``parametric ancestors'' of a
cut formula from \cite{belnap-display}, ``principal moves'' and
``parametric moves'' from the literature on strong normalisation:
details can be found in the Appendix
(Section~\ref{sec:an-operational-view}).

\begin{definition}
  A cut rule application is \emph{left-}
  [\emph{right-}] principal if the cut-formula is the principal formula
  of the left [right] premiss of the cut rule.
\end{definition}

Each of the following functions requires the bottom node of the
derivation tree to be of the form \texttt{Pr seq rule pts}, and that
if \texttt{rule} is \cut, then for: \texttt{cutOnFmls s} the cut is on
a formula in the set \texttt{s}; \texttt{cutIsLP A} the cut is on
formula \texttt{A} and is left-principal; \texttt{cutIsLRP A} the cut
is on formula \texttt{A} and is (left- and right-)principal.
It follows from the definitions that a derivation tree satisfying any
of \texttt{allPT (cutOnFmls s)}, \texttt{allPT (cutIsLP A)} and
\texttt{allPT (cutIsLRP A)} has no unfinished leaves.  We omit
details.

The ideas inherent in principal and parametric moves are well-known,
but they were first applied to prove strong normalisation for a
display calculus by Wansing to show that any (sufficiently long)
sequence of steps in the process of cut-elimination terminates. This
paper is reproduced in \cite{wansing-displaying-modal-logic}.

We have implemented large parts of Wansing's strong normalisation
proof.  In fact we completed it, subject to assertions (which we had
intended to prove) about the effect of primitive and parametric
reductions. These assertions are too complex to include and explain
here, but they implied (\emph{inter alia}) that the subtree $\Pi'[Y]$
and its conclusion ${Z[Y] \vdash A}$ in the right (transformed) tree
of Figure~\ref{fig-cut-p-new} is the same as the tree $\Pi[A]$ and its
conclusion ${Z[A] \vdash A}$ in the left (original) tree: that is, the
transformation process does not alter $\Pi$ but leaves it intact. In
fact, this is true only if $Z$ contains no parametric ancestors of
$A$, and examples can be constructed using contraction where this
fails. Thus our assertion failed, and so we could not prove Wansing's
result. 
We attempted to repair this defect by reference to the relevant
part of Wansing's proof \cite[page~54]{wansing-displaying-modal-logic}
in which he states that the extension to situations more complex than
that of Figure~\ref{fig-cut-p-new} is to be shown ``analogously''.
Unfortunately it is based on a diagram much like
Figure~\ref{fig-cut-p-new} and appears to rely on the fact that the
subtree $\Pi$ in Figure~\ref{fig-cut-p-new} appears unchanged in the
transformed tree. 

\section{A New Proof of Strong Normalisation}
\label{s-sn}
%\cc{sn.tex}\cc{s-sn}

We now give a new direct proof of strong normalisation which does not
rely on the size of derivation trees. Given a derivation tree with
(\emph{cut}) at its root, the tree can be changed to deal with that
particular cut; we call these ``cut-reductions''.  Following Wansing
\cite[\S 4.2]{wansing-displaying-modal-logic}, we classify these
cut-reductions as principal or parametric. More generally, we can
change a tree by performing a cut-reduction on any subtree; we call
such changes \emph{reductions}.

\subsection{Defining Strongly Normalisable Derivations}

\begin{definition}[\texttt{reduces}] 
  Assuming a relation \texttt{cutReduces} (defined later),
  derivation tree $\Pi_0$ \texttt{reduces} to derivation tree
  $\Pi_1$ if either 
 (a) $\Pi_0$ \texttt{cutReduces} to $\Pi_1$, or
 (b) $\Pi_0$ and $\Pi_1$ are identical except that exactly one
       of the immediate subtrees of $\Pi_0$ reduces to the
       corresponding immediate subtree of $\Pi_1$.

% from SN0.thy
\begin{verbatim} 
  reduces_Unf = "reduces (Unf ?seq) ?ptn = False"
  reduces_Pr  = "reduces (Pr ?seq ?rule ?ptl) ?ptn = 
   ( (EX ptl'. onereduces ?ptl ptl' & ?ptn = Pr ?seq ?rule ptl') 
     | cutReduces (Pr ?seq ?rule ?ptl) ?ptn )"

  onereduces_Nil  = "onereduces [] ?ptl' = False"
  onereduces_Cons = "onereduces (?h # ?t) ?ptl' = ( ?ptl' ~= [] 
    & (reduces ?h (hd ?ptl') & ?t = (tl ?ptl') 
       | ?h = (hd ?ptl') & onereduces ?t (tl ?ptl')) )"
\end{verbatim}
\end{definition}

% The actual definition of the relation \texttt{cutReduces} is given
% later, meaning that all proofs until then are modulo this definition.

Wansing \cite[p.~52]{wansing-displaying-modal-logic} defines a
strongly normalisable derivation tree as a tree from which every
sequence of reductions is finite.  We define inductively the set of
strongly normalisable derivation trees using Isabelle's inductive
definition feature, which defines the minimal set closed under the
given rules. For example, the rules $\{0 \in \mathcal S, n \in
\mathcal S \Longrightarrow n+2 \in \mathcal S\}$ define $\mathcal S$
to be the set of even naturals, although the set of all naturals also
satisfies the rules.  For more details, see \cite[\S~2.10]{IsLogHOL}.

\begin{definition}[\texttt{sn\_set}]\label{defn:sn_set}
  The set \texttt{sn\_set} is the smallest set of derivation trees
  such that:
  (a) if $\Pi_0$ cannot be reduced then $\Pi_0
    \in \texttt{sn\_set}$ 
  (b) if every tree $\Pi_1$ to which $\Pi_0$
      reduces is in \texttt{sn\_set} then $\Pi_0 \in
      \texttt{sn\_set}$.

% in SN0.thy
\begin{verbatim} 
inductive "sn_set" 
  intrs 
    snI "(ALL ptn. 
    reduces ?pt ptn --> ptn : sn_set) ==> ?pt : sn_set"
\end{verbatim}
\end{definition}

Note that Definition~\ref{defn:sn_set}.(b) subsumes
Definition~\ref{defn:sn_set}.(a), which is why there is only one
clause in the corresponding Isabelle code.

\begin{definition}[strongly normalisable] 
\label{defn:sn}
A derivation tree is \emph{strongly normalisable} iff it is a member of
\texttt{sn\_set}.
\end{definition}

\subsection{Various Well-Founded Orderings}

To prove that every derivation tree is strongly normalisable, we use a
binary relation \texttt{dtorder} on derivation trees, and show that it
is well-founded. First we need an auxiliary definition.

\begin{definition}[\texttt{sn1red}]
\label{defn:sn1red}
For two lists \texttt{ptl1} and \texttt{ptl2} of derivation trees,
\texttt{sn1red ptl1 ptl2} holds iff the lists \texttt{ptl1} and
\texttt{ptl2} are non-empty and differ at only one position where
\texttt{ptl1} contains tree \texttt{pt1} and \texttt{ptl2} contains
tree \texttt{pt2}, and \texttt{pt1} reduces to \texttt{pt2}, and
\texttt{pt1} is strongly normalisable.

% from SN0.thy
\begin{verbatim} 
sn1red_Nil  = "sn1red [] ?ptl2 = False"
sn1red_Cons = "sn1red (?h # ?t) ?ptl2 = (?ptl2 ~= [] & 
       (reduces ?h (hd ?ptl2) & ?h : sn_set & ?t = (tl ?ptl2) 
       | ?h = (hd ?ptl2) & sn1red ?t (tl ?ptl2)) )"
\end{verbatim}
\end{definition} 

\begin{definition}[\texttt{LRPorder}, \texttt{cutorder}, 
  \texttt{sn1order}, \texttt{dtorder}]\label{defn:porders} 
  We define four binary relations, \texttt{LRPorder},
  \texttt{cutorder}, \texttt{sn1order} and \texttt{dtorder} on
  derivation trees as sets of ordered pairs:
\begin{enumerate}
  
\item\label{defn:porders-LRP} $\LRPord{\Pi_1}{\Pi_0}$ if the bottom
  inferences of derivations $\Pi_1$ and $\Pi_0$ are both \cut, and
  either
  \begin{enumerate}
  \item\label{defn:porders-LRP-1} the cut in $\Pi_1$ is left-principal
    or right-principal, and the cut in $\Pi_0$ is neither, or
  \item\label{defn:porders-LRP-2} the cut in $\Pi_1$ is (left- and
    right-)principal, and the cut in $\Pi_0$ is not (left- and
    right-)principal.
  \end{enumerate}
  
\item\label{defn:porders-cut} $\cutord{\Pi_1}{\Pi_0}$ if the bottom
  inferences of derivations $\Pi_1$ and $\Pi_0$ are both \cut, and if
  either
  \begin{enumerate}
  \item\label{defn:porders-cut-1} the size of the cut-formula of
    $\Pi_1$ is smaller than that of $\Pi_0$, or
  \item\label{defn:porders-cut-2} each derivation has the same
    cut-formula, and $\LRPord{\Pi_1}{\Pi_0}$.
  \end{enumerate}
  
\item\label{defn:porders-sn1} $\snoneord{\Pi_1}{\Pi_0}$ 
  if $\Pi_0$ and $\Pi_1$ are the same except that one of the immediate 
  subtrees of  $\Pi_0$ is strongly normalisable and reduces to the 
  corresponding immediate subtree of $\Pi_1$.
  
\item\label{defn:porders-dt} 
  $\dtord{\Pi_1}{\Pi_0}$ iff any of the following hold:
  \begin{enumerate}

  \item\label{defn:porders-dt-1} the bottom inference of $\Pi_1$ is
    not (\emph{cut}), but that of $\Pi_0$ is
  
  \item\label{defn:porders-dt-2} $\cutord{\Pi_1}{\Pi_0}$
  
  \item $\snoneord{\Pi_1}{\Pi_0}$.
  \end{enumerate}

\end{enumerate}

The definitions use Isabelle's inductive definitions but we give
only \texttt{dtorder}.

\begin{verbatim}      
inductive "dtorder" 
 intrs
  cnc "?rule ~= cutr ==> 
       (Pr ?seq1 ?rule ?ptl1, Pr ?seq2 cutr ?ptl2) : dtorder"
  dtc "?pts : cutorder ==> ?pts : dtorder"
  snr "?pts : sn1order ==> ?pts : dtorder"
\end{verbatim}

\end{definition}

\comment{
We define a partial ordering \texttt{cutorder} on derivation trees
whose bottom inference is (\emph{cut}), by the criteria
\begin{enumerate}
\item the size of the cut-formula
(\textit{e.g.}, a cut on $A $ is smaller than a cut on $B \land C$)
\item if the cuts are on the same formula, whether the cut is
(left- and right-) principal (smallest),
left-principal or right-principal, or neither (biggest)
(the order on this criterion is \texttt{LRPorder})
\end{enumerate}
A pair of cuts not covered by the above (such as a
left-principal and a right-principal cut on the same formula,
or cuts on $\lnot A$ and $A \land B$), are incomparable.

We next define a partial order \texttt{dtorder}
on derivation trees, by the rules 
\begin{enumerate}
\item $\Pi_i > \Pi_j$ if the bottom inference of 
$\Pi_i$ is cut and the bottom inference of $\Pi_j$ is not cut;
\item if the bottom inferences of 
both $\Pi_i$ and $\Pi_j$ are cut then the ordering \texttt{cutorder} applies;
\item if the bottom inferences of 
$\Pi_i$ and $\Pi_j$ are the same, and 
$\Pi_j$ is obtained from $\Pi_i$ by reduction of an immediate subtree
which is strongly normalisable (this ordering is called \texttt{sn1order}),
then $\Pi_i > \Pi_j$.
\end{enumerate}
}

\begin{theorem}[\texttt{wf\_LRPorder, wf\_cutorder, wf\_sn1order}]
\label{thm:well-founded-1}
The relations \texttt{LRPorder}, \texttt{cutorder} and
\texttt{sn1order} are well-founded. Only one shown.
\begin{verbatim}
   wf_LRPorder = "wf LRPorder" : thm 
\end{verbatim}
%    wf_cutorder = "wf cutorder" : thm
%    wf_sn1order = "wf sn1order" : thm 
\end{theorem}

Despite the notation, these relations are \textit{not} reflexive, and
some are \textit{not} transitive.  Intuitively, $(\Pi_1, \Pi_0) \in
\texttt{dtorder}$ means that $\Pi_1$ is closer to being cut-free (in
some sense) than is $\Pi_0$. To prove that \texttt{dtorder} is
well-founded, we first need a result on the interaction between
\texttt{cutorder} and \texttt{sn1order}.

\begin{lemma}[\texttt{sntr'}]
\label{lemma:sntrprime}
If
$\pordered{\Pi_2}{cut}{\Pi_1}$ and
$\pordered{\Pi_1}{sn1}{\Pi_0}$ then
$\pordered{\Pi_2}{cut}{\Pi_0}$.

\begin{verbatim}
sntr' = "[| (?pty, ?ptza) : cutorder; (?ptza, ?ptz) : sn1order |]   
         ==> (?pty, ?ptz) : cutorder" : thm
\end{verbatim}
\end{lemma}

\begin{theorem}[\texttt{wf\_dtorder}]
\texttt{dtorder} is well-founded. Not shown.
\end{theorem}

\subsection{Inheriting Strong Normalisation}

We next define \texttt{snHered}, a property of derivation trees which
indicates that a tree inherits strong normalisation from its immediate
subtrees.

\begin{definition}[\texttt{snHered}]
\label{snHered} 
$\Pi$ satisfies \texttt{snHered} iff:
if all the immediate subtrees of $\Pi$ are
strongly normalisable then $\Pi$ is strongly normalisable.

\begin{verbatim}
   snHered_def = "snHered ?pt == 
     set (nextUp ?pt) <= sn_set --> ?pt : sn_set"
\end{verbatim}
\end{definition}

\begin{lemma}[\texttt{hereds\_sn}]\label{hereds-sn}
  A derivation tree $\Pi$ is strongly normalisable iff
  every subtree of $\Pi$ has the property \texttt{snHered}.

\begin{verbatim}
hereds_sn = "(ALL pts. 
  isSubt ?pt pts --> snHered pts) = (?pt : sn_set)" : thm
\end{verbatim}
\end{lemma}

\begin{proof} The ``only if'' part uses
  the theorem \texttt{sn\_set.induct} below which is generated
  automatically by Isabelle from the inductive definition of
  \texttt{sn\_set}.
\begin{verbatim}
sn_set.induct = "[| ?xa : sn_set ; (!!pt. (ALL ptn. 
    reduces pt ptn --> ptn : sn_set & ?P ptn) ==> ?P pt ) |]  
 ==> ?P ?xa" : thm 
\end{verbatim}
  
  The ``if'' part uses induction on the height of $\Pi$ using the
  assumption that every subtree of $\Pi$, including itself, has the
  property \texttt{snHered}.
\end{proof}

\subsection{Reasoning About Cut-Reductions}

We intend to define cut-reduction in a way that enables us to make
some assertions about cut-reductions.  We first define properties
\texttt{nparRedP} and \texttt{c8redP} of reductions (which in fact
apply to parametric and principal reductions respectively).  The
definitions do not require that the bottom inference of the derivation
be \cut, but they are used only where this is so.

\begin{definition}[\texttt{nparRedP}]\label{defn:nparRedP}
  The relation \texttt{nparRedP} holds between two derivation trees
  $\Pi_0$ and $\Pi_1$ if every subtree $\Pi_1^s$ of $\Pi_1$ with a
  bottom inference \cut\ satisfies either
  (a) $\Pi_1^s$ is a proper subtree of $\Pi_0$, or
%\item\label{defn:nparRedP-2} 
  (b) $\Pi_1^s$ and $\Pi_0$ have the same
     cut-formula and $\LRPord{\Pi_1^s}{\Pi_0}$.
% from CutRed.thy
\begin{verbatim}
nparRedP_Unf = nparRedP "(Unf ?seq) ?ptn = False"

nparRedP_Pr  = "nparRedP (Pr ?seq ?rule ?ptl) ?ptn = (ALL pts.
 isSubt ?ptn pts & isCut pts --> 
    isSubt (Pr ?seq ?rule ?ptl) pts & Pr ?seq ?rule ?ptl ~= pts 
  | cutForm (Pr ?seq ?rule ?ptl) = cutForm pts & 
    (pts, Pr ?seq ?rule ?ptl) : LRPorder)"
\end{verbatim}

\end{definition}

\begin{definition}[\texttt{c8redP}]\label{defn:c8redP}
  A reduction from $\Pi_0$ to $\Pi_1$ satisfies \texttt{c8redP} if
  the lowest rule of $\Pi_0$ is \cut, and for each subtree
  $\Pi_1^s$ of $\Pi_1$ whose lowest rule is \cut, either
  (a) $\Pi_1^s$ is a proper subtree of $\Pi_0$, or 
% \item\label{defn:c8redP-2} 
  (b) the cut-formula of $\Pi_1^s$ is
        smaller than the cut-formula of the lowest rule of $\Pi_0$.
% from CutRed.thy
\begin{verbatim}
   c8redP = "c8redP ?pt ?ptn == ALL pts.
     isSubt ?ptn pts & botRule pts = cutr -->
       isSubt ?pt pts & ?pt ~= pts | 
       size (cutForm pts) < size (cutForm ?pt)"
\end{verbatim}
% \footnote{Where does it enforce that the bottom rule of
%   pi1 is cut? It doesn't, but it is used in cutReduces, which does the
%   check.}

\end{definition}

Note that both Definition~\ref{defn:c8redP}(b)
%\ref{defn:c8redP-2} 
and
Definition~\ref{defn:nparRedP}(b)
%\ref{defn:nparRedP-2} 
imply $(\Pi_1^s,
\Pi_0) \in \texttt{dtorder}$.
%
We now define a \emph{cut-reduction} (being either parametric or
principal) as satisfying one of the properties \texttt{nparRedP} and
\texttt{c8redP}, as well as some further simple conditions which help
the proof.

\begin{definition}[\texttt{cutReduces}]
\label{defn:cutReduces}
The derivation tree $\Pi_0$ \emph{cut-reduces} to $\Pi_1$ if the
following hold simultaneously: $\Pi_0$ and $\Pi_1$ satisfy either
\texttt{nparRedP} (for a parametric reduction) or \texttt{c8redP} (for
a principal reduction); 
$\Pi_0$ and $\Pi_1$ have the same conclusion;
the bottom rule of $\Pi_0$ is \cut;
$\Pi_0$ and $\Pi_1$ are not identical;
$\Pi_1$ does not consist solely of an unfinished leaf; and
$\Pi_0$ has at least one immediate subtree.

% from CutRed.thy
\begin{verbatim}
cutReduces_Pr = "cutReduces (Pr ?seq ?rule ?ptl) ?ptn = (
 (nparRedP (Pr ?seq ?rule ?ptl) ?ptn 
   | c8redP (Pr ?seq ?rule ?ptl) ?ptn) 
  & conclPT ?ptn = ?seq & ?rule = cutr 
  & (Pr ?seq ?rule ?ptl) ~= ?ptn & ~ isUnf ?ptn & ?ptl ~= [] )"
\end{verbatim}
\end{definition}

Note that for the purposes of the proof of strong normalisation, we
have defined cut-reduction weakly in that, for example, we do not
require that the new derivation tree \texttt{ptn} be well-formed (via
\texttt{allPT wfb ptn}) or require that it use rules which belong to
the calculus (via \texttt{allPT (frb rls) ptn}).  However the
definition is also strong in that it requires that either
\texttt{nparRedP} or \texttt{c8redP} holds.  Later we will show that
the reductions in which we are interested do satisfy \texttt{nparRedP}
or \texttt{c8redP}.  The result of this is that we prove strong
normalisation for a larger class of reductions than we are really
interested in. The requirement that $\Pi_0$ and $\Pi_1$ be distinct is
necessary to ensure that null reductions are excluded.

\subsection{Strong-Normalisation}

\begin{lemma}[\texttt{dth}]
\label{dth}
\label{sn-siord}
For a given derivation $\Pi_0$, if all derivation trees
$\dtord{\Pi'}{\Pi_0}$ have the property \texttt{snHered}, then so does
$\Pi_0$.

\begin{verbatim}
dth = "ALL pt'. (pt', ?pt) : dtorder --> snHered pt' 
       ==> snHered ?pt" : thm
\end{verbatim}

\end{lemma}

\begin{proof}
  The machine-proof is quite complicated, and a careful examination of
  it highlights why the definition of \textit{dtorder} needs to be so
  complex.
\end{proof}

\begin{theorem}[\texttt{all\_sn}]\label{thm:all-sn}
Every derivation tree is strongly normalisable.
\begin{verbatim}
   all_snH = "snHered ?pt"    : thm
   all_sn  = "strongNorm ?pt" : thm
\end{verbatim}
\end{theorem}

\begin{proof}
By well-founded induction, it follows from Lemma~\ref{dth} that
every derivation tree satisfies \texttt{snHered};
the result follows from Lemma~\ref{hereds-sn}.
\qed \end{proof}

At this point we have actually shown using Isabelle that
a class of reductions which we have defined is strongly
normalising.  
We need to show that this class of reductions contains the ones
we are interested in.

\subsection{Making A Cut (Left) Principal}

The following are the main theorems used to develop the mechanised
proof based on making cuts (left and right) principal.  

We use a functional definition \texttt{mLP ptAY seqY ptA} of the
transformation used to make a cut left-principal.  The arguments to
\texttt{mLP} are: a derivation tree \texttt{ptAY} with root $A \vdash
Y$ such as the right subtree $\Pi_R$ of the left tree given in
Figure~\ref{fig-cut-p-new}; a sequent \texttt{seqY}; and a derivation
tree \texttt{ptA} with root \texttt{seqA} such as the tree with root
$X \vdash A$ which forms the left subtree of the left tree given in
Figure~\ref{fig-cut-p-new}.  Sequent \texttt{seqA} will usually contain
occurrences of $A$ in one or more succedent position(s), and sequent
\texttt{seqY} is \texttt{seqA} with zero, one or more of these
occurrences of $A$ changed to $Y$.  The result of \texttt{mLP} is a
new derivation tree, with root \texttt{seqY} whose new cuts (on $A$)
are left-principal, such as the right tree given in Figure~
\ref{fig-cut-p-new}. The definition of \texttt{mRP} is similar.  We
also defined a function \texttt{ncLP seqY ptA} which returns true iff
calculating the tree \texttt{mLP ptAY seqY ptA} does not involve
traversing another cut; \texttt{ncRP} has analogous meaning.

We need to show that the transformation performed by \texttt{mLP} are
in the class of reductions that are strongly normalising. 
We assume an initial cut as in
Figure~\ref{fig-cut-p-new}: that is, with conclusion $X \vdash Y$ and
cut-formula $A$.

\begin{theorem}[\texttt{pRedLP2}]\label{pRedLP2}
  Consider a parametric reduction of a cut which proceeds by
  transforming the left subtree (to change its conclusion from $X[A]
  \vdash A$ to $X[Y] \vdash Y$), using the function \texttt{mLP}.
  Assume that the subtree satisfies the condition \texttt{ncLP} (in
  effect, that the transformation can be performed without traversing
  another cut). Also assume the cut is not already left-principal.
  Then the reduction satisfies the condition \texttt{nparRedP}.

\begin{verbatim}
pRedLP2 = "[| ?ptY = mLP ?ptAY ?seqY ?ptA; 
      ?pt = Pr ?seqY cutr [?ptA, ?ptAY];
      ~ rootIsSucP ?ptA; valid rls ?pt; 
      ncLP ?seqY ?ptA |]
   ==> nparRedP ?pt ?ptY & valid rls ?ptY & 
       conclPT ?ptY = conclPT ?pt" : thm 
\end{verbatim}
\end{theorem}

The condition \url{~}~\url{rootIsSucP}~\url{?ptA} means that the cut
is not already left-principal.  It is required to avoid null
reductions: if the cut is already left-principal, then \texttt{?pt}
and \texttt{mLP ?ptAY ?seqY ?ptA} are equal. The following result is
similar, but says that the parametric reduction is a cut-reduction.

\begin{theorem}[\texttt{pRedLP3}]
\label{pRedLP3}
The same as Theorem~\ref{pRedLP2}, except with conclusion
{\rm \texttt{ cutReduces ?pt ?ptY \& valid rls ?ptY"}}
\end{theorem}

There are analogous theorems \texttt{pRedRP2}, \texttt{pRedRP3},
which guarantee that we can make a cut
right-principal, and consequently another theorem 
which guarantees that we can make a cut (left and right)
principal. The associated functions also produce derivations that satisfy 
\texttt{cutReduces}.
Thus every parametric move corresponds to a reduction which is in the
class of strongly normalising reductions.

\subsection{Dealing With Principal Cuts} 
\label{s-deal}

For each logical connective and constant in the calculus, we prove a
result like the following one for $\lor$.

\begin{theorem}[\texttt{orC8}]
  Assume we are given a valid derivation tree \texttt{pt}. Assume that
  if the bottom rule of \pt\ is \cut, then the cut is principal and
  its cut-formula is $A \lor B$. Also assume that any cuts above the
  root of \pt\ have either $A$ or $B$ as cut-formula. Then then there
  is a valid derivation tree \texttt{ptn} with the same conclusion as
  \texttt{pt}, such that \emph{all} cuts in \texttt{ptn} have either
  $A$ or $B$ as cut-formula.

\begin{verbatim}
orC8 = "[| allPT wfb ?pt; allPT (frb rls) ?pt;
  cutIsLRP (?A v ?B) ?pt; allNextPTs (cutOnFmls {?B, ?A}) ?pt |]
 ==> EX ptn. conclPT ptn = conclPT ?pt & allPT wfb ptn &
    allPT (frb rls) ptn & allPT (cutOnFmls {?B, ?A}) ptn" : thm
\end{verbatim}
\end{theorem}

We have also shown that the principal reductions described above
satisfy \texttt{c8redP}, and are in fact cut-reductions: for each
logical connective.  These results (one for each logical connective)
were combined to give Theorem~\ref{vformC8W'}.

\begin{theorem}[\texttt{vformC8W'}]
\label{vformC8W'}
If the bottom-most rule of a valid derivation tree $\Pi_0$ is a (left
and right) principal cut, then there exists a valid derivation tree
$\Pi_1$ with the same conclusion as $\Pi_0$, such that $\Pi_0$
\texttt{cutReduces} to $\Pi_1$.

\begin{verbatim}
vformC8W' = "[| ?pt = Pr ?seq cutr ?ptl;  valid rls ?pt ;
       cutIsLRP ?form ?pt |]
   ==> EX ptn. cutReduces ?pt ptn & valid rls ptn" : thm
\end{verbatim}
\end{theorem}

Thus every principal move gives a reduction which is in the class of
strongly normalising reductions.

\subsection{Cut-Elimination Via Strong Normalisation}

Recall that a \emph{valid} derivation tree is one which is
well-formed, uses rules from \texttt{rls}, and has no unfinished
leaves.

\begin{definition}[\texttt{validRed}]
\label{defn:validRed}
A \emph{valid reduction} is a reduction of tree $\Pi_0$ to tree $\Pi_1$,
where $\Pi_1$ is a valid tree which has the same conclusion as $\Pi_0$.
\end{definition}

% from SNCE.thy
\begin{verbatim}
validRed_def "validRed == {(ptn, pt).
    conclPT ptn = conclPT pt & valid rls ptn & reduces pt ptn}"
\end{verbatim}

\begin{lemma}\label{lem:valid-subtree}
 A valid reduction of a subtree is a valid reduction of the whole tree.
\end{lemma}

\begin{theorem}\label{para-princ-valid}
 Parametric and principal moves give valid reductions.  
\end{theorem}

\begin{proof}
  Using \texttt{pRedLP3}, \texttt{pRedRP3} and \texttt{vformC8W'}.
\end{proof}

\begin{theorem}[\texttt{cutExRed}]\label{cutExRed}
   For any valid tree with a single cut at the bottom there exists
   a cut-reduction to another valid tree with the same conclusion,
   using only parametric and principal moves.

\begin{verbatim}
cutExRed = "[| isCut ?pt; valid rls ?pt; 
       allNextPTs (cutOnFmls {}) ?pt |]
   ==> EX ptn. cutReduces ?pt ptn & valid rls ptn" : thm
\end{verbatim}
\end{theorem}

\begin{proof}
  We use only \texttt{pRedLP3} and \texttt{pRedRP3} and
  \texttt{vformC8W'}, so that the reductions are restricted
  to parametric and principal moves. Note that \texttt{conclPT ptn =
    conclPT pt} is implied by \texttt{cutReduces pt ptn}.
\end{proof}

\begin{theorem}[\texttt{ExRed}]
  For any valid tree which contains a cut, there is available at least
  one valid reduction (apply Theorem~\ref{cutExRed} to a top-most cut).

 \begin{verbatim}
 ExRed = "[| valid rls ?pt ; ~ allPT (Not o isCut) ?pt |]
    ==> EX ptn. (ptn, ?pt) : validRed" : thm
 \end{verbatim}
\end{theorem}
 
Below, the notation \verb|^*| denotes transitive closure.

\begin{theorem}[\texttt{validRed\_min}]
  For any derivation tree $\Pi$ there exists a tree $\Pi^r$ obtained
  from $\Pi$ such that $\Pi^r$ cannot be further validly reduced.
\begin{verbatim}
validRed_min = "EX ptn. (ptn, ?pt1) : validRed^* & 
                ~ (EX pts. (pts, ptn) : validRed)" : thm
\end{verbatim}
\end{theorem}

\begin{proof}
  Every tree with a cut admits at least one valid reduction, and
  parametric and principal moves give valid reductions, and there is
  no infinite sequence of valid reductions. So repeatedly perform any
  sequence of principal and parametric reductions until no reduction
  is possible.
\end{proof}

Below, \texttt{allPT (Not o isCut) ptn} means that \texttt{ptn} is
cut-free since \texttt{o} stands for composition of functions in
Isabelle/HOL.

\begin{theorem}[\texttt{redNoCut}]\label{redNoCut}
  For any valid tree $\Pi$, there exists a cut-free valid tree
  $\Pi^r$, obtained from $\Pi$ by repeated valid reductions, such that
  $\Pi^r$ has the same conclusion as $\Pi$.

\begin{verbatim}
redNoCut = "valid rls ?pt ==> EX ptn.  
   allPT (Not o isCut) ptn & valid rls ptn & 
   (ptn, ?pt) : validRed^* & conclPT ptn = conclPT ?pt" : thm
\end{verbatim}
\end{theorem}

\begin{corollary}
  Any sequence of principal and parametric moves starting from some
  derivation $\Pi$ containing cuts, eventually terminates with a
  cut-free derivation $\Pi^r$ that has the same conclusion as $\Pi$.
\end{corollary}

\begin{proof}
  In Theorem~\ref{redNoCut}, starting from $\Pi$, choose a sequence of
  parametric and principal reductions only.  Since valid reductions
  are reductions, they are strongly normalising by
  Theorem~\ref{thm:all-sn}, hence this sequence must terminate with
  some valid derivation $\Pi^r$, to which no reduction is applicable.
  This is only possible if $\Pi^r$ is cut-free.
\end{proof}

We therefore have cut-elimination via strong normalisation.

\begin{theorem}[\texttt{cutElim\_SN}]
  If a sequent can be derived using rules \texttt{rls}, then it can be
  derived from those rules omitting \cut.

\begin{verbatim}
cutElim_SN = "IsDerivableR rls {} ?concl ==> 
              IsDerivableR (rls - {cutr}) {} ?concl" : thm
\end{verbatim}
\end{theorem}

\section{Further Work}
\label{sec:concl-furth-work}

Belnap's theorem is expressed to apply to any Display Calculus
satisfying his conditions.  To prove his theorem in that form would
require modelling an arbitrary Display Calculus, with generalised
rules for arbitrary sets of structural and logical connectives. A
theoretical framework for such generalised rules can be found in
\cite{gore-ggg-igpl}. This would require a ``deeper'' embedding still.
For, in our first implementation \cite{dawson-gore-dra-jelia98}, we
set up the specific connectives and rules of \dRA\ in Isabelle, and
used Isabelle proofs as the \dRA-derivations.  In the present
implementation, we again set up the specific connectives and rules of
\dRA, although we set up data structures to model arbitrary
derivations.  To prove the generic Belnap theorem, we need to set up
the necessary structures to model arbitrary sets of connectives and
rules from \cite{gore-ggg-igpl}.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Here begins the end matter

% \bibliography{./ce,/home/discus/disk1/rpg/bib/fm,/home/discus/disk1/rpg/bib/modal,/home/discus/disk1/rpg/bib/mtl,/home/discus/disk1/rpg/bib/logic,/home/discus/disk1/rpg/bib/tl,/home/discus/disk1/rpg/bib/paracon,/home/discus/disk1/rpg/bib/linear,/home/discus/disk1/rpg/bib/algebra,/home/discus/disk1/rpg/bib/relevant,/home/discus/disk1/rpg/bib/dl,/home/discus/disk1/rpg/bib/atp}

% \bibliographystyle{plain}

\begin{thebibliography}{10}

\bibitem{basin-matthews-structuring}
D~A. Basin and S Matthews.
\newblock Structuring metatheory on inductive definitions.
\newblock {\em Information and Computation}, 162(1-2):80--95, 2000.

\bibitem{belnap-display}
N~D {Belnap}.
\newblock Display logic.
\newblock {\em J.\ of Philosophical Logic}, 11:375--417, 1982.

\bibitem{borisavljevic-dosen-petric-permuting-cut-with-contraction}
M Borisavljevic, K Dosen, and Z Petric.
\newblock On permuting cut with contraction.
\newblock {\em Mathematical Structures in Computer Science}, 10:99--136, 2000.

\bibitem{church-types}
A~Church.
\newblock A formulation of the simple theory of types.
\newblock {\em Journal of Symbolic Logic}, 5:56--68, 1940.

\bibitem{dawson-gore-dra-jelia98}
J~Dawson and R~Gor{\'{e}}.
\newblock A mechanised proof system for relation algebra using display logic.
\newblock In {\em JELIA98: Proceedings of the European Workshop on Logic in
  Artificial Intelligence}, LNAI 1489:264-278. Springer, 1998.
\newblock \url{http://discus.anu.edu.au/~jeremy/dra/dra-files/}.

\bibitem{dawson-gore-embedding-comparing}
J~E Dawson and R Gor{\'e}.
\newblock Embedding display calculi into logical frameworks: Comparing {T}welf
  and {I}sabelle.
\newblock In C~Fidge (Ed), {\em Proc.\ CATS 2001}, {\em ENTCS}, 42:89--103,
  \url{http://www.elsevier.nl/gej-ng/31/29/23/68/show/Products/notes/cover.htt
}, 2001. Elsevier.

\bibitem{HOL-introduction}
M~J~C Gordon and T~F Melham.
\newblock {\em Introduction to {HOL}: a Theorem Proving Environment for Higher
  Order Logic}.
\newblock Cambridge University Press, 1993.

\bibitem{gore-relalg}
R~Gor{\'{e}}.
\newblock Cut-free display calculi for relation algebras.
\newblock In D~van Dalen and M~Bezem (Eds), {\em CSL96: Selected Papers of
  the Annual Conference of the European Association for Computer Science
  Logic}, volume LNCS 1258, pages 198--210. Springer, 1997.

\bibitem{gore-ggg-igpl}
R~Gor{\'e}.
\newblock {G}aggles, {G}entzen and {G}alois: How to display your favourite
  substructural logic.
\newblock {\em Logic Journal of the Interest Group in Pure and Applied Logic},
  6(5):669--694, 1998.

\bibitem{gore-sld-igpl}
R~Gor{\'{e}}.
\newblock Substructural logics on display.
\newblock {\em Logic Journal of the Interest Group in Pure and Applied Logic},
  6(3):451--504, 1998.

\bibitem{heuerding-phd}
A Heuerding.
\newblock {\em Sequent Calculi for Proof Search in some Modal Logics}.
\newblock PhD thesis, Institute for Applied Mathematics and Computer Science,
  University of Berne, Switzerland, 1998.

\bibitem{matthews-smaill-basin-experience}
S.~Matthews, A.~Smaill, and D.~Basin.
\newblock Experience with {FS$_0$} as a framework theory.
\newblock In G~Huet and G~Plotkin (Eds), {\em Logical Environments}, pages
  61--82. Cambridge University Press, 1993.

\bibitem{matthews-implementing-isabelle}
S Matthews.
\newblock Implementing {FS$_0$} in isabelle: adding structure at the metalevel.
\newblock In J~Calmet and C~Limongelli (Eds), {\em Proc.\ Disco'96}.
  Springer, 1996.

\bibitem{mikhajlova-vonwright-isomorphism}
A Mikhajlova and J von Wright.
\newblock Proving isomorphism of first-order proof systems in {HOL}.
\newblock In J~Grundy and M~Newey (Eds), {\em Theorem Proving in
  Higher-Order Logics}, LNCS 1479, pages 295--314. Springer, 1998.

\bibitem{IsLogHOL}
T Nipkow, L~C. Paulson, and M Wenzel.
\newblock Isabelle's logics: {HOL}.
\newblock Technical report.
\newblock 15 February 2001, \texttt{doc/logics-HOL.dvi} in the Isabelle
  distribution.

\bibitem{isabelle-ref}
L~C. Paulson.
\newblock The {Isabelle} reference manual.
\newblock Technical report.
\newblock 15 February 2001, \texttt{doc/ref.dvi} in the Isabelle distribution.

\bibitem{pfenning-structural-lics94}
F Pfenning.
\newblock Structural cut elimination.
\newblock In {\em Proc.\ LICS 94}, 1994.

\bibitem{pym-harland-uniform}
D. Pym and J. Harland.
\newblock A Uniform Proof-theoretic Investigation of Linear Logic Programming.
\newblock J.\ of Logic and Computation 4:2:175-207, 1994.

\bibitem{szabo-gentzen}
M.~E. Szabo, editor.
\newblock {\em The collected papers of Gerhard Gentzen}, pages 68--131.
\newblock North-Holland, Amsterdam, 1969.

% \bibitem{wansing-strong}
% H~Wansing.
% \newblock Strong cut-elimination in display logic.
% \newblock {\em Reports on Mathematical Logic}, 29:117--131, 1995 (published
%   1996).

\bibitem{wansing-displaying-modal-logic}
H Wansing.
\newblock {\em Displaying Modal Logic}, volume~3 of {\em Trends in Logic}.
\newblock Kluwer Academic Publishers, Dordrecht, August 1998.

\end{thebibliography}

\newpage

\section{Appendix: An Operational View of Cut-Elimination}
\label{sec:an-operational-view}

In this section we give an operational view of cut-elimination, to
give some intuition of what is involved in the overall cut-elimination
procedure a l\`a Belnap \cite{belnap-display}. We assume familiarity
with notions like ``parametric ancestors'' of a cut formula from
\cite{belnap-display}.

% \subsection{Principal Cuts and Belnap's Condition C8}
% \label{subsec:princ-cuts-beln}

\begin{definition}
  An application of the cut rule is \emph{left-principal}
  [\emph{right-principal}] if the cut-formula is the principal formula
  of the left [right] premiss of the cut rule.
\end{definition}

\begin{figure}
\vpf
\begin{center}
\bpf
\A "$\Pi_{ZAB}$"
\U "$Z \vdash A, B$"
\U "$Z \vdash A \lor B$" "($\vdash \lor $)"
\A "$\Pi_{AX}$"
\U "$A \vdash X$"
\A "$\Pi_{BY}$"
\U "$B \vdash Y$"
\B [1em] "$A \lor B \vdash X, Y$" "($\lor \vdash $)"
\B [2em] "$Z \vdash X, Y$" "($cut$)"
\epf
\end{center}
\caption{Principal cut on formula $A \lor B$}
%\cc{fig-orcut-new}
\label{fig-orcut-new}
\end{figure}

\begin{figure}
\vpf
\begin{center}
\bpf
\A "$\Pi_{ZAB}$"
\U "$Z \vdash A, B$"
\U "$*A, Z \vdash B$" "($cs1$)"
\A "$\Pi_{BY}$"
\U "$B \vdash Y$"
\B [2em] "$*A, Z \vdash Y$" "($cut$)"
\U "$Z \vdash A, Y$" "($\overline{cs1}$)"
\U "$Z, *Y \vdash A$" "($cs2$)"
\A "$\Pi_{AX}$"
\U "$A \vdash X$"
\B [2em] "$Z, *Y \vdash X$" "($cut$)"
\U "$Z \vdash X, Y$" "($\overline{cs2}$)"
\epf
\end{center}
\caption{Transformed principal cut on formula $A \lor B$}
%\cc{fig-orcut'}
\label{fig-orcut'-new}
\end{figure}

Belnap's condition C8 guarantees that all principal cuts can be
replaced by cuts on strictly smaller formulae. For example, the
principal cut on $A \lor B$ shown in Figure~\ref{fig-orcut-new} can be
replaced by the derivation shown in Figure~\ref{fig-orcut'-new}.
The replacement derivation contains cuts only on $A$ and $B$, which
are smaller formulae than $A \lor B$.  
Where the cut-formula is a single formula variable, it is necessarily
introduced by the identity axiom:
such cuts can be removed trivially.

The transformation of a pricipal cut on $A$ into one or more cuts on
strict subformulae of $A$ is known as a ``principal move''. We now
need a way to turn arbitrary cuts into principal ones.
 
\begin{figure}
\hfill 
\bpf
\A "$\Pi[A]$"
\U "$Z[A] \vdash A $" "(intro-$A$)"
\U "$\Pi_L[A]$" "($\pi$)"
\U "$X \vdash A $" "($\rho$)"
\A "$\Pi_R$"
\U "$A \vdash Y$" 
\B [2em] "$X \vdash Y$" "($cut$)"
\epf
\hspace{0.1cm}
\bpf
\A "$\Pi'[Y]$"
\U "$Z[Y] \vdash A $" "(intro-$A$)"
\A "$\Pi_R$"
\U "$A \vdash Y$" 
\B [2em] "$Z[Y] \vdash Y$" "($cut$)"
\U "$\Pi_L[Y]$" "($\pi$)"
\U "$X \vdash Y $" "($\rho$)"
\epf
\hfill \mbox{}
\caption{Making a cut left-principal}
\label{fig-cut-p-new}
\end{figure}

In the case of a cut that is not left-principal, say we have a tree
like the one on the left in Figure~\ref{fig-cut-p-new}.  Then we
transform the subtree rooted at $X \vdash A$ by simply changing its
root sequent to $X \vdash Y$, and proceeding upwards, changing all
ancestor occurrences of $A$ to $Y$.  In doing this we run into
difficulty at each point where $A$ is introduced: at such points we
insert an instance of the cut rule.  The diagram on the right hand
side of Figure~\ref{fig-cut-p-new} shows this in the case where $A$ is
introduced at just one point.

In Figure~\ref{fig-cut-p-new}, the notation $\Pi_L[A]$ and $Z[A]$
means that the sub-derivation $\Pi_L$ and structure $Z$ may contain
occurrences of $A$ which are parametric ancestors of the cut-formula
$A$: thus (intro-$A$) is the lowest rule where $A$ is the principal
formula on the right of $\vdash$. The notation $\Pi_L[Y]$ and $Z[Y]$
means that all such ``appropriate'' instances of $A$ are changed to
$Y$: that is, instances of $A$ which can be traced to the instance
displayed on the right in $X \vdash A$.  The rules contained in the
new sub-derivation $\Pi_L[Y]$ are the same as the rules used in
$\Pi_L$; thus it remains to be proved that $\Pi_L[Y]$ is well-formed.
The resulting cut in the diagram on the right of
Figure~\ref{fig-cut-p-new} is left-principal.  Notice that the
original sub-derivation $\Pi$ may be transformed into a
\textit{different} sub-derivation $\Pi'$ during this process since the
parametric ancestors of $A$ occurring in $\Pi[A]$ will in turn need to
be ``cut away'' below where they are introduced, and replaced by $Y$.

There is one condition:
for a parametric move, the portion of the
derivation tree that is changed must not contain a cut: that is, there
must be no cuts in the parts of the trees in
Figure~\ref{fig-cut-p-new} shown below:

\hfill 
\bpf
\A "$Z[A] \vdash A $" 
\U "$\Pi_L[A]$" "($\pi$)"
\U "$X \vdash A $" "($\rho$)"
\epf
\hfill
\bpf
\A "$Z[Y] \vdash Y$" 
\U "$\Pi_L[Y]$" "($\pi$)"
\U "$X \vdash Y $" "($\rho$)"
\epf
\hfill \mbox{}

Subsequently, the ``mirror-image'' procedure is followed, to convert a
left-principal cut into one or more (left- and right-)principal cuts.

The process of making a cut left-principal, or of making a
left-principal cut (left and right) principal is called a ``parametric
move''.

\end{document}

%%% Local Variables: 
%%% mode: latex
%%% TeX-master: t
%%% End: