\section{Proofs of Cut-Elimination in Classical Propositional Logic}
The Isabelle proof described in \S\ref{s-gls-mix} took the form of a proof of
admissibility of the mix (or multicut) rule, in which multiple occurrences of
the cut-formula are removed at once.
We note that this proof was based on that in \cite{GR08} which does prove
cut- (rather than mix-) elimination.
Of course, given contraction rules (or contraction admissibility), cut- and
mix-admissibility are equivalent.
However the proof is different --- in proving mix-admissibility directly, the
inductive hypothesis has the form that when two sequents
$\mathcal S'$ and $\mathcal S''$ are derivable, then the result $\mathcal S$
of applying the mix rule to them is derivable.
But if we have an inductive hypothesis of cut-admissibility, when the pattern of
the inductive proof means that we can apply an inductive hypothesis to
$\mathcal S'$ and $\mathcal S''$, it does not follow that we can apply
contraction to $\mathcal S'$ and $\mathcal S''$ and then expect the inductive
hypothesis to apply to the results of those contractions.
This makes a proof of mix-admissibility easier than a proof of
cut-admissibility.
However we now describe a formalised proof of cut-admissibility for classical
propositional logic.
% sequents
We model a sequent as a pair of multisets of formulae, written \texttt{X |- Y}.
Thus the order of formulae within $X$ and $Y$ is not significant, but the number
of occurrences of a formula is.
Although some of our results are parameterised over an arbitrary
``subformula'' relation, and assume this to be well-founded,
we consistently use the ``immediate'' subformula relation
(eg, $(A, A \land B)$).
%rules
We use a system where the rules for the binary connectives are invertible,
so for example, we have
$$
\frac{X \vdash Y, A \quad X \vdash Y, B}{X \vdash Y, A \land B}
\qquad
\frac{X, A, B \vdash Y}{X, A \land B\vdash Y}
$$
Thus approach had the convenient feature that all rules could be considered
as a ``base rule'' extended by an arbitrary context, where the extension applies
to the conclusion and all premises (as shown below,
for the $\vdash\land$ and axiom rules).
$$
\frac{\vdash A \quad \vdash B}{\vdash A \land B}
\qquad
\frac{X \vdash Y, A \quad X \vdash Y, B}{X \vdash Y, A \land B}
\qquad \qquad
A \vdash A \quad X, A \vdash Y, A
$$
We defined systems with and without
explicit weakening and contraction rules.
First, the system \texttt{lksss} = \texttt{extrs lksne},
which is just the logical introduction rules
and the axiom.
We showed weakening admissibility \texttt{lksss\_wk\_adm},
and then contraction admissibility \texttt{lksss\_ctr\_adm}.
This required proving invertibility of the rules \texttt{lks\_inv\_rl}.
Next, we add the weakening rules to get the system
\texttt{lkxss \{\}} = \texttt{extrs lkxne}.
For this we showed contraction admissibility \texttt{lkxss\_ctr\_adm}.
Similarly to the above,
this required proving an invertibility result, \texttt{lkx\_inv\_rl}.
It does not require \emph{all} the rules to be invertible --- clearly, weakening
rules are not invertible.
These invertibility results were obtained from ``generalised'' invertibility
results \texttt{gen\_inv} and \texttt{gen\_inv\_wk}.
To describe the general results we need some definitions.
\begin{definition}[\texttt{id\_decomp}]
A rule set \textit{rls} has the \emph{identity decomposition property},
\texttt{id\_decomp} \textit{rls}, iff
whenever $\displaystyle \frac{ps}{A \vdash A} \in rls$,
the sequents in the list $ps$ are derivable.
\end{definition}
\begin{definition}[\texttt{unique\_concl}]
A rule set \textit{rls} has the \emph{unique conclusion property},
\texttt{unique\_concl} \textit{rls}, iff
no two distinct rules in \textit{rls} have the same conclusion.
\end{definition}
\begin{definition}[\texttt{iscrls}]
\emph{Identity} or \emph{single conclusion} rules,
\texttt{iscrls} are rules which are $A \vdash A$,
$ \vdash A$ or $A \vdash $, for some A
\end{definition}
\begin{lemma}[\texttt{gen\_inv, gen\_inv\_wk}]
When a set \textit{rls} of identity or single conclusion rules
satisfies the unique conclusion property,
and the set of their extensions satisfies the identity decomposition property,
then the set \textit{extrs rls} of their extensions are invertible (with respect
to the derivation system consisting of those rules).
They are also invertible with respect
to the derivation system consisting of those rules and the (extended) weakening
rules.
\end{lemma}
\begin{verbatim}
id_decomp_def : "id_decomp ?rls ==
ALL A ps. (ps, {#A#} |- {#A#}) : ?rls --> set ps <= derrec ?rls {}"
unique_concl_def : "unique_concl ?rls ==
ALL qs qs' d. (qs, d) : ?rls & (qs', d) : ?rls --> qs = qs'"
gen_inv : "[| ?rls <= iscrls; unique_concl ?rls; id_decomp (extrs ?rls);
?x : extrs ?rls |] ==> inv_rl (extrs ?rls) ?x"
gen_inv_wk : "[| ?rls <= iscrls; ?rlsw = ?rls Un wkrls;
unique_concl ?rls; id_decomp (extrs ?rls); ?x : extrs ?rls |] ==>
inv_rl (extrs ?rlsw) ?x"
\end{verbatim}
For contraction admissibility we have similar generalised results.
\begin{definition}[\texttt{subfml\_cp\_prop}]
A rule \textit{(ps, c)} has the \emph{subformula property},
\texttt{subfml\_cp\_prop} \textit{(ps, c)}, iff, for every $p$ in $ps$,
either $p >= c$ or, every formula in $p$ has a sub-formula in $c$.
Here $p >= c$ means that the antecedent of $c$ is a sub-multiset of the
antecedent of $p$, and similarly for their succedents.
\end{definition}
\begin{lemma}[\texttt{gen\_ctr\_adm, gen\_ctr\_adm\_wk}]
When a set \textit{rls} of identity or single conclusion rules
satisfies the subformula property,
and the set of their extensions are invertible,
then contraction is admissible
(with respect to the derivation system consisting of extensions of \textit{rls}).
Similarly if the weakening rules are included (they need not be invertible
or satisfy the subformula property).
\end{lemma}
\begin{verbatim}
subfml_cp_prop.simps : "subfml_cp_prop ?sub (?ps, ?c) =
(ALL p:set ?ps. ?c <= p |
(ALL fp. ms_mem fp p --> (EX fc. ms_mem fc ?c & (fp, fc) : ?sub)))"
gen_ctr_adm : "[| ?A : wfp ?sub; ?rls <= iscrls;
Ball ?rls (subfml_cp_prop ?sub);
Ball (extrs ?rls) (inv_rl (extrs ?rls)) |] ==> ctr_adm (extrs ?rls) ?A"
gen_ctr_adm_wk : "[| ?A : wfp ?sub; ?rls <= iscrls; ?rlsw = ?rls Un wkrls;
Ball ?rls (subfml_cp_prop ?sub);
Ball (extrs ?rls) (inv_rl (extrs ?rlsw)) |] ==> ctr_adm (extrs ?rlsw) ?A"
\end{verbatim}
The results mentioned above follow as corollaries.
\begin{corollary}
For the systems \texttt{lksss} and \texttt{lkxss \{\}},
\begin{itemize}
\item the logical introduction rules are invertible
\item contraction is admissible
\end{itemize}
\end{corollary}
\begin{verbatim}
lks_inv_rl : "Ball (extrs lksne) (inv_rl (extrs lksne))"
lkx_inv_rl : "Ball (extrs lksne) (inv_rl (extrs lkxne))"
lksss_ctr_adm : "ctr_adm lksss ?A"
lkxss_ctr_adm : "ctr_adm (lkxss {}) ?A"
\end{verbatim}
% CUT-ADMISSIBILITY of lksss and lkxss {}
We now look at cut-admissibility for these systems.
We obtained a general cut-admissibility result, which depends on a further
definition.
\begin{definition}[\texttt{cas, car}]
A pair of sequents $((X_l \vdash Y_l), (X_r \vdash Y_r))$
is in the set \texttt{cas} \textit{rls} $A$
if cut on $A$ is admissible, that is:
if $(X_l \vdash Y_l)$ and $(X_r \vdash Y_r)$
are derivable (using rules \textit{rls}),
then $(X_l, (X_r - A) \vdash (Y_l - A), Y_r)$ is derivable.
A pair of sequents $((X_l \vdash Y_l), (X_r \vdash Y_r))$
is in the set \texttt{car} \textit{rls} $A$ iff
$(X_l, (X_r - A) \vdash (Y_l - A), Y_r)$ is derivable.
\end{definition}
\begin{definition}[\texttt{c8s\_prop}]
A set of rules satisfies the \emph{C8 property}
(so-called following its use in Display Logic)
if, whenever a formula is introduced on both sides
(as shown by way of an example where $A = A' \lor A''$),
and assuming cut-admissibility for smaller cut-formulae,
we have admissibility of the cut shown.
\end{definition}
\begin{prooftree}
\AxiomC{$X_l \vdash Y_l, A', A''$}
\RightLabel{$(\vdash\lor)$}
\UnaryInfC{$X_l \vdash Y_l, A' \lor A''$}
\AxiomC{$A', X_r \vdash Y_r$}
\AxiomC{$A'', X_r \vdash Y_r$}
\RightLabel{$(\lor\vdash)$}
\BinaryInfC{$A' \lor A'', X_r \vdash Y_r$}
\RightLabel{\textit{cut?}}
\BinaryInfC{$X_l, X_r \vdash Y_l, Y_r$}
\end{prooftree}
\begin{verbatim}
car_eq : "((?Xl |- ?Yl, ?Xr |- ?Yr) : car ?rls ?A) =
((?Xl + (?Xr - {#?A#}) |- ?Yl - {#?A#} + ?Yr) : derrec ?rls {})"
car_eq' : "((?X |- mins ?A ?Y, mins ?A ?U |- ?V) : car ?rls ?A) =
((?X + ?U |- ?Y + ?V) : derrec ?rls {})"
cas_eq : "((?seql, ?seqr) : cas ?rls ?A) =
(?seql : derrec ?rls {} & ?seqr : derrec ?rls {} -->
(?seql, ?seqr) : car ?rls ?A)"
c8s_prop_def :
"c8s_prop ?isubfml ?rls ?A == ALL ps qs pse qse Xl Yl Xr Yr.
(ps, 0 |- {#?A#}) : ?rls --> (qs, {#?A#} |- 0) : ?rls -->
pse = map (extend (Xl |- Yl)) ps --> qse = map (extend (Xr |- Yr)) qs -->
set pse Un set qse <= derrec (extrs ?rls) {} -->
(ALL sfa. (sfa, ?A) : ?isubfml --> (ALL Xl Yl Xr Yr.
(Xl |- mins sfa Yl, mins sfa Xr |- Yr) : cas (extrs ?rls) sfa)) -->
(Xl |- mins ?A Yl, mins ?A Xr |- Yr) : car (extrs ?rls) ?A"
\end{verbatim}
Then we have the general cut-admissibility theorem:
\begin{theorem}
If the basic rules in set \textit{rls} are single-conclusion or identity rules,
and \textit{rls} also contains the identity rules,
and if \textit{rls} satisfies the C8 property,
then for the system \texttt{extrs rls}, if contraction is admissible,
then cut is admissible.
\end{theorem}
\begin{corollary}
Cut is admissible for the systems \texttt{lksss} and \texttt{lkxss \{\}}.
\end{corollary}
\begin{verbatim}
gen_scas_all : "[| wf ?psubfml; All (c8s_prop ?psubfml ?rls);
All (ctr_adm (extrs ?rls)); ?rls <= iscrls;
ALL A. ([], {#A#} |- {#A#}) : ?rls |] ==>
(?Xl |- mins ?A ?Yl, mins ?A ?Xr |- ?Yr) : cas (extrs ?rls) ?A"
lks_scas_all : "(?Xl |- mins ?A ?Yl, mins ?A ?Xr |- ?Yr) : cas lksss ?A"
lkx_scas_all : "(?Xl |- mins ?A ?Yl, mins ?A ?Xr |- ?Yr) : cas (extrs lkxne) ?A"
\end{verbatim}
% method of proof of cut-admissibility
To prove the cut-admissibility result, we used an induction scheme which permits
inductive proof of any predicate which depends on \emph{two} derivable sequents.
The inductive step in such a proof is defined by \texttt{gen\_step2sr}.
\begin{verbatim}
gen_step2sr_def' : "gen_step2sr ?P ?A ?sub ?rls ((?psl, ?cl), ?psr, ?cr) =
((ALL A'. (A', ?A) : ?sub --> (ALL dl:derrec ?rls {}.
ALL dr:derrec ?rls {}. ?P A' (dl, dr))) -->
(ALL pl:set ?psl. pl : derrec ?rls {} & ?P ?A (pl, ?cr)) -->
(ALL pr:set ?psr. pr : derrec ?rls {} & ?P ?A (?cl, pr)) -->
?cl : derrec ?rls {} --> ?cr : derrec ?rls {} --> ?P ?A (?cl, ?cr))"
gen_step2sr_lem : "[| ?A : wfp ?sub;
ALL A. ALL (psa, ca):?rls. ALL (psb, cb):?rls.
gen_step2sr ?P A ?sub ?rls ((psa, ca), psb, cb);
?seqa : derrec ?rls {}; ?seqb : derrec ?rls {} |] ==>
?P ?A (?seqa, ?seqb)"
\end{verbatim}
It is this which we need to prove for every possible last derivation steps on
either side of the desired cut.
Let the last derivation step on the left be $(psl, cl)$,
and, on the right, $(psr, cr)$.
It concerns a property $P$ which depends on a formula $A$ (typically, that
cut-admissibility holds for $A$).
We assume that cut-admissibility holds for sub-formulae $A'$ of $A$.
We also assume that for each premise $pl$ in $psl$, $P\ A\ (pl, cr)$ holds,
and that for each premise $pr$ in $psr$, $P\ A\ (cl, pr)$ holds,
and prove that these imply that $P\ A\ (cl, cr)$ holds.
If we can do this for all then this is sufficient to prove that
$P\ A\ (seql, seqr)$ holds for any pair of derivable sequents
(\texttt{gen\_step2sr\_lem}).
This approach had the advantage that we could prove separate lemmas
stating that \texttt{gen\_step2sr} holds for various rules in the last position
on either side of the cut, rather than having to prove this as part of one
single huge result.
The same approach was taken in the inductive proofs of
logical introduction rule invertibility and contraction admissibility,
although these cases are simpler because there is one preceding derivation to
deal with, not two.
Note that this induction scheme is ``induction on the fact of derivation'' of
the premises of the cut, where we assume the desired result holds for the
preceding step of the derivation.
A proof of these results might
instead use ``induction on the size of the derivation'',
assuming the desired result holds for any smaller derivation.
As we see below, this approach would be more general.
But so far our formalisation of these results has not required
a derivation whose size can be measured or compared
(although this is described in \S\ref{s-dtobj}).
% adding explicit contraction rules
Finally, we add the contraction rules to get the system
\texttt{lkxcss \{\}} = \texttt{extrs lkxcne}.
We now consider proving cut-admissibility of this system.
The easiest way is to simply omit the contraction rules and use the previous
results (for \texttt{lkxss \{\}} = \texttt{extrs lkxne}),
since both contraction and cut are admissible in that system.
One could also use an inductive argument similar to that described above to
prove mix-admissibility (ie, where the property $P$ is that the mix, or
multicut, rule, rather than the cut rule, is admissible).
TODO - try this - how different/difficult is it?
But we want to explore what is involved in proving cut-admissibility
directly, by induction: ie, admissibility of a single cut is
the property which we prove by assuming it as an inductive hypothesis.
Now where the last step on both sides is contraction of the cut-formula,
preceded by introduction of that formula, it can be seen that we need
invertibility of the logical introduction rules
(whereas in the work described above, we needed invertibility to
prove contraction admissibility).
Further, to prove this in the presence of explicit contraction rules
we need an inductive hypothesis of ``multi-invertibility''
(which means, for example, that if $X, (A \land B)^n \vdash Y$ is derivable then
$X, A^n \vdash Y$ and $X, B^n \vdash Y$ are derivable).
So we proved multi-invertibility as \texttt{lkxc\_inv\_rl}.
Then, in the scenario where the last $n$ steps are
contraction of the cut-formula,
preceded by some other rule $\rho$,
we typically need to perform first the $n$ contractions,
then the cut (by induction), then the rule $\rho$.
This means that the inductively justified cut is performed on a sequent
which does not appear in the original derivation.
Thus the inductive scheme described above will not work:
instead we can use induction on the height of the derivation tree
which leads to the premises of the cut.
See the diagram below.
So we need to use explicit derivation trees as described in \S\ref{s-dtobj}.
Analogous to \texttt{gen\_step2sr} we define a predicate
\texttt{height\_step2\_tr} which expresses the key step of a proof
of a property of two sequents by induction on height of their derivations.
(In the case of \texttt{height\_step2\_tr}, though, we need to put the condition
that the sequents be derivable into $P$).
Then the result \texttt{height\_step2\_tr\_lem} expresses the result that
if we can prove this step for all derivation trees then $P\ A$ holds for all
derivation trees.
\begin{verbatim}
height_step2_tr_eq : "height_step2_tr ?P ?A ?sub (?dta, ?dtb) =
((ALL A'. (A', ?A) : ?sub --> (ALL a b. ?P A' (a, b))) -->
(ALL dtp. (dtp, ?dta) : measure heightDT --> ?P ?A (dtp, ?dtb)) -->
(ALL dtq. (dtq, ?dtb) : measure heightDT --> ?P ?A (?dta, dtq)) -->
?P ?A (?dta, ?dtb))"
height_step2_tr_lem : "[| ?A : wfp ?sub;
ALL A dta dtb. height_step2_tr ?P A ?sub (dta, dtb) |] ==>
?P ?A (?dta, ?dtb)"
\end{verbatim}
The proof of cut-admissibility then goes like this:
First, consider the case where the cut-formula is compound (eg $A' \land A''$).
Since we have invertibility of the logical introduction rules, we can use
this fact and the inductive assumption that cuts on $A'$ or $A''$
are admissible, to get admissibility of the cut on $A' \land A''$.
The significant part of the proof in this case
is reused from the cut-admissibility of \texttt{lksss}.
Then we have to consider the case where the cut-formula is atomic.
We need to trace up contractions on that cut-formula on either side and
see what the next rule above that is. Note that the number of consecutive
contractions on either side may be zero --- it turns out that there is no need
to treat this case differently from the case where there is one or more
contractions.
There was one exception to this --- the case of a weakening on the cut-formula
is slightly different depending on whether there are zero or more contractions
below the weakening.
\footnote{We wasted considerable time and effort before
discovering this fact, in that we dealt with various cases for the rule
immediately above the cut being weakening, a parametric case of contraction,
etc}.
Here is the case of a parametric derivation step immediately above the topmost
contraction on the cut-formula.
The rule $\rho$ which changes $X_l'$ and $Y_l'$ to $X_l$ and $Y_l$ has to
allow arbitrary context, so that it can be applied after the cut.
\begin{prooftree}
\AxiomC{$X_l' \vdash Y_l', A^n$}
\RightLabel{$(\rho)$}
\UnaryInfC{$X_l \vdash Y_l, A^{n+1}$}
\RightLabel{$(ctr ^ n)$}
\UnaryInfC{$X_l \vdash Y_l, A$}
\AxiomC{$A, X_r \vdash Y_r$}
\RightLabel{\textit{cut?}}
\BinaryInfC{$X_l, X_r \vdash Y_l, Y_r$}
\end{prooftree}
\begin{prooftree}
\AxiomC{$X_l' \vdash Y_l', A^n$}
\RightLabel{$(ctr ^ n)$}
\UnaryInfC{$X_l' \vdash Y_l', A$}
\AxiomC{$A, X_r \vdash Y_r$}
\RightLabel{\textit{cut?}}
\BinaryInfC{$X_l', X_r \vdash Y_l', Y_r$}
\RightLabel{$(\rho)$}
\UnaryInfC{$X_l, X_r \vdash Y_l, Y_r$}
\end{prooftree}
The remaining cases are where the cut-formula $A$ is atomic (otherwise we
we have the case where invertibility applies, as mentioned above).
These cases are the axiom $A \vdash A$ or weakening on $A$, and are easy:
in the case of $A \vdash A$, we have that $A$ is in $X_l$ or $Y_r$,
and in the case of weakening, it either reverses the last contraction (if any),
or, if there were no contractions, is a trivial case.
STILL TO DO - get a generalised form of this theorem, as gen\_scas\_all
\begin{verbatim}
lkxc_inv_rl : "Ball (extrs (multiples lksne)) (inv_rl (extrs lkxcne))"
lkxc_scas_all :
"(?Xl |- mins ?A ?Yl, mins ?A ?Xr |- ?Yr) : cas (extrs lkxcne) ?A"
\end{verbatim}