\section{The Proof of Cut-Admissibility for \GLS} \label{s-caglr}
Valentini's proof of cut-admissibility for \GLS{} uses a triple
induction on the size of the cut-formula, the heights of the
derivations of the left and right premises of cut, and a third
parameter which he called the ``width''.
Roughly speaking, the width of a cut-instance is the number of \GLR{}
rule instances above that cut which contain a parametric ancestor of
the cut-formula in their conclusion. The Gor\'e and
Ramanayake~\cite{gr-valentini} proof follows Valentini in using these
three measures, but gives a constructive way to calculate the width of
a cut by inspecting the branches of the left and right sub-derivations
of a cut rule instance.
\begin{figure}[t]
\centering
\normalsize
\bpf
\A "$\displaystyle \mu\left\{\ \frac{\Pi_l}{\GLRp}\right.$"
\U "\GLRc" "\GLR(B)"
\A "$\Pi_r$"
\U "${\Box B^k, Y \vdash Z}$" "($\rho$)"
\B [2em] "${\Box X, Y \vdash Z}$" "(\textit{multicut})"
\epf
\caption{A multicut on cut formula $\Box B$ where $\Box B$ is
left-principal via \GLR{}}
\label{f-muxbn}
\end{figure}
As is usual, the proof of cut-admissibility for \GLS{} proceeds by
considering whether the cut-formula is principal only in the left
premise of the cut, principal only in the right premise of the cut,
or principal in both. The crux of the proof is a ``reduction'' when
the cut-formula is of the form $\Box B$ and is principal in both
the left and right premises of the cut as shown below:
%
\begin{prooftree}
\AxiomC{$X, \Box X, \Box B \vdash B$}
\RightLabel{GLR(B)}
\UnaryInfC{$\Box X \vdash \Box B$}
%
\AxiomC{$Y, \Box Y, B, \Box B, \Box C\vdash C$}
\RightLabel{GLR(C)}
\UnaryInfC{$\Box B, \Box Y \vdash \Box C$}
%
\dottedLine
\RightLabel{cut}
\BinaryInfC{$\Box X, \Box Y \vdash \Box C$}
\end{prooftree}
%
The solution is to replace this cut on $\Box B$ by cuts which are
``smaller'' either because their cut-formula is smaller, or because
their width is smaller. In reality, most of the work involves a cut
instance which is only left-principal as shown in
Figure~\ref{f-muxbn}, and most of this section is devoted to how we
utilised our general methods to formalise these ``reductions'' as
given by Gor\'e and Ramanayake~\cite{gr-valentini}. But there are some
important differences which we now explain.
As explained in \S\ref{s-dp}, our general methods do not model
derivation trees explicitly, but use a shallow embedding. So our proof
uses induction on the size of the cut-formula and on the fact of
derivation, rather than on the size of the cut-formula and the height
of derivation trees. Also, we actually proved the admissibility of
``multicut'': that is, if $\Gamma \vdash A^n, \Delta$ and $\Gamma,
A^m \vdash \Delta$ are both cut-free derivable, then so is $\Gamma
\vdash \Delta$. This avoids problems in the cases where these
sequents are derived using contraction on $A$.
In all other aspects, our proof of cut-admissibility for \GLS\ is
based on that given by Gor\'e and Ramanayake~\cite{gr-valentini}. In
particular, although we do not actually require
\cite[Lemma~19]{gr-valentini}, we use the construction in that lemma,
which is fundamental to overcoming the difficulties of adapting
standard proofs to \GLS, as we explain shortly. Consequently, our
proof uses the idea of induction on the \emph{width} as defined in
\cite{gr-valentini}, although as we shall see, our proof is expressed
in terms of \delz, which approximates to the \delzs\ of
\cite{gr-valentini}, not width {\em per se}.
To cater for width/\delz, we could have altered our shallow embedding,
\texttt{derrec} but that destroys the modularity of our
approach. Instead, we defined width/\delz{} by using the datatype
\texttt{dertree} from \S\ref{s-dse} to represent explicit derivation
tree objects. These trees, and the general lemmas about them, are
similar to the trees of \cite{dawson-gore-cut-admissibility}. Thus we
``mix and match'' a deep embedding of derivation trees with a shallow
embedding of inductively defined sets of derivable sequents.
To ensure the correctness of our ``mixing and matching'' we needed the
following relationship between the two embeddings, where a
\textit{valid} tree is one whose inferences are in the set of rules
and which as a whole has no premises.
\begin{lemma}
Sequent \texttt{a} is shallowly derivable from the empty set of
premises using rules \texttt{rls} iff some derivation tree
\texttt{dt} is valid wrt.\ \texttt{rls} and has a conclusion
\texttt{a}.
\begin{verbatim}
derrec_valid_eq :
"(?a : derrec ?rls {}) = (EX dt. valid ?rls dt & conclDT dt = ?a)"
\end{verbatim}
\end{lemma}
% The provability logic \GL\
% and its sequent system \GLS, including the \GLR\ rule,
% are described in \S\ref{s-intro}.
% It is clear that the \GLR\ rule implies the transitivity rule
% (by weakening $\Box B$ from the premise of \GLR),
% as shown on the left below;
% the rule on the right holds,
% as can be shown using \textit{(cut)} by the derivation in Figure~\ref{f-dlxxc},
% or by consideration of the semantics.
% $$
% \begin{array}{c}
% \lxxc \\ \hline \GLRp \\ \hline \GLRc
% \end{array}
% \qquad \qquad
% \frac \GLRp \lxxc
% $$
We now define a function \delz{} which is closely related to
$\partial^0$ and the width of a cut of \cite{gr-valentini}, although
we are able to avoid mentioning the annotated derivations of
\cite{gr-valentini}. The argument of this function will be the
sub-tree $\mu$ of Figure~\ref{f-muxbn}. Here, \texttt{glra} is the
set of \emph{all} \GLR{}-rule instances, \texttt{antec} is a function to return
the antecedent of a sequent, and \texttt{:\#} is multiset membership:
%
\begin{verbatim}
primrec
Der : "del0 B (Der seq dtl) =
(if (map conclDT dtl, seq) : glra then
if Box B :# antec seq then 1 else 0
else del0s B dtl)"
Unf : "del0 B (Unf dt) = 0"
Nil : "del0s B [] = 0"
Cons : "del0s B (dt # dts) = del0 B dt + del0s B dts"
\end{verbatim}
%
Thus, the calculation of \delz\ traces up each branch of a derivation
tree until an instance of the \GLR\ rule is found: it then counts 1 if
$\Box B$ is in the antecedent, meaning that $B$ is in the $X$ of the
statement of \GLR. Where the derivation tree branches below any \GLR\
rule, the value is summed over the branches.
We now give a sequence of lemmata which eventually give the
``reduction'' for left-and-right \GLR{}-principal occurrences
of a cut-formula of the form $\Box B$.
\begin{lemma}[\texttt{gr20e}] \label{l-gr20e} If $\mu$ is a valid
derivation tree with conclusion \GLRp{}, and \delz\ $\mu = 0$, then
\lxxc\ is derivable (shallowly).
\end{lemma}
\begin{verbatim}
"[| valid glss ?mu ; del0 ?B ?mu = 0 ; (conclDT ?mu) =
(mset_map Box ?X + ?X + {# Box ?B#} |- {#?B#}) |]
==> (mset_map Box ?X + ?X |- {#?B#}) : derrec glss {}" : Thm.thm
\end{verbatim}
\begin{proof}
By applying the \GLR\ rule to the conclusion, we can derive \GLRc.
Tracing $\Box B$ upwards in $\mu$, it is parametric in each
inference, except possibly weakening, contraction or the axiom $\Box
B \vdash \Box B$. That is, since \delz\ $\mu = 0$, when we meet a
\GLR\ inference as we trace upwards, $\Box B$ must have already
disappeared (through weakening). So, tracing upwards, we can
change each instance of $\Box B$ to $\Box X$ in the usual way.
The axiom $\Box B \vdash \Box B$ is changed to \GLRc, which is
derivable. Contraction is not problematic since we use, as the
inductive hypothesis, that \emph{all} occurrences of $\Box B$ can be
replaced by $\Box X$.
\end{proof}
To abbreviate the statement of several lemmas, we define a function
\muxbn:
\begin{verbatim}
defs muxbn_def : "muxbn B n ==
ALL mu dtb seq. ([conclDT mu], seq) : glr B -->
del0 B mu <= n --> (Der seq [mu], dtb) : masdt glss (Box B)"
\end{verbatim}
Intuitively, \texttt{muxbn B n} states that if the left premise
derivation consists of a derivation $\mu$ whose conclusion can be used
as the premise of an instance of the \GLRB\ rule, and \delz~\texttt{B}
$\mu \leq n$, then multicut-admissibility for explicit derivation
trees (\texttt{masdt}) holds for the rule set \texttt{glss} with cut-formula $\Box B$:
see Figure~\ref{f-muxbn}, which is from~\cite{gr-valentini}).
% (In the following diagram, $\mu = \pi / \GLRp$).
That is, the multicut in Figure~\ref{f-muxbn} is admissible.
% \vpf
% \begin{figure}
% \centering
% \normalsize
% \bpf
% \A "$\displaystyle \mu\left\{\ \frac{\pi}{\GLRp}\right.$"
% \U "\GLRc" "\GLR"
% \A "$\sigma$"
% \U "${\Box B^k, U \vdash W}$"
% \B [2em] "${\Box X, U \vdash W}$" "(\textit{multicut})"
% \epf
% \caption{Definition of \muxbn}
% \label{f-muxbn}
% \end{figure}
The next lemma says that multicut admissibility on $B$
implies \muxbn\ $B$ 0.
\begin{lemma}[\texttt{cw0\_lbl, cw0\_mas'}] \label{l-cw0-mas'} If
$\mu$ is a valid derivation tree with conclusion \GLRp, and \delz\
$B\ \mu = 0$, and multicut on $B$ is admissible, and $\Box B^k, Y
\vdash Z$ is derivable, then $\Box X, Y \vdash Z$ is derivable.
\end{lemma}
% \begin{verbatim}
% cw0_mas' : "mas glss ?B = UNIV ==> muxbn ?B 0"
% \end{verbatim}
\begin{proof}
\GLRc{} is derivable from \GLRp{} via \GLR(X,B). By
Lemma~\ref{l-gr20e}, \lxxc\ is derivable. The rest of the proof is
by induction on the derivation of $\Box B^k, Y \vdash Z$, in effect,
by tracing relevant occurrences of $\Box B$ up that
derivation. Weakening and contraction involving $\Box B$ are not
problematic, and the axiom $\Box B \vdash \Box B$ is changed to
\GLRc, which is derivable. Suppose an inference $\GLR(Y,C)$ is
encountered, where $B$ is in $Y$. This inference, allowing for
multiple copies of B, is as shown below left where $Z$ is $Y$ with
$B$ deleted:
$$
\bpf
\A "$\Box B^k, B^k, \Box Z, Z, \Box C \vdash C$"
\U "$\Box B^k , \Box Z \vdash \Box C$" "\GLR(Y,C)"
\epf
\hspace*{-2em}
\bpf
\A "Lemma~\ref{l-gr20e}"
\U "\lxxc"
\A "${\Box X, B^k, \Box Z, Z, \Box C \vdash C}$"
\B "${\Box X, \Box X, X, \Box Z, Z, \Box C \vdash C}$"
"\textit{mcut(B)}"
\U "${\Box X, X, \Box Z, Z, \Box C \vdash C}$" "\textit{ctr}"
\U "${\Box X, \Box Z \vdash \Box C}$" "\GLR(C)"
\epf
$$
By induction, we have
${\Box X, B^k, \Box Z, Z, \Box C \vdash C}$ is derivable.
From there we have the derivation shown above right.
As the machine-checking process confirmed,
additional weakening steps are necessary
if $\Box B$ is in $Z$ or if $B$ is in $\Box Z$.
% \vpf
% \begin{center}
% \bpf
% \A "\lxxc"
% \A "${\Box X, B^k, \Box Z, Z, \Box C \vdash C}$"
% \B "${\Box X, \Box X, X, \Box Z, Z, \Box C \vdash C}$"
% "(\textit{multicut on B})"
% \U "${\Box X, X, \Box Z, Z, \Box C \vdash C}$" "(\textit{contraction})"
% \U "${\Box X, \Box Z \vdash \Box C}$" "(\GLR(C))"
% \epf
% \end{center}
\qed
\end{proof}
This construction is like that of case 2(a)(ii) in the proof of
Theorem~22 of \cite{gr-valentini}. Having shown, in
Lemma~\ref{l-cw0-mas'}, that \muxbn\ $B$ 0 holds, we now use the
construction of \cite[Lemma~19]{gr-valentini} to show that \muxbn\ $B$
$n$ holds for all $n$: except that our inductive
assumptions and results involve admissibility of multicut, not cut.
Again we use induction: assume that \muxbn\ $B$ $n$, and show that
\muxbn\ $B$ $(n+1)$ holds.
So let a derivation tree $\mu / \GLRc$ be given,
whose bottom inference is \GLRXB, and where \delz\ $B\ \mu = n + 1$.
We follow the construction of \cite[Lemma~19]{gr-valentini}
to obtain a derivation $\mu'$ of \GLRp,
where \delz\ $B\ \mu' \leq n$.
% We first look at deriving \lxxc.
Since \delz\ $B\ \mu > 0$, the tree $\mu$ is as shown below left
(with other branches not shown).
We first delete the topmost application of \GLR{} obtaining
a tree $\mu^{-}$.
Then adjoin $\Box A$ to each antecedent of $\mu^{-}$
obtaining the tree on the right,
whose topmost sequent is now a weakened axiom,
and which requires us to weaken in an occurrence of
$A$ just above the final GLR{}-rule instance:
\vpf
\begin{center}
\bpf
\A "${\Box G, G, \Box B^k, B^k, \Box A\vdash A} $"
\U "\gbna" "\GLR(A)"
\U "\vdots"
\U "\GLRp"
\U "\GLRc" "\GLR(X,B)"
\epf
\qquad
\bpf
% \A "$\Box A \vdash \Box A$"
% \U "$\Box A, A, \gbna$" "(\textit{weakening})"
\A "$\Box A, \gbna$"
\U "\vdots"
\U "$\Box A, \GLRp$"
\U "$\Box A, A, \GLRp$" "(\textit{weakening})"
\U "$\Box A, \GLRc$" "\GLR(B)"
\epf
\end{center}
The tree on the right has a smaller value of \delz\ $B$, and so can be
used as the left branch of an admissible (i.e.\ ``smaller'') multicut.
We do this twice, the right branches being derivations of \GLRp\ and
${\Box G, G, \Box B^k, B^k, \Box A\vdash A} $ respectively; which
after contractions, give derivations of $\Box A, \lxxc$ and ${\Box G,
G, \Box X, B^k, \Box A\vdash A}$. These two are cuts 1 and 2 of the
description in \cite[Lemma~19]{gr-valentini}, producing derivations
which are cut-free equivalents of $\Lambda_{11}$ and $\Lambda_{12}$
from \cite[Lemma~19]{gr-valentini}. The result \texttt{gr20a} in the
Isabelle code %Lemma~\ref{l-gr20a}
gives the existence of these two derivations; it uses the result
\texttt{add\_Box\_ant} which permits adding $\Box A$ to the antecedent
of each sequent of a derivation tree.
We combine these derivations of $\Box A, \lxxc$ and $\Box G, G, \Box
X, B^k, \Box A\vdash A$ using an admissible (``smaller'') multicut on
$B$, and then use contraction to obtain $\Box G, G, \Box A, \Box X, X
\vdash A$. This is cut 3 of~\cite[Lemma~19]{gr-valentini}. Then the
\GLR\ rule gives $\Box G, \Box X \vdash \Box A$. This is derivation
$\Lambda_{2}$ of \cite[Lemma~19]{gr-valentini}. Because of this \GLR\
rule at this point, we do not need to worry about the \delz\ $B$
values of any of the subtrees mentioned above, whose existence is
given by the inductive assumptions of multicut-admissibility. We now
weaken the conclusion of this tree to $\Box X, \Box G, \Box B^k \vdash
\Box A$, giving (a counterpart of) $\Lambda_{3}$ of
\cite[Lemma~19]{gr-valentini}.
\begin{figure}[t]
\centering
\normalsize
\bpf
\A "\GLRp"
\U "\GLRc" "(\GLR)"
\A "\GLRp"
\B "${\Box X, \Box X, X \vdash B}$" "(\textit{cut})"
\U "\lxxc" "(\textit{contraction})"
\epf
\caption{Derivation of \lxxc\ from \GLRp}
\label{f-dlxxc}
\end{figure}
Returning to $\mu^{-}$, we this time adjoin $\Box X$ on
the left giving the tree below right,
but we can now use $\Lambda_{3}$ as a derivation for its leaf:
\vpf
\begin{center}
\bpf
\A "\gbna"
\U "\vdots"
\U "\GLRp"
\U "\GLRc" "\GLR"
\epf
\qquad \qquad
\bpf
\A "$\Lambda_3$"
\U "$\Box X, \gbna$"
\U "\vdots"
\U "$\Box X, \GLRp$"
\U "$\GLRp$" "(\textit{contraction})"
\U "\GLRc" "\GLR"
\epf
\end{center}
That is, we have replaced a given derivation $\mu$ of \GLRp\, where
\delz\ $B\ \mu = n + 1$, with a derivation $\mu'$ of \GLRp\ where
\delz\ $B\ \mu' = n$.
We record this as the result as Lemma~\ref{l-gr20bc}~(\texttt{gr20b})
below, however, we do not use this result directly. Instead, we obtain
Lemma~\ref{l-gr20bc}~(\texttt{gr20c'}) below by referring to
Figure~\ref{f-muxbn} and noting that we have replaced $\mu$ by $\mu'$.
\comment{
\begin{verbatim}
PROBABLY OMIT THIS - AT PRESENT WE HAVE A DESCRIPTION OF THIS RESULT BUT NO
FORMAL STATEMENT
gr20a; "[| ([conclDT ?dtglp], ?seq) : glrxb ?X ?B;
valid glss ?dtglp; del0 ?B ?dtglp = Suc ?n; muxbn ?B ?n |]
==> EX A H dtH.
({#Box A#} + mset_map Box ?X + ?X |- {#?B#}) : derrec glss {} &
({#Box A#} + mset_map Box (?X + ms_delete {?B} H) + H |- {#A#})
: derrec glss {} &
premsDT dtH = [mset_map Box H |- {#Box A#}] &
conclDT dtH = conclDT ?dtglp &
allDT (frg glss) dtH & del0 ?B dtH <= ?n" : Thm.thm
\end{verbatim}
}
\begin{lemma}[\texttt{gr20b, gr20c'}] \label{l-gr20bc}
Assume that multicut-admissibility holds for cut-formula $B$,
and that \muxbn\ $B$ $n$ holds.
\begin{enumerate}
\item \label{l-gr20b}
If $\mu$ is a derivation of \GLRp, where \delz\ $B\ \mu = n + 1$,
then there exists another derivation $\mu'$ of \GLRp\ with
\delz\ $B\ \mu' \leq n$.
\item \label{l-gr20c} \muxbn\ $B$ $(n+1)$ holds.
\end{enumerate}
\begin{verbatim}
gr20b : "[| ((conclDT ?mu) =
(mset_map Box ?X + ?X + {# Box ?B#} |- {#?B#})) ;
valid glss ?mu ; del0 ?B ?mu = Suc ?n ;
mas glss ?B = UNIV; muxbn ?B ?n |]
==> EX dtn.
valid glss dtn & conclDT dtn = conclDT ?mu & del0 ?B dtn <= ?n"
gr20c' : "[| muxbn ?B ?n; mas glss ?B = UNIV |] ==> muxbn ?B (Suc ?n)"
\end{verbatim}
\end{lemma}
\begin{proof} The proof of the first part is the construction
described above. For the second part, we are given an instance of
Figure~\ref{f-muxbn}, where $\mu$ has \delz\ $B\ \mu = n + 1$.
Using the first part, we can transform $\mu$ to $\mu'$, with \delz\
$B\ \mu' = n$. Since \muxbn\ $B$ $n$ holds, the multicut on $\Box
B$ where the left premise derivation is $\mu' / \GLRc$ is
admissible. Hence the conclusion $\Box X, Y \vdash Z$ of
Figure~\ref{f-muxbn} is derivable.
\end{proof}
The next result, which we do not use directly, approximates to
\cite[Lemma~19]{gr-valentini}.
\begin{lemma}[\texttt{gr20d}] \label{l-gr20d} If
multicut-admissibility holds for cut-formula $B$, \muxbn\ $B\ 0$
holds, and \GLRp\ is derivable, then \lxxc\ is
derivable.
\end{lemma}
% \begin{verbatim}
% gr20d : "[| ((conclDT ?dtglp) =
% (mset_map Box ?X + ?X + {# Box ?B#} |- {#?B#})) ;
% valid glss ?dtglp; mas glss ?B = UNIV; muxbn ?B 0 |]
% ==> (mset_map Box ?X + ?X |- {#?B#}) : derrec glss {}"
% \end{verbatim}
\begin{proof}
By using Lemma~\ref{l-gr20bc}\ref{l-gr20c} repeatedly, \muxbn\ $B\
n$ holds for any $n$. Then the cut of Figure~\ref{f-dlxxc} is
admissible, giving the result.
\end{proof}
\begin{lemma}[\texttt{cw0\_mas, cut\_glr}] \label{l-cut-glr} If
(shallow) multicut-admissibility holds for cut-formula $B$, then it
holds (deeply) for cut-formula $\Box B$ where the left premise
derivation finishes with a \GLRB\ rule. That is, \muxbn\ $B\ n$
holds for any $n$.
\end{lemma}
\begin{verbatim}
cut_glr : "[| mas glss ?B = UNIV; ([conclDT ?mu], ?seq) : glr ?B |]
==> (Der ?seq [?mu], ?dtb) : masdt glss (Box ?B)"
cw0_mas : "mas glss ?B = UNIV ==> muxbn ?B ?n"
\end{verbatim}
\begin{proof}
By combining Lemma \ref{l-cw0-mas'} with
repeated use of Lemma~\ref{l-gr20bc}\ref{l-gr20c}.
\end{proof}
We now have the admissibility of principal multicuts on $\Box B$,
which we indicated was the crux of the cut-admissibility for \GL{}.
For the other cases, the usual proofs hold. Although there is no
particular difficulty in showing this, many details need to be
checked: this takes almost 3 pages of \cite{gr-valentini}, even with
``omitting details'', and citing ``standard transformations''. The
value of automated theorem proving at this point is simply to ensure
that all the necessary details have been checked. The proof uses
techniques described in \S\ref{s-cagen}, which means that a relatively
small amount of the proofs remaining for this theorem are peculiar to
the \GLS\ calculus. Consequently, we get:
% mar_glss : "[| ?seqa : derrec glss {}; ?seqb : derrec glss {} |]
% ==> (?seqa, ?seqb) : mar glss ?A"
\begin{theorem}
Multicut is admissible in \GLS.
\begin{verbatim}
(?G |- ?D) : derrec glss {} ; (?P |- ?S) : derrec glss {}
===> (?G + (ms_delete {?A} ?P) |- (ms_delete {?A} ?D) + ?S))
: derrec glss {}
\end{verbatim}
\end{theorem}
% \section{Conclusions and Further Work}
% \label{s-concl-fw}
% We have described an proof in the Isabelle theorem prover of
% cut admissibility for the sequent calculus \GLS\ for the provability logic \GL.
% The proof is based on the proof given in \cite{gr-valentini},
% particularly the construction in \cite[Lemma~19]{gr-valentini},
% though our presentation uses it in a slightly different way.
% We have made considerable use of definitions and theorems which are
% useful beyond this particular sequent system, or, indeed, beyond
% proofs about sequent systems. We have developed axiomatic type
% classes based on properties of $+$, $-$, $0$ and $\leq$ for multisets,
% and shown how a rather small set of axioms about these operators is
% sufficient for defining a lattice. Multiset sequents (pairs of
% multisets) also belong to this type class. We have applied relevant
% simplification procedures to this type class, which was useful in the
% proofs. We have described extensive definitions and theorems relating
% to abstract derivability, which we have used in several different
% metalogical theories and proofs, and we have discussed the issue of
% deep versus shallow embeddings in the light of the experience of this
% and previous work.