\section{GL Stuff from here onwards}
Here, we describe general methods for formalising the proof-theory of
traditional, propositional, multiset-based sequent calculi containing
explicit structural rules in Isabelle/HOL. We then show how we used them
to formalise a proof of cut-admissibility for \GL{} based on, but
different from, that of~\cite{gr-valentini}.
In Section~\ref{sec:preliminaries}, we briefly describe traditional
sequent calculi, discuss the need for multisets, and describe the
controversy surrounding the cut-elimination theorem for the set-based
sequent system \GLS{} for provability logic \GL{} .
%
In Section~\ref{s-deriv} we describe techniques and functions we have
defined for reasoning about derivations, and show how we formalise
formulae, sequents and rules, showing some of the $\GL$ sequent rules
as examples.
%
In Section~\ref{s-axtc} we describe an Isabelle axiomatic type class
which we developed to facilitate reasoning about multisets of
formulae, and sequents based on them. This development explores the
interaction between lattice ($\land$, $\lor$, $\leq$) and
``arithmetic'' ($+$, $-$, 0, $\leq$) operations.
%
In Section~\ref{s-cagen} we discuss those aspects of our Isabelle
proofs which we tried to make as general as possible, and how they are
useful for proving cut-elimination and other results in arbitrary
sequent calculi which meet the associated preconditions.
%
In Section~\ref{s-caglr} we describe the cut-admissibility proof for
\GLS.
%and in Section~\ref{s-concl-fw} we conclude.
We assume the reader is familiar with basic
proof-theory, ML and Isabelle/HOL.
%
Our Isabelle code can be found at
\url{http://users.rsise.anu.edu.au/~jeremy/isabelle/200x/seqms/}.
Some of this work was reported informally in \cite{gore-india}.
%\end{document}
%TO DO - include explanation of Isabelle notation
\section{Sequents, Multisets, Sets and Provability Logic}
\label{sec:preliminaries}
Proof-theorists typically work with sequents $\Gamma
\vdash \Delta$ where $\Gamma$ and $\Delta$ are ``collections'' of
formulae. The ``collections'' found in the literature increase in
complexity from simple sets for classical logic, to multisets for
linear logic, to ordered lists for substructural logics, to complex
tree structures for display logics.
%
A sequent rule typically has a rule name, a (finite) number of premises,
a side-condition and a conclusion.
% \begin{prooftree}
% \AxiomC{$\Gamma_1 \vdash \Delta_1
% \qquad\cdots\qquad
% \Gamma_n \vdash \Delta_n$}
% \LeftLabel{RuleName}
% \RightLabel{Condition}
% \UnaryInfC{$\Gamma_0 \vdash \Delta_0$}
% \end{prooftree}
Rules are read top-down as ``if all the premises hold then the
conclusion holds''. A derivation of the judgement $\Gamma \vdash
\Delta$ is typically a finite tree of judgements with root $\Gamma
\vdash \Delta$ where parents are obtained from children by ``applying
a rule''. We use ``derivation'' to refer to a
proof \emph{within} a calculus, reserving ``proof'' for a
meta-theoretic proof of a theorem \emph{about} the calculus.
Provability logic \GL{} is obtained by adding L\"ob's axiom
$\Box (\Box A \to A) \to \Box A$ to propositional normal modal logic
\textrm{K}.
Its Kripke semantics is based on rooted transitive Kripke frames without
infinite forward chains.
%
It rose to prominence when Solovay showed that $\Box A$
could be interpreted as ``A is provable in Peano
Arithmetic''~\cite{Sol76}.
%
An initial proof-theoretic account was given by Leivant in
1981 when he ``proved'' cut-elimination for a set-based sequent
calculus for \GL~\cite{Lei81}.
%
But Valentini in 1983 found a simple counter-example and gave a new
cut-elimination proof~\cite{Val83}.
%
The issue seemed to have been settled, but in 2001, Moen~\cite{moen}
claimed that Valentini's transformations don't terminate if the
sequents $\Gamma \vdash \Delta$ are based on multisets. There is of
course no \textit{a priori} reason why a proof based on sets should
not carry over with some modification to a proof based on multisets.
%so this set the cat amongst the pigeons.
%
% In response, Negri~\cite{Neg05} in 2005 gave a new cut-elimination
% proof using sequents built from labelled formulae $w: A$, which
% captures that the traditional formula $A$ is true at the possible
% world $w$. But this is not satisfactory as it brings the underlying
% (Kripke) semantics of modal logic into the proof theory. Mints in 2005
% announced a new proof using traditional methods~\cite{Min06}.
%
The issue was recently resolved by Gor\'e and Ramanayake~\cite{gr-valentini}
who showed that Moen is incorrect, and that Valentini's proof using
multisets is \textit{mostly} okay. Many such examples exist in the literature.
The sequent system \GLS{} for the logic \GL{} as given by Gor\'e and
Ramanayake in \cite{gr-valentini}, like Valentini's, contains explicit
weakening and contraction rules and, modulo these, a typical
(additive) set of rules for the classical logical connectives $\lnot,
\land, \lor, \to$. The one additional rule \GLR{} which characterises \GL{}
is shown below:
\begin{prooftree}
\AxiomC{$\GLRp$}
\RightLabel{GLR \mbox{ or } GLR(B) \mbox{ or } GLR(X,B)}
\UnaryInfC{$\GLRc$}
\end{prooftree}
% $$
% \frac \GLRp \GLRc ~~ GLR \mbox{ or } GLR(B) \mbox{ or } GLR(X,B)
% $$
The rule is unusual since the principal formula $\Box B$ changes
polarity from conclusion to premise.
To identify the principal formula involved, or all the formulae, we may
refer to it as \GLRB, or \GLRXB.
% The full set of rules of \GLS\ is
% shown in \cite{gr-valentini}, but
Note that our formalisation does not
regard the cut (or multicut) rules shown below as being part of the
system:
\[
\AxiomC{$\Gamma \vdash A, \Delta$}
\AxiomC{$\Gamma, A \vdash \Delta$}
\LeftLabel{(cut)}
\BinaryInfC{$\Gamma \vdash \Delta$}
\DisplayProof
\qquad
\AxiomC{$\Gamma \vdash A^{n}, \Delta$}
\AxiomC{$\Gamma, A^{m} \vdash \Delta$}
\LeftLabel{(multicut)}
\BinaryInfC{$\Gamma \vdash \Delta$}
\DisplayProof
\]
Thus our results will be lemmata of the form:
if $\Gamma \vdash A, \Delta$ is (cut-free) derivable
and $\Gamma, A \vdash \Delta$ is (cut-free) derivable
then $\Gamma \vdash \Delta$ is (cut-free) derivable.
\section{Reasoning About Derivations and Derivability}
\label{s-deriv}
% \section{Isabelle} \label{s-isa}
% Isabelle is an automated proof assistant \cite{isabelle-ref}. Its
% meta-logic is an intuitionistic typed higher-order logic, sufficient
% to support the built-in inference steps of higher-order unification
% and term rewriting. Isabelle accepts inference rules of the form
% $\alpha_1, \alpha_2, \ldots, \alpha_n, \Longrightarrow \beta$
% where the $\alpha_i$ and $\beta$ are expressions of the Isabelle
% meta-logic, or are expressions using a new syntax, defined by the
% user, for some ``object logic''. Most users build on one of the
% comprehensive ``object logics'' already supplied, like Isabelle/HOL
% \cite{IsLogHOL}, which is an Isabelle theory based on the higher-order
% logic of Church % \cite{church-types}
% and the HOL system of Gordon \cite{HOL-introduction}. HOL offers
% inductively defined sets, and recursive datatypes, which we use
% extensively.
In previous work~\cite{dawson-gore-embedding-comparing}, we describe
our initial attempts to formalise display calculi in various logical
frameworks and describe why we selected Isabelle/HOL. When we model a
derivation tree, explicitly, we do so using the capability in HOL of
defining types like the datatypes of SML, see \S\ref{s-dse}.
%Here we return to simpler sequents and seek generality.
\subsection{Deep and Shallow Embeddings} \label{s-dse} In
\cite{dawson-gore-cut-admissibility} we proved cut admissibility for
\dRA, the display calculus for relation algebras. The proof was based
on a proof of Belnap, which applied generally to display calculi whose
inference rules satisfied certain properties. In that paper we
described the formalisation as a ``deep embedding''. We now believe
that to describe a particular formalisation as either a ``deep
embedding'' or a ``shallow embedding'' over-simplifies the issue as we
now discuss.
In~\cite{dawson-gore-cut-admissibility}, we modelled a derivation tree
in HOL as a recursive structure which consists of a
root sequent (which should be the conclusion of some rule), and an
associated list of (sub)trees (each of which should derive a premise of
that rule). The type of explicit derivation trees is a recursive
datatype, such as:
\begin{verbatim}
datatype 'a dertree = Der 'a ('a dertree list)
| Unf 'a (* unfinished leaf which remains unproved *)
\end{verbatim}
%
We then had to define a predicate on such a tree, saying whether it
was in fact correctly based on the given rules, and so represented a
valid derivation.
%
We modelled a sequent rule as an object in HOL consisting of a list of
premises and a conclusion, each of which was a sequent in the language
of the logic (of relation algebras). Notably, our language of formulae
included ``variables'' which could be instantiated with formulae,
so we defined functions for such substitutions.
%
Thus it is more accurate to say in
~\cite{dawson-gore-cut-admissibility}, we used a deep embedding for
\emph{derivations}, \emph{rules} and \emph{variables}, since we
modelled each of these features of the proof-theory explicitly rather
than identifying them with the same features of Isabelle.
In alternative approaches to work of this type, the set of derivable
sequents can be modelled as an inductively defined set \texttt{ders}
(say),
% (as we describe in \S\ref{s-dp}).
where the derivation tree does not exist, but it ``happens'' in the
course of proving that a specific sequent is derivable. That is,
derivability in the sequent calculus studied equates to derivability
in Isabelle, giving a shallow embedding of {\em derivations}.
In such alternative approaches, the rules of the sequent calculus can
either be encoded explicitly
%(as assumed in the treatment in \S\ref{s-dp}),
or they can be encoded in the clauses for the inductive definition of
\texttt{ders}. In the latter case, the rules of the
sequent calculus under study are encoded as theorems of Isabelle.
Thus, even with a shallow embedding of derivations, we have a choice
of either a deep or a shallow embedding of {\em rules}:
%where \textit{ders} means the set of derivable sequents:
% \begin{gather}
% (\{P, Q\}, {P \land Q}) \in \mathit{rules} \tag{deep}
% \qquad
% \frac{ (ps, c) \in \mathit{rules} \quad ps \subseteq \mathtt{ders}}
% { c \in \mathtt{ders}}
% \\[1ex]
% \frac {P \in \mathtt{ders} \qquad Q \in \mathtt{ders}}
% {P \land Q \in \mathtt{ders}} \tag{shallow}
% \end{gather}
%
\begin{description}
\item[\rm Deep Rules and Shallow Derivation:] \ \\[0.5em]
\begin{tabular}[c]{l}
$\mathtt{(\{\Gamma \vdash P, ~~\Gamma \vdash Q\},
\;\; \Gamma \vdash P \land Q) \in rules}$
\\[0.5em]
$\mathtt{ [|~(ps, c) \in rules ~;~ ps \subseteq \mathtt{ders}~|]
==> c \in \mathtt{ders}}$
\\[0.5em]
\end{tabular}
\item[\rm Shallow Rules and Shallow Derivations:] \ \\[0.5em]
\begin{tabular}[c]{l}
$\mathtt{[|~\Gamma \vdash P \in \mathtt{ders} ~;~
\Gamma \vdash Q \in \mathtt{ders}~|]
==> \Gamma \vdash P \land Q \in \mathtt{ders}}$.
\end{tabular}
\end{description}
Finally, compared to \cite{dawson-gore-cut-admissibility}, it is much
more common to let the variables in a rule be the variables of the
theorem prover, and for the theorem prover to do the substitution. In
general, however, a deep embedding of one feature can be combined with
a shallow embedding of another.
Our experience is that shallow embeddings generally permit easier
proofs, but sometimes they are inadequate to express a desired
concept. For example, with a deep embedding of rules it is easy to
express that one set of rules is a subset of another set. With a deep
embedding of derivation trees it is easy to express that one property
of derivation trees is the conjunction of two other properties, or
that a derivation tree has a particular height, whereas to express
such things using \derl, \derrec, etc (see \S\ref{s-dp}), one would
have to redefine these predicates incorporating the particular
properties of interest. Indeed, in this work (see \S\ref{s-caglr}) we
discuss how we found a shallow embedding of derivability inadequate.
We have developed the code on which
\cite{dawson-gore-cut-admissibility} was based to reflect the fact
that Belnap's original proof applied generally to display calculi
satisfying certain conditions. Some of these conditions required a
deep embedding of variables (and, therefore, of the rules), for
example, that no structure variable appears twice in the conclusion of
the statement of a rule. For others, a shallow embedding of variables
was adequate, since the condition applied equally to the statement of
the rule and to all its substitution instances, for example that a
premise of a rule contains no variable not contained in the
conclusion.
% In \cite{dawson-gore-machine-checked-strong-normalisation}, we
% considered the individual steps involved in changing a derivation
% containing cuts to a derivation not containing cuts. The explicit
% manipulation of derivation trees, such that each change makes a tree
% ``smaller'' in some sense, made it necessary to have a deep embedding
% of derivation trees.
\subsection{Our General Derivability Predicates} \label{s-dp}
We now briefly describe the functions we used to describe
derivability. A fuller account is given in \cite{gore-india}. This
framework is general in that a rule merely consists of ``premises''
and a ``conclusion'', and is independent of whether the things derived
are formulae, sequents, or other constructs, but we will refer to them
as formulae.
\begin{verbatim}
types 'a psc = "'a list * 'a" (* single step inference *)
consts
derl, adm :: "'a psc set => 'a psc set"
dersl :: "'a psc set => ('a list * 'a list) set"
dercsl :: "'a psc set => ('a list list * 'a list) set"
derrec :: "'a psc set => 'a set => 'a set"
dersrec :: "'a psc set => 'a set => 'a list set"
\end{verbatim}
An inference rule \texttt{'a psc} is a list of premises \texttt{ps}
and a conclusion \texttt{c}. Then, \texttt{derl rls} is the set of
rules derivable from the rule set \texttt{rls}, and \texttt{derrec rls
prems} is the set of formulae derivable using rules \texttt{rls}
from the set \texttt{prems} of premises. These were defined as
inductive sets, using also the functions \texttt{dersl} and
\texttt{dersrec}. So $\mathtt{(ps, c) \in \derl\rls}$ reflects the
shape of a derivation tree: \texttt{ps} is a list of exactly the
premises used, in the correct order, whereas $\mathtt{c \in \derrec\
\rls\ prems}$ holds even if the set of premises \texttt{prems}
contains extra formulae.
The auxiliary function \texttt{dersrec} represents several formulae,
all derivable from the premises. The auxiliary function
\texttt{dersl} represents several derivation trees side by side;
$\mathtt{(ps, cs) \in dersl~rls}$ when \texttt{ps} is the
concatenation of their lists of premises, and \texttt{cs} is the list
of their conclusions. The function \texttt{dercsl} is similar, but it
shows which premises are part of which tree. That is, $\mathtt{(pss, cs) \in
\texttt{dercsl}\ rls}$ implies $\mathtt{(concat\ pss, cs) \in \texttt{dersl}\
rls}$. Curiously, we have used this framework for several years, and
only recently we found the need to redefine \derl\ using
\texttt{dercsl} rather than \texttt{dersl} as the auxiliary function.
The key clauses for these functions are shown below:
\begin{verbatim}
dtderI : "[| (ps, concl) : pscrel ; (pss, ps) : dersl pscrel |]
==> (pss, concl) : derl pscrel"
derI : "[| (ps, concl) : pscrel ; ps : dersrec pscrel prems |]
==> concl : derrec pscrel prems"
\end{verbatim}
We obtained the expected results linking \texttt{derl} and \texttt{derrec},
and a number of results expressing transitivity of derivation and the
results of derivation using derived rules, for example:
\begin{verbatim}
derl_trans : "[| (?ps, ?c) : derl ?rls; (?x, ?ps) : dersl ?rls |]
==> (?x, ?c) : derl ?rls"
derl_deriv_eq : "derl (derl ?rls) = derl ?rls"
derrec_trans_eq : "derrec ?rls (derrec ?rls ?prems) = derrec ?rls ?prems"
\end{verbatim}
A simplified verson of the induction principle
generated by the definition of the inductive set \derrec\ is as follows:
$$
\frac {x \in derrec\ rls\ prems \quad
\forall c \in prems. P\ c \quad
\forall (ps, c) \in rls. (\forall p\ in\ ps. P\ p) \Rightarrow P\ c}
{P\ x}
$$
We may contrast use of this principle with induction on the height of a
derivation tree.
Using induction on the height, it is possible, and in some proofs necessary,
to transform a sub-tree in some height-preserving (or height-reducing) way, and
use the assumption that the transformed tree has property $P$.
That approach is not available when using the induction principle above,
and was not needed for our earlier proofs.
Where we have a property of two derivations, such as cut-admissibility,
we need a more complex induction principle, which we derived
(limited to the case where the set \textit{prems} of ultimate assumptions is
empty) :
$$
\begin{array}{c}
cl \in derrec\ rlsl\ \{\} \quad cr \in derrec\ rlsr\ \{\} \\
\forall (lps, lc) \in rlsl, (rps, rc) \in rlsr.\
(\forall lp\ in\ lps. P\ lp\ rc) \land (\forall rp\ in\ rps. P\ lc\ rp)
\Rightarrow P\ lc\ rc \\
\hline { P\ cl\ cr}
\end{array}
$$
Finally, $(ps, c)$ is an admissible rule iff: if all premises in $ps$
are derivable, then $c$ is derivable:\\
$(ps, c) \in \adm\ \rls \iff
(set\ ps \subseteq \derrec\ \rls \ \ \Rightarrow \ \ c \in \derrec\
\rls)
$.
We obtained the following four results, which were surprisingly tricky
since \adm\ is not monotonic:
\begin{verbatim}
"derl ?rls <= adm ?rls" "adm (adm ?rls) = adm ?rls"
"adm (derl ?rls) = adm ?rls" "derl (adm ?rls) = adm ?rls"
\end{verbatim}
\subsection{Sequents, Formulae and the \GLS\ rules} \label{s-gls}
We define a language of formulae, with the usual connectives:
\begin{verbatim}
datatype formula = FC string (formula list) (* formula connective *)
| FV string (* formula variable *)
| PP string (* primitive proposition *)
\end{verbatim}
%
The formula connectives are fixed for each logic,
but we make the datatype as general as possible,
replacing connectives such as $\land$, $\lor$, etc, with \texttt{FC}
\textit{name}:
\begin{verbatim}
consts
Btimes :: "formula => formula => formula" ("_ && _" [67,68] 67)
Bplus :: "formula => formula => formula" ("_ v _" [63,64] 63)
defs
Btimes "Btimes f g == FC ''Btimes'' [f, g]"
Bplus "Bplus f g == FC ''Bplus'' [f, g]"
\end{verbatim}
%
This permits a single definition of the immediate (proper) subformula
relation, \texttt{ipsubfml}, which will not need to be changed when new
connectives are added.
\begin{verbatim}
consts ipsubfml :: "(formula * formula) set"
inductive "ipsubfml" (* proper immediate subformula relation *)
intros ipsI : "P : set Ps ==> (P, FC conn Ps) : ipsubfml"
\end{verbatim}
%
A sequent is a pair of multisets of formulae, written
$\Gamma \vdash \Delta$ or $\Gamma \Rightarrow \Delta$.
\begin{verbatim}
datatype 'a sequent =
Sequent "'a multiset" "'a multiset" ("((_)/ |- (_))" [6,6] 5)
\end{verbatim}
Given a rule such as $(\vdash \land)$ in the form on the left,
$$
\MR_s = \frac{\vdash A \quad \vdash B} {\vdash A \land B}
\qquad
\MR_e = \frac{X \vdash Y, A \quad X \vdash Y, B} {X \vdash Y, A \land B}
$$
we define functions \texttt{extend} and \texttt{pscmap}, such that
(for example)
\begin{gather*}
\texttt{extend}\ (X \vdash Y)\ (U \vdash V) = (X + U) \vdash (Y + V) \\
\MR_e = \texttt{pscmap}\ (\texttt{extend}\ (X \vdash Y))\ \MR_s
\end{gather*}
Then \texttt{extrs} $S$ means the set of all extensions of all rules in the set
$S$.
\begin{verbatim}
consts
extend :: "'a sequent => 'a sequent => 'a sequent"
extrs :: "'a sequent psc set => 'a sequent psc set"
defs
extend_def : "extend fmls seq == seq + fmls"
pscmap_def : "pscmap f (ps, c) = (map f ps, f c)"
\end{verbatim}
Then we define \texttt{glss}, the set of rules of \GLS;
we show just a selection.
Sets \texttt{wkrls} and \texttt{ctrrls} $A$ refer to weakening and contraction
(on $A$) rules.
\begin{verbatim}
inductive "glir" (* right introduction rules *)
intros
andr : "([{#} |- {#A#}, {#} |- {#B#}], {#} |- {#A && B#}) : glir"
orr1 : "([{#} |- {#A#}], {#} |- {#A v B#}) : glir"
inductive "glne" (* rules before being extended *)
intros
wkI : "(ps, c) : wkrls ==> (ps, c) : glne"
ctrI : "(ps, c) : ctrrls A ==> (ps, c) : glne"
ilI : "(ps, c) : glil ==> (ps, c) : glne"
irI : "(ps, c) : glir ==> (ps, c) : glne"
inductive "glr B"
intros
I : "([mset_map Box X + X + {#Box B#} |- {#B#}],
mset_map Box X |- {#Box B#}) : glr B"
inductive "glss"
intros
axiom : "([], {#A#} |- {#A#}) : glss"
extI : "psc : glne ==> pscmap (extend flr) psc : glss"
glrI : "psc : glr B ==> psc : glss"
\end{verbatim}
Note that $X$ is a multiset, and $\Box X$ is informal notation for applying
$\Box$ to each member of $X$; this is formalised as \texttt{mset\_map},
used in the formalised \GLR\ rule.
Using a similar notation we write $\Box B^k$ for $(\Box B)^k$,
the multiset containing $n$ copies of $\Box B$.
Development of \texttt{mset\_map} and relevant lemmas is in the source files
\texttt{Multiset\_no\_le.\{thy,ML\}}.
\section{An Axiomatic Type Class for Multisets and Sequents} \label{s-axtc}
Isabelle provides a theory of finite multisets with an ordering
based on an underlying ordering on members: given multiset $M$,
we obtain a multiset $N$ less than $M$ by removing an occurrence of some $m$
from $M$, and replacing it with any number of occurrences of elements
smaller than $m$. We defined a different ordering $\leq$ analogous to
$\subseteq$ for sets: $N \leq M$ if, for all $x$, $N$ contains no more
occurrences of $x$ than does $M$.
An axiomatic type class is characterised by a number of axioms, which hold
for all members of a type in the type class.
The axioms can be used to prove theorems, which then apply
and can be used efficiently for any type which is in the class.
The multiset operators $\leq$, $+$, $-$ and $0$ have several useful properties,
which are described by the following axiomatic type classes
\pmz{} and \pmgez.
\begin{verbatim}
axclass pm0 < comm_monoid_add, minus
pm0_plus_minus : "A + B - A = B"
pm0_minus_minus : "A - B - C = A - (B + C)"
\end{verbatim}
Thus for any type in class \pmz, the operations
$+$ and $0$ form a commutative monoid, the type allows the $-$ operator, and
the two axioms given hold.
\begin{verbatim}
axclass pm_ge0 < pm0, ord, leq
all_ge0 : "0 <= A"
diff_0_le : "(m - n = 0) = (m <= n)"
le_add_diff_eq : "B <= A ==> B + (A - B) = A"
pm_ge0_less_le : "(x < y) = (x <= y & x ~= y)"
pg_leq_def : "a [= b == a <= b"
\end{verbatim}
Class \pmgez\ also has an $\leq$ operator,
$0$ is the smallest element via \texttt{all\_ge0},
and \texttt{le\_add\_diff\_eq} also holds.
For a new type, the other axioms could be made definitions,
but for natural numbers, $\leq$ and $<$ are already defined.
The naturals obey all these axioms. An explanation of \texttt{[=} is
below.
\begin{lemma}
Multisets are in \pmz\ and \pmgez\ using our definition of $\leq$.
Moreover, if $\Gamma$ and $\Delta$ are of any type in the classes
\pmz\ or \pmgez, then our sequents $\Gamma \vdash \Delta$ are also
in these classes.
\end{lemma}
Isabelle contains ``simplification procedures'',
which will simplify $a - b + c + b$ to $a + c$ for integers,
and which will simplify $a + b + c - b$ to $a + c$
for integers \emph{or} naturals.
We have been able to apply almost all of the ones which apply to naturals,
to types of the classes \pmz\ and \pmgez.
It is easy to show that, in the obvious way, multisets form a lattice.
Isabelle's lattice type class is based on using \texttt{[=} as the
order operator. By including the definitional
axiom \texttt{pg\_leq\_def} above,
we were able to show:
\begin{lemma}
Any type of class \pmgez\ forms a lattice.
\end{lemma}
\begin{proof}
The definitions of meet and join are shown below:
\begin{verbatim}
"pg_meet ?c ?d == ?c - (?c - ?d)" "pg_join ?c ?d == ?c + (?d - ?c)"
\end{verbatim}
From these it is possible to prove
the axioms for a lattice
at some length,
and so any type in the class \pmgez\ is also in Isabelle's class
\texttt{lattice}.
\end{proof}
We produced a number of equivalent definitions characterising atoms of the
lattice, and properties of atomic elements
(our definition of \texttt{atomic} includes $0$).
Thus we showed that $x$ is atomic is equivalent to each of the following:
\begin{tabular}[c]{l@{\extracolsep{1cm}}l}
\multicolumn{2}{c}{$\forall y < x.\ y = 0$}
\\
$\forall z.\ x \leq z \lor (\mathit{pg\_meet}\ x\ z = 0)$
& $\forall a b.\ x \leq a + b \Rightarrow x \leq a \lor x \leq b$
\\
$\forall a b.\ 0 < a \land 0 < b \Rightarrow a + b \not= x$
& $\forall a b.\ a < x \land b < x \Rightarrow a + b \not= x$
\end{tabular}
%
The source files for this development are \texttt{pmg*.\{thy,ML\}}.
\section{Generalising Cut-Admissibility Proofs} \label{s-cagen}
We now show how our previous work on multicut admissibility for LK
can be formulated to make it as general as possible.
%
In \S\ref{s-deriv} we mentioned the induction principle used for
deriving cut-admissibility, or indeed any property $P$ of two-premise
subtrees. In the diagram below, to prove $P (\MR_{l}, \MR_{r})$, the
induction assumption is that $P (\MQ_{li}, \MR_{r})$ and $P (\MR_{l},
\MQ_{rj})$ hold for all $i$ and $j$:
% \begin{center}
% \bpf
% \A "$\MQ_{l1} \ldots \MQ_{ln}$"
% \U "$\MR_{l}$" "$\rho_l$"
% \A "$\MQ_{r1} \ldots \MQ_{rm}$"
% \U "$\MR_{r}$" "$\rho_r$"
% \B "?" "(\textit{cut ?})"
% \epf
% \end{center}
\begin{center}
\begin{prooftree}
\AxiomC{$\MQ_{l1} \ldots \MQ_{ln}$}
\RightLabel{$\rho_l$}
\UnaryInfC{$\MR_{l}$}
\AxiomC{$\MQ_{r1} \ldots \MQ_{rm}$}
\RightLabel{$\rho_r$}
\UnaryInfC{$\MR_{r}$}
\RightLabel{(\textit{cut ?})}
\dottedLine
\BinaryInfC{?}
\end{prooftree}
\end{center}
A proof of $P (\MR_{l}, \MR_{r})$ using this induction assumption
inevitably proceeds according to what the rules $\rho_l$ and $\rho_r$
are, and further, for a cut-formula $A$, whether it is principal in
either or both of $\rho_l$ and $\rho_r$. But our proofs also use
induction on the size of the cut-formula, or, more generally, on some
well-founded relation on formulae.
% Since the $\MQ_{li}$, etc, above are sequents,
So we actually consider a property $P\ A\ (cl, cr)$
where $psl$ are the premises $\MQ_{l1} \ldots \MQ_{ln}$ of rule
$\rho_l$, and $cl$ is its conclusion $\MR_{l}$, and analogously for
$\rho_r$, and $A$ is the cut-formula. In proving $P\ A\ (\MR_{l},
\MR_{r})$, in addition to the inductive assumption above, we assume
that $P\ A' \ (da, db)$ holds generally for $A'$ smaller than $A$ and
all sequents $da$ and $db$ which are derivable from the empty set of
premises using rules from $rls$. These intuitions give the
following definition \texttt{gen\_step2sr} of
a condition which permits one step of the inductive proof
where we shorten ``derivable from the empty set of premises using rule
from \texttt{rls}'' to ``\texttt{rls}-derivable'':
\begin{definition}
For a formula \texttt{A},
a property \texttt{P},
a subformula relation \texttt{sub},
a set of rules \texttt{rls},
inference rules \texttt{(psl, cl)}, and
\texttt{(psr, cr)},
the property \texttt{gen\_step2sr} holds if:
\texttt{P A (cl, cr)} holds whenever
\texttt{P~A'~(da,~db)} holds
for all subformulae \texttt{A'} of \texttt{A}
and all \texttt{rls}-derivable sequents
\texttt{da} and \texttt{db};
for each \texttt{pa} in \texttt{psl}, \texttt{pa} is
\texttt{rls}-derivable and \texttt{P A (pa, cr)} holds;
for each \texttt{pb} in \texttt{psr},
\texttt{pb} is \texttt{rls}-derivable and
\texttt{P A (cl, pb)} holds;
\texttt{cl} and \texttt{cr} are \texttt{rls}-derivable.
\begin{verbatim}
gen_step2sr_simp :
"gen_step2sr ?P ?A ?sub ?rls ((?psl, ?cl), (?psr, ?cr)) =
( (ALL A'. (A', ?A) : ?sub -->
(ALL da:derrec ?rls {}. ALL db:derrec ?rls {}. ?P A' (da, db)))
-->
(ALL pa:set ?psl. pa : derrec ?rls {} & ?P ?A (pa, ?cr)) -->
(ALL pb:set ?psr. pb : derrec ?rls {} & ?P ?A (?cl, pb)) -->
?cl : derrec ?rls {} --> ?cr : derrec ?rls {}
--> ?P ?A (?cl, ?cr) )"
\end{verbatim}
\end{definition}
The main theorem \texttt{gen\_step2sr\_lem} below for proving an
arbitrary property $P$ states that if the step of the inductive proof
is possible for all cases of final rule instances $(psl, cl)$ and
$(psr, cr)$ on each side, then $P$ holds in all cases.
\begin{theorem}
If \texttt{A} is in the well-founded part of the subformula relation;
sequents \texttt{seql} and \texttt{seqr} are \texttt{rls}-derivable ;
and for all formulae \texttt{A}, and all rules \texttt{(psl, cl)} and
\texttt{(psr, cr)}, our induction step condition
\texttt{gen\_step2sr P A sub rls ((psl, cl), (psr, cr))}
holds, then
\texttt{P A (seql, seqr)} holds.
\begin{verbatim}
gen_step2sr_lem :
"[| ?A : wfp ?sub ;
?seql : derrec ?rls {} ; ?seqr : derrec ?rls {} ;
ALL A. ALL (psl, cl):?rls. ALL (psr, cr):?rls.
gen_step2sr ?P A ?sub rls ((psl, cl), (psr, cr)) |]
==> ?P ?A (?seql, ?seqr)"
\end{verbatim}
\end{theorem}
This enables us to split up an inductive proof, by showing,
separately, that \texttt{gen\_step2sr} holds for particular cases of
the final rules $(psl, cl)$ and $(psr, cr)$ on each side. In some
cases these results apply generally to different calculi. For
example, the inductive step for the case where the cut-formula $A$ is
parametric, not principal, on the left is encapsulated in the
following theorem where \texttt{prop2 mar ?erls ?A (?cl, ?cr)} means
that the conclusion of a multicut on \texttt{A} with premises
\texttt{cl} and \texttt{cr} is derivable using rules \texttt{erls}.
\begin{theorem}
If weakening is admissible for the rule set \texttt{erls}, all
extensions of some rule \texttt{(ps, c)} are in the rule set
\texttt{erls}, and the final rule instance of the left hand subtree
is an extension of \texttt{(ps, c)} where the cut-formula \texttt{A}
is not in the succedent of \texttt{c} (meaning that \texttt{A} is
parametric on the left), then
\verb!gen_step2sr (prop2 mar ?erls) ?A ?sub ?rls (?pscl, ?pscr)!
holds.
\begin{verbatim}
mk_sr1 lmg_gen_step' :
"[| wk_adm ?erls; extrs {(?ps, ?c)} <= ?erls; ~ ?A :# succ ?c;
?pscl = pscmap (extend ?flr) (?ps, ?c) |]
==> gen_step2sr (prop2 mar ?erls) ?A ?sub ?rls (?pscl, ?pscr)"
\end{verbatim}
\end{theorem}
Of course, for a system containing explicit weakening rules, such as
\GLS{}, weakening is {\it a fortiori} admissible. But as noted
earlier, we have tried to make our results as general as possible.
This result codifies parametric proof steps, and is quite general
enough to apply in the proof for \GLS. As we note later, the proof
for \GLS\ involves one really difficult case and a lot of fairly
routine cases. In dealing with the routine cases, automated theorem
proving has the benefit of ensuring that no detail is overlooked; in
addition we have the fact that, as in this example, we often have more
general theorems that apply directly to a set of rules such as \GLS.
Notice that all of this section has used a shallow embedding of
derivations since no explicit derivation trees were required.
\section{The Proof of Cut-Admissibility for \GLS} \label{s-caglr}
Valentini's proof of cut-admissibility for \GLS{} uses a triple
induction on the size of the cut-formula, the heights of the
derivations of the left and right premises of cut, and a third
parameter which he called the ``width''.
Roughly speaking, the width of a cut-instance is the number of \GLR{}
rule instances above that cut which contain a parametric ancestor of
the cut-formula in their conclusion. The Gor\'e and
Ramanayake~\cite{gr-valentini} proof follows Valentini in using these
three measures, but gives a constructive way to calculate the width of
a cut by inspecting the branches of the left and right sub-derivations
of a cut rule instance.
% \begin{figure}[t]
% \centering
% \normalsize
% \bpf
% \A "$\displaystyle \mu\left\{\ \frac{\Pi_l}{\GLRp}\right.$"
% \U "\GLRc" "\GLR(B)"
% \A "$\Pi_r$"
% \U "${\Box B^k, Y \vdash Z}$" "($\rho$)"
% \B [2em] "${\Box X, Y \vdash Z}$" "(\textit{multicut})"
% \epf
% \caption{A multicut on cut formula $\Box B$ where $\Box B$ is
% left-principal via \GLR{}}
% \label{f-muxbn}
% \end{figure}
As is usual, the proof of cut-admissibility for \GLS{} proceeds by
considering whether the cut-formula is principal only in the left
premise of the cut, principal only in the right premise of the cut,
or principal in both. The crux of the proof is a ``reduction'' when
the cut-formula is of the form $\Box B$ and is principal in both
the left and right premises of the cut as shown below:
%
\begin{prooftree}
\AxiomC{$X, \Box X, \Box B \vdash B$}
\RightLabel{GLR(B)}
\UnaryInfC{$\Box X \vdash \Box B$}
%
\AxiomC{$Y, \Box Y, B, \Box B, \Box C\vdash C$}
\RightLabel{GLR(C)}
\UnaryInfC{$\Box B, \Box Y \vdash \Box C$}
%
\dottedLine
\RightLabel{cut}
\BinaryInfC{$\Box X, \Box Y \vdash \Box C$}
\end{prooftree}
%
The solution is to replace this cut on $\Box B$ by cuts which are
``smaller'' either because their cut-formula is smaller, or because
their width is smaller. In reality, most of the work involves a cut
instance which is only left-principal as shown in
Figure~\ref{f-muxbn}, and most of this section is devoted to how we
utilised our general methods to formalise these ``reductions'' as
given by Gor\'e and Ramanayake~\cite{gr-valentini}. But there are some
important differences which we now explain.
As explained in \S\ref{s-dp}, our general methods do not model
derivation trees explicitly, but use a shallow embedding. So our proof
uses induction on the size of the cut-formula and on the fact of
derivation, rather than on the size of the cut-formula and the height
of derivation trees. Also, we actually proved the admissibility of
``multicut'': that is, if $\Gamma \vdash A^n, \Delta$ and $\Gamma,
A^m \vdash \Delta$ are both cut-free derivable, then so is $\Gamma
\vdash \Delta$. This avoids problems in the cases where these
sequents are derived using contraction on $A$.
In all other aspects, our proof of cut-admissibility for \GLS\ is
based on that given by Gor\'e and Ramanayake~\cite{gr-valentini}. In
particular, although we do not actually require
\cite[Lemma~19]{gr-valentini}, we use the construction in that lemma,
which is fundamental to overcoming the difficulties of adapting
standard proofs to \GLS, as we explain shortly. Consequently, our
proof uses the idea of induction on the \emph{width} as defined in
\cite{gr-valentini}, although as we shall see, our proof is expressed
in terms of \delz, which approximates to the \delzs\ of
\cite{gr-valentini}, not width {\em per se}.
To cater for width/\delz, we could have altered our shallow embedding,
\texttt{derrec} but that destroys the modularity of our
approach. Instead, we defined width/\delz{} by using the datatype
\texttt{dertree} from \S\ref{s-dse} to represent explicit derivation
tree objects. These trees, and the general lemmas about them, are
similar to the trees of \cite{dawson-gore-cut-admissibility}. Thus we
``mix and match'' a deep embedding of derivation trees with a shallow
embedding of inductively defined sets of derivable sequents.
To ensure the correctness of our ``mixing and matching'' we needed the
following relationship between the two embeddings, where a
\textit{valid} tree is one whose inferences are in the set of rules
and which as a whole has no premises.
\begin{lemma}
Sequent \texttt{a} is shallowly derivable from the empty set of
premises using rules \texttt{rls} iff some derivation tree
\texttt{dt} is valid wrt.\ \texttt{rls} and has a conclusion
\texttt{a}.
\begin{verbatim}
derrec_valid_eq :
"(?a : derrec ?rls {}) = (EX dt. valid ?rls dt & conclDT dt = ?a)"
\end{verbatim}
\end{lemma}
% The provability logic \GL\
% and its sequent system \GLS, including the \GLR\ rule,
% are described in \S\ref{s-intro}.
% It is clear that the \GLR\ rule implies the transitivity rule
% (by weakening $\Box B$ from the premise of \GLR),
% as shown on the left below;
% the rule on the right holds,
% as can be shown using \textit{(cut)} by the derivation in Figure~\ref{f-dlxxc},
% or by consideration of the semantics.
% $$
% \begin{array}{c}
% \lxxc \\ \hline \GLRp \\ \hline \GLRc
% \end{array}
% \qquad \qquad
% \frac \GLRp \lxxc
% $$
We now define a function \delz{} which is closely related to
$\partial^0$ and the width of a cut of \cite{gr-valentini}, although
we are able to avoid mentioning the annotated derivations of
\cite{gr-valentini}. The argument of this function will be the
sub-tree $\mu$ of Figure~\ref{f-muxbn}. Here, \texttt{glra} is the
set of \emph{all} \GLR{}-rule instances, \texttt{antec} is a function to return
the antecedent of a sequent, and \texttt{:\#} is multiset membership:
%
\begin{verbatim}
primrec
Der : "del0 B (Der seq dtl) =
(if (map conclDT dtl, seq) : glra then
if Box B :# antec seq then 1 else 0
else del0s B dtl)"
Unf : "del0 B (Unf dt) = 0"
Nil : "del0s B [] = 0"
Cons : "del0s B (dt # dts) = del0 B dt + del0s B dts"
\end{verbatim}
%
Thus, the calculation of \delz\ traces up each branch of a derivation
tree until an instance of the \GLR\ rule is found: it then counts 1 if
$\Box B$ is in the antecedent, meaning that $B$ is in the $X$ of the
statement of \GLR. Where the derivation tree branches below any \GLR\
rule, the value is summed over the branches.
We now give a sequence of lemmata which eventually give the
``reduction'' for left-and-right \GLR{}-principal occurrences
of a cut-formula of the form $\Box B$.
\begin{lemma}[\texttt{gr20e}] \label{l-gr20e} If $\mu$ is a valid
derivation tree with conclusion \GLRp{}, and \delz\ $\mu = 0$, then
\lxxc\ is derivable (shallowly).
\end{lemma}
\begin{verbatim}
"[| valid glss ?mu ; del0 ?B ?mu = 0 ; (conclDT ?mu) =
(mset_map Box ?X + ?X + {# Box ?B#} |- {#?B#}) |]
==> (mset_map Box ?X + ?X |- {#?B#}) : derrec glss {}" : Thm.thm
\end{verbatim}
\begin{proof}
By applying the \GLR\ rule to the conclusion, we can derive \GLRc.
Tracing $\Box B$ upwards in $\mu$, it is parametric in each
inference, except possibly weakening, contraction or the axiom $\Box
B \vdash \Box B$. That is, since \delz\ $\mu = 0$, when we meet a
\GLR\ inference as we trace upwards, $\Box B$ must have already
disappeared (through weakening). So, tracing upwards, we can
change each instance of $\Box B$ to $\Box X$ in the usual way.
The axiom $\Box B \vdash \Box B$ is changed to \GLRc, which is
derivable. Contraction is not problematic since we use, as the
inductive hypothesis, that \emph{all} occurrences of $\Box B$ can be
replaced by $\Box X$.
\end{proof}
To abbreviate the statement of several lemmas, we define a function
\muxbn:
\begin{verbatim}
defs muxbn_def : "muxbn B n ==
ALL mu dtb seq. ([conclDT mu], seq) : glr B -->
del0 B mu <= n --> (Der seq [mu], dtb) : masdt glss (Box B)"
\end{verbatim}
Intuitively, \texttt{muxbn B n} states that if the left premise
derivation consists of a derivation $\mu$ whose conclusion can be used
as the premise of an instance of the \GLRB\ rule, and \delz~\texttt{B}
$\mu \leq n$, then multicut-admissibility for explicit derivation
trees (\texttt{masdt}) holds for the rule set \texttt{glss} with cut-formula $\Box B$:
see Figure~\ref{f-muxbn}, which is from~\cite{gr-valentini}).
% (In the following diagram, $\mu = \pi / \GLRp$).
That is, the multicut in Figure~\ref{f-muxbn} is admissible.
% \vpf
% \begin{figure}
% \centering
% \normalsize
% \bpf
% \A "$\displaystyle \mu\left\{\ \frac{\pi}{\GLRp}\right.$"
% \U "\GLRc" "\GLR"
% \A "$\sigma$"
% \U "${\Box B^k, U \vdash W}$"
% \B [2em] "${\Box X, U \vdash W}$" "(\textit{multicut})"
% \epf
% \caption{Definition of \muxbn}
% \label{f-muxbn}
% \end{figure}
The next lemma says that multicut admissibility on $B$
implies \muxbn\ $B$ 0.
\begin{lemma}[\texttt{cw0\_lbl, cw0\_mas'}] \label{l-cw0-mas'} If
$\mu$ is a valid derivation tree with conclusion \GLRp, and \delz\
$B\ \mu = 0$, and multicut on $B$ is admissible, and $\Box B^k, Y
\vdash Z$ is derivable, then $\Box X, Y \vdash Z$ is derivable.
\end{lemma}
% \begin{verbatim}
% cw0_mas' : "mas glss ?B = UNIV ==> muxbn ?B 0"
% \end{verbatim}
\begin{proof}
\GLRc{} is derivable from \GLRp{} via \GLR(X,B). By
Lemma~\ref{l-gr20e}, \lxxc\ is derivable. The rest of the proof is
by induction on the derivation of $\Box B^k, Y \vdash Z$, in effect,
by tracing relevant occurrences of $\Box B$ up that
derivation. Weakening and contraction involving $\Box B$ are not
problematic, and the axiom $\Box B \vdash \Box B$ is changed to
\GLRc, which is derivable. Suppose an inference $\GLR(Y,C)$ is
encountered, where $B$ is in $Y$. This inference, allowing for
multiple copies of B, is as shown below left where $Z$ is $Y$ with
$B$ deleted:
$$
% \bpf
% \A "$\Box B^k, B^k, \Box Z, Z, \Box C \vdash C$"
% \U "$\Box B^k , \Box Z \vdash \Box C$" "\GLR(Y,C)"
% \epf
% \hspace*{-2em}
% \bpf
% \A "Lemma~\ref{l-gr20e}"
% \U "\lxxc"
% \A "${\Box X, B^k, \Box Z, Z, \Box C \vdash C}$"
% \B "${\Box X, \Box X, X, \Box Z, Z, \Box C \vdash C}$"
% "\textit{mcut(B)}"
% \U "${\Box X, X, \Box Z, Z, \Box C \vdash C}$" "\textit{ctr}"
% \U "${\Box X, \Box Z \vdash \Box C}$" "\GLR(C)"
% \epf
$$
By induction, we have
${\Box X, B^k, \Box Z, Z, \Box C \vdash C}$ is derivable.
From there we have the derivation shown above right.
As the machine-checking process confirmed,
additional weakening steps are necessary
if $\Box B$ is in $Z$ or if $B$ is in $\Box Z$.
% \vpf
% \begin{center}
% \bpf
% \A "\lxxc"
% \A "${\Box X, B^k, \Box Z, Z, \Box C \vdash C}$"
% \B "${\Box X, \Box X, X, \Box Z, Z, \Box C \vdash C}$"
% "(\textit{multicut on B})"
% \U "${\Box X, X, \Box Z, Z, \Box C \vdash C}$" "(\textit{contraction})"
% \U "${\Box X, \Box Z \vdash \Box C}$" "(\GLR(C))"
% \epf
% \end{center}
\qed
\end{proof}
This construction is like that of case 2(a)(ii) in the proof of
Theorem~22 of \cite{gr-valentini}. Having shown, in
Lemma~\ref{l-cw0-mas'}, that \muxbn\ $B$ 0 holds, we now use the
construction of \cite[Lemma~19]{gr-valentini} to show that \muxbn\ $B$
$n$ holds for all $n$: except that our inductive
assumptions and results involve admissibility of multicut, not cut.
Again we use induction: assume that \muxbn\ $B$ $n$, and show that
\muxbn\ $B$ $(n+1)$ holds.
So let a derivation tree $\mu / \GLRc$ be given,
whose bottom inference is \GLRXB, and where \delz\ $B\ \mu = n + 1$.
We follow the construction of \cite[Lemma~19]{gr-valentini}
to obtain a derivation $\mu'$ of \GLRp,
where \delz\ $B\ \mu' \leq n$.
% We first look at deriving \lxxc.
Since \delz\ $B\ \mu > 0$, the tree $\mu$ is as shown below left
(with other branches not shown).
We first delete the topmost application of \GLR{} obtaining
a tree $\mu^{-}$.
Then adjoin $\Box A$ to each antecedent of $\mu^{-}$
obtaining the tree on the right,
whose topmost sequent is now a weakened axiom,
and which requires us to weaken in an occurrence of
$A$ just above the final GLR{}-rule instance:
% \vpf
% \begin{center}
% \bpf
% \A "${\Box G, G, \Box B^k, B^k, \Box A\vdash A} $"
% \U "\gbna" "\GLR(A)"
% \U "\vdots"
% \U "\GLRp"
% \U "\GLRc" "\GLR(X,B)"
% \epf
% \qquad
% \bpf
% % \A "$\Box A \vdash \Box A$"
% % \U "$\Box A, A, \gbna$" "(\textit{weakening})"
% \A "$\Box A, \gbna$"
% \U "\vdots"
% \U "$\Box A, \GLRp$"
% \U "$\Box A, A, \GLRp$" "(\textit{weakening})"
% \U "$\Box A, \GLRc$" "\GLR(B)"
% \epf
% \end{center}
The tree on the right has a smaller value of \delz\ $B$, and so can be
used as the left branch of an admissible (i.e.\ ``smaller'') multicut.
We do this twice, the right branches being derivations of \GLRp\ and
${\Box G, G, \Box B^k, B^k, \Box A\vdash A} $ respectively; which
after contractions, give derivations of $\Box A, \lxxc$ and ${\Box G,
G, \Box X, B^k, \Box A\vdash A}$. These two are cuts 1 and 2 of the
description in \cite[Lemma~19]{gr-valentini}, producing derivations
which are cut-free equivalents of $\Lambda_{11}$ and $\Lambda_{12}$
from \cite[Lemma~19]{gr-valentini}. The result \texttt{gr20a} in the
Isabelle code %Lemma~\ref{l-gr20a}
gives the existence of these two derivations; it uses the result
\texttt{add\_Box\_ant} which permits adding $\Box A$ to the antecedent
of each sequent of a derivation tree.
We combine these derivations of $\Box A, \lxxc$ and $\Box G, G, \Box
X, B^k, \Box A\vdash A$ using an admissible (``smaller'') multicut on
$B$, and then use contraction to obtain $\Box G, G, \Box A, \Box X, X
\vdash A$. This is cut 3 of~\cite[Lemma~19]{gr-valentini}. Then the
\GLR\ rule gives $\Box G, \Box X \vdash \Box A$. This is derivation
$\Lambda_{2}$ of \cite[Lemma~19]{gr-valentini}. Because of this \GLR\
rule at this point, we do not need to worry about the \delz\ $B$
values of any of the subtrees mentioned above, whose existence is
given by the inductive assumptions of multicut-admissibility. We now
weaken the conclusion of this tree to $\Box X, \Box G, \Box B^k \vdash
\Box A$, giving (a counterpart of) $\Lambda_{3}$ of
\cite[Lemma~19]{gr-valentini}.
% \begin{figure}[t]
% \centering
% \normalsize
% \bpf
% \A "\GLRp"
% \U "\GLRc" "(\GLR)"
% \A "\GLRp"
% \B "${\Box X, \Box X, X \vdash B}$" "(\textit{cut})"
% \U "\lxxc" "(\textit{contraction})"
% \epf
% \caption{Derivation of \lxxc\ from \GLRp}
% \label{f-dlxxc}
% \end{figure}
% Returning to $\mu^{-}$, we this time adjoin $\Box X$ on
% the left giving the tree below right,
% but we can now use $\Lambda_{3}$ as a derivation for its leaf:
% \vpf
% \begin{center}
% \bpf
% \A "\gbna"
% \U "\vdots"
% \U "\GLRp"
% \U "\GLRc" "\GLR"
% \epf
% \qquad \qquad
% \bpf
% \A "$\Lambda_3$"
% \U "$\Box X, \gbna$"
% \U "\vdots"
% \U "$\Box X, \GLRp$"
% \U "$\GLRp$" "(\textit{contraction})"
% \U "\GLRc" "\GLR"
% \epf
% \end{center}
% That is, we have replaced a given derivation $\mu$ of \GLRp\, where
% \delz\ $B\ \mu = n + 1$, with a derivation $\mu'$ of \GLRp\ where
% \delz\ $B\ \mu' = n$.
We record this as the result as Lemma~\ref{l-gr20bc}~(\texttt{gr20b})
below, however, we do not use this result directly. Instead, we obtain
Lemma~\ref{l-gr20bc}~(\texttt{gr20c'}) below by referring to
Figure~\ref{f-muxbn} and noting that we have replaced $\mu$ by $\mu'$.
% \comment{
% \begin{verbatim}
% PROBABLY OMIT THIS - AT PRESENT WE HAVE A DESCRIPTION OF THIS RESULT BUT NO
% FORMAL STATEMENT
% gr20a; "[| ([conclDT ?dtglp], ?seq) : glrxb ?X ?B;
% valid glss ?dtglp; del0 ?B ?dtglp = Suc ?n; muxbn ?B ?n |]
% ==> EX A H dtH.
% ({#Box A#} + mset_map Box ?X + ?X |- {#?B#}) : derrec glss {} &
% ({#Box A#} + mset_map Box (?X + ms_delete {?B} H) + H |- {#A#})
% : derrec glss {} &
% premsDT dtH = [mset_map Box H |- {#Box A#}] &
% conclDT dtH = conclDT ?dtglp &
% allDT (frg glss) dtH & del0 ?B dtH <= ?n" : Thm.thm
% \end{verbatim}
% }
\begin{lemma}[\texttt{gr20b, gr20c'}] \label{l-gr20bc}
Assume that multicut-admissibility holds for cut-formula $B$,
and that \muxbn\ $B$ $n$ holds.
\begin{enumerate}
\item \label{l-gr20b}
If $\mu$ is a derivation of \GLRp, where \delz\ $B\ \mu = n + 1$,
then there exists another derivation $\mu'$ of \GLRp\ with
\delz\ $B\ \mu' \leq n$.
\item \label{l-gr20c} \muxbn\ $B$ $(n+1)$ holds.
\end{enumerate}
\begin{verbatim}
gr20b : "[| ((conclDT ?mu) =
(mset_map Box ?X + ?X + {# Box ?B#} |- {#?B#})) ;
valid glss ?mu ; del0 ?B ?mu = Suc ?n ;
mas glss ?B = UNIV; muxbn ?B ?n |]
==> EX dtn.
valid glss dtn & conclDT dtn = conclDT ?mu & del0 ?B dtn <= ?n"
gr20c' : "[| muxbn ?B ?n; mas glss ?B = UNIV |] ==> muxbn ?B (Suc ?n)"
\end{verbatim}
\end{lemma}
\begin{proof} The proof of the first part is the construction
described above. For the second part, we are given an instance of
Figure~\ref{f-muxbn}, where $\mu$ has \delz\ $B\ \mu = n + 1$.
Using the first part, we can transform $\mu$ to $\mu'$, with \delz\
$B\ \mu' = n$. Since \muxbn\ $B$ $n$ holds, the multicut on $\Box
B$ where the left premise derivation is $\mu' / \GLRc$ is
admissible. Hence the conclusion $\Box X, Y \vdash Z$ of
Figure~\ref{f-muxbn} is derivable.
\end{proof}
The next result, which we do not use directly, approximates to
\cite[Lemma~19]{gr-valentini}.
\begin{lemma}[\texttt{gr20d}] \label{l-gr20d} If
multicut-admissibility holds for cut-formula $B$, \muxbn\ $B\ 0$
holds, and \GLRp\ is derivable, then \lxxc\ is
derivable.
\end{lemma}
% \begin{verbatim}
% gr20d : "[| ((conclDT ?dtglp) =
% (mset_map Box ?X + ?X + {# Box ?B#} |- {#?B#})) ;
% valid glss ?dtglp; mas glss ?B = UNIV; muxbn ?B 0 |]
% ==> (mset_map Box ?X + ?X |- {#?B#}) : derrec glss {}"
% \end{verbatim}
\begin{proof}
By using Lemma~\ref{l-gr20bc}\ref{l-gr20c} repeatedly, \muxbn\ $B\
n$ holds for any $n$. Then the cut of Figure~\ref{f-dlxxc} is
admissible, giving the result.
\end{proof}
\begin{lemma}[\texttt{cw0\_mas, cut\_glr}] \label{l-cut-glr} If
(shallow) multicut-admissibility holds for cut-formula $B$, then it
holds (deeply) for cut-formula $\Box B$ where the left premise
derivation finishes with a \GLRB\ rule. That is, \muxbn\ $B\ n$
holds for any $n$.
\end{lemma}
\begin{verbatim}
cut_glr : "[| mas glss ?B = UNIV; ([conclDT ?mu], ?seq) : glr ?B |]
==> (Der ?seq [?mu], ?dtb) : masdt glss (Box ?B)"
cw0_mas : "mas glss ?B = UNIV ==> muxbn ?B ?n"
\end{verbatim}
\begin{proof}
By combining Lemma \ref{l-cw0-mas'} with
repeated use of Lemma~\ref{l-gr20bc}\ref{l-gr20c}.
\end{proof}
We now have the admissibility of principal multicuts on $\Box B$,
which we indicated was the crux of the cut-admissibility for \GL{}.
For the other cases, the usual proofs hold. Although there is no
particular difficulty in showing this, many details need to be
checked: this takes almost 3 pages of \cite{gr-valentini}, even with
``omitting details'', and citing ``standard transformations''. The
value of automated theorem proving at this point is simply to ensure
that all the necessary details have been checked. The proof uses
techniques described in \S\ref{s-cagen}, which means that a relatively
small amount of the proofs remaining for this theorem are peculiar to
the \GLS\ calculus. Consequently, we get:
% mar_glss : "[| ?seqa : derrec glss {}; ?seqb : derrec glss {} |]
% ==> (?seqa, ?seqb) : mar glss ?A"
\begin{theorem}
Multicut is admissible in \GLS.
\begin{verbatim}
(?G |- ?D) : derrec glss {} ; (?P |- ?S) : derrec glss {}
===> (?G + (ms_delete {?A} ?P) |- (ms_delete {?A} ?D) + ?S))
: derrec glss {}
\end{verbatim}
\end{theorem}
% \section{Conclusions and Further Work}
% \label{s-concl-fw}
% We have described an proof in the Isabelle theorem prover of
% cut admissibility for the sequent calculus \GLS\ for the provability logic \GL.
% The proof is based on the proof given in \cite{gr-valentini},
% particularly the construction in \cite[Lemma~19]{gr-valentini},
% though our presentation uses it in a slightly different way.
% We have made considerable use of definitions and theorems which are
% useful beyond this particular sequent system, or, indeed, beyond
% proofs about sequent systems. We have developed axiomatic type
% classes based on properties of $+$, $-$, $0$ and $\leq$ for multisets,
% and shown how a rather small set of axioms about these operators is
% sufficient for defining a lattice. Multiset sequents (pairs of
% multisets) also belong to this type class. We have applied relevant
% simplification procedures to this type class, which was useful in the
% proofs. We have described extensive definitions and theorems relating
% to abstract derivability, which we have used in several different
% metalogical theories and proofs, and we have discussed the issue of
% deep versus shallow embeddings in the light of the experience of this
% and previous work.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Here begins the end matter
% \bibliographystyle{plain}
% \bibliography{../ce,/home/discus/disk1/rpg/bib/fm,/home/discus/disk1/rpg/bib/modal,/home/discus/disk1/rpg/bib/mtl,/home/discus/disk1/rpg/bib/logic,/home/discus/disk1/rpg/bib/tl,/home/discus/disk1/rpg/bib/paracon,/home/discus/disk1/rpg/bib/linear,/home/discus/disk1/rpg/bib/algebra,/home/discus/disk1/rpg/bib/relevant,/home/discus/disk1/rpg/bib/dl,/home/discus/disk1/rpg/bib/atp}
% \begin{thebibliography}{10}
% \bibitem{belnap-display}
% N~D {Belnap}.
% \newblock Display logic.
% \newblock {\em J.\ of Philosophical Logic}, 11:375--417, 1982.
% % \bibitem{borisavljevic-dosen-petric-permuting-cut-with-contraction}
% % M Borisavljevic, K Dosen, and Z Petric.
% % \newblock On permuting cut with contraction.
% % \newblock {\em Mathematical Structures in Computer Science}, 10:99--136, 2000.
% % \bibitem{church-types}
% % A~Church.
% % \newblock A formulation of the simple theory of types.
% % \newblock {\em Journal of Symbolic Logic}, 5:56--68, 1940.
% \bibitem{dawson-gore-embedding-comparing}
% J~E Dawson and R Gor{\'e}.
% \newblock Embedding display calculi into logical frameworks: Comparing {T}welf
% and {I}sabelle.
% \newblock {\em ENTCS} 42, 89--103.
% %\newblock In C~Fidge (Ed), {\em Proc.\ CATS 2001}, {\em ENTCS}, 42:89--103,
% % \url{http://www.elsevier.nl/gej-ng/31/29/23/68/show/Products/notes/cover.htt},
% 2001. Elsevier.
% \bibitem{dawson-gore-cut-admissibility}
% J~E Dawson and R Gor{\'e}.
% \newblock Formalised Cut Admissibility for Display Logic.
% \newblock In Proc.\ TPHOLS'02, LNCS 2410, 131--147, Springer, 2002.
% % \bibitem{dawson-gore-machine-checked-strong-normalisation}
% % Jeremy~E.~Dawson and Rajeev Gor{\'e}.
% % A New Machine-checked Proof of Strong Normalisation for Display Logic.
% % In Computing: The Australasian Theory Symposium, 2003, ENTCS 78, 16--35, Elsevier.
% \bibitem{abella-gacek}
% A~Gacek.
% \newblock The Abella interactive theorem prover (system description)
% \newblock In Proc.\ IJCAR 2008 LNCS~5195:154-161, Springer, 2008.
% \bibitem{HOL-introduction}
% M~J~C Gordon and T~F Melham.
% \newblock {\em Introduction to {HOL}: a Theorem Proving Environment for Higher
% Order Logic}.
% \newblock Cambridge University Press, 1993.
% \bibitem{gore-india}
% R~Gor\'e.
% Invited talk in the Third Indian Conference on Logic and its
% Applications, Chennai, January 2009.
% \bibitem{gr-valentini}
% R~Gor\'e and R~Ramanayake.
% Valentini's Cut-elimination for Provability Logic Resolved.
% Proc.\ AiML 2008, pp 67-86, College Publications
% \url{http://users.rsise.anu.edu.au/~rpg/publications.html}
% % \bibitem{gore-relalg}
% % R~Gor{\'{e}}.
% % \newblock Cut-free display calculi for relation algebras.
% % \newblock In Proc.\ CSL96, LNCS 1258, 198--210, Springer, 1997.
% \bibitem{Lei81}
% D~Leivant.
% \newblock On the Proof Theory of the Modal Logic for Arithmetic Provability.
% Journal of Symbolic Logic, 46:531–538, 1981.
% \bibitem{moen}
% A~Moen.
% \newblock The proposed algorithms for eliminating cuts in the
% provability calculus GLS do not terminate.
% \newblock NWPT 2001, Norwegian Computing Center, 2001-12-10
% \url{http://publ.nr.no/3411}
% \bibitem{IsLogHOL}
% T Nipkow, L~C. Paulson, and M Wenzel.
% \newblock Isabelle's logics: {HOL}.
% \newblock Technical report.
% \newblock 15 February 2001, \texttt{doc/logics-HOL.dvi} in the Isabelle
% distribution.
% \bibitem{isabelle-ref}
% L~C. Paulson.
% \newblock The {Isabelle} reference manual.
% \newblock Technical report.
% \newblock 15 February 2001, \texttt{doc/ref.dvi} in the Isabelle distribution.
% \bibitem{pfenning-structural-lics94}
% F Pfenning.
% \newblock Structural cut elimination.
% \newblock In {\em Proc.\ LICS 94}, 1994.
% \bibitem{pym-harland-uniform}
% D. Pym and J. Harland.
% \newblock A Uniform Proof-theoretic Investigation of Linear Logic Programming.
% \newblock J.\ of Logic and Computation 4:2:175-207, 1994.
% \bibitem{twelf-system}
% F~Pfenning and C~Schürmann.
% \newblock System description:
% Twelf --- a meta-logical framework for deductive systems.
% \newblock In Proc.\ CADE-16, pages 202--206, Springer-Verlag LNAI 1632.
% \bibitem{Sol76}
% R~M~Solovay.
% \newblock Provability Interpretations of Modal Logic. Israel Journal
% of Mathematics, 25:287–304, 1976.
% \bibitem{Val83}
% S~Valentini.
% \newblock The Modal Logic of Provability: Cut-elimination. Journal of
% Philosophical Logic, 12:471–476, 1983.
% \bibitem{wansing-displaying-modal-logic}
% H Wansing.
% \newblock {\em Displaying Modal Logic}, volume~3 of {\em Trends in Logic}.
% \newblock Kluwer Academic Publishers, Dordrecht, August 1998.
% \end{thebibliography}
% \end{document}
% %%% Local Variables:
% %%% mode: latex
% %%% TeX-master: t
% %%% End: