\section{Introduction}

Relation algebras are extensions of Boolean algebras;
whereas Boolean algebras model subsets of a given set,
relation algebras model binary relations on a given set.
Relation algebras form the basis of relational databases \cite{db} and 
of the specification and proof of correctness of programs
(\cite{wp},\cite{hoare}).
Just as Boolean algebras can be studied as a logical system
(classical propositional logic), relation algebras can also
be studied in a logical, rather than an algebraic, fashion.

In particular, relation algebras can be formulated using Display
Logic \cite{dRA}, and in several other ways \cite{Mad}.

Display Logic \cite{Bel} is a syntactic proof system for non-classical logic,
based on the Gentzen sequent calculus \cite{Gallier}.
%It can be applied to give syntactic proof systems for various logics.
Its advantages include a generic cut-elimination theorem, which 
applies whenever the rules for the specific logic satisfy certain conditions.

Isabelle is an automated proof assistant \cite{Int},\cite{Ref}.
It is written in ML, and the user interacts with it by issuing further ML
commands; this permits programming complex arrangements of
proof steps in ML.
The capabilities built in include higher-order unification and 
term rewriting.
Isabelle permits the user to define his/her own logic in which to perform
proofs, although in practice most users would build on one of the
systems of logic supplied \cite{OL}.

The aim of the project was to implement the display logic formulation of 
relation algebras in Isabelle, and to demonstrate the relationship 
between this and some other formulations of relation algebras.
As the theory of relation algebras is undecidable, 
it is useful to implement the theory in
a powerful interactive proof assistant such as Isabelle.

\section{Relation Algebras}

A relation is a common concept; for example
given a set $P$ of people, we can consider the relation 
``$x$ is the father of $y$''.
This could be written father$(x,y)$;
alternatively, if we define
$$F = \{ (x,y) \,|\, x \in P, y \in P, x \textrm{ is the father of } y \}$$
then $(x,y) \in F$ means that ``$x$ is the father of $y$''.
Mathematically, a \textbf{relation}\index{relation|textbf}
is a set of pairs such as $F$.

Relations are a primary concept in many areas of computer science.
For example,
a set of (\textit{pre},\textit{post})
pairs can be used to model a program
or program specification;
it means that if the program is executed with the machine in 
initial state \textit{pre}, and terminates with the machine in 
state \textit{post}, then (\textit{pre},\textit{post}) is one of the pairs
in the set (\cite{hoare}).
As a relation is just a set of pairs,
this models a program as a relation on the state space of the machine.
(This is addressed further in \S\ref{Peirce} and \S\ref{Peirce-appl}).

Numerous attempts to describe a calculus for manipulating relations
(see \cite{Mad}) led to the characterization of relation algebras
by a system of equations, as described in \S\ref{ra-intro}.
Relation algebras provide an abstraction of the notion of binary
relations on a set;
reference to the actual elements of the set is abstracted away, leaving
the relations themselves as the objects being considered.
(This is analogous to the way in which Boolean algebras provide an abstraction
of the subsets of a set, where there is no reference to single elements of the
set).

Relation algebras thus have applications wherever relations are a 
primary concept,
such as in the modelling of programs and specifications 
and in knowledge representation (for example, in
%\emph
{relational} databases \cite{db}).

Some more theoretical background information about relation algebras
is given in \S\ref{ra-intro} and \S\ref{ra-class}.

\section{Gentzen sequents and Display Logic}

%There are several major paradigms for deductive (proof) systems.
%A \emph{Hilbert system}\index{Hilbert system}
%normally has a number of axioms and just one 
%inference rule, of the form ``from $A \Rightarrow B$ and $A$, deduce $B$''.
%Derived propositions accumulate; an inference step can 
%produce an instance of an axiom, or use an inference rule and previously 
%derived propositions to derive a new proposition.
%
%A \emph{natural deduction}\index{natural deduction}
%system is similar, but tends to have few axioms and many
%inference rules, including in particular one of the form
%``if assuming $A$ enables $B$ to be derived, deduce $A \Rightarrow B$''.
%This tends to make proofs much easier than in a Hilbert system.
%
%In Hilbert or natural deduction systems, $\Gamma \vdash P$ is often used to
%mean that $P$ can be derived from (instances of)
%the axioms and rules of the system,
%together with the assumptions in the set $\Gamma$.
%
%A \emph{term rewriting}\index{term rewriting}
%or \emph{equational}\index{equational}
%system allows proving the equivalence of terms by
%rewriting any chosen subterm using the equivalence rules of the system.
%(A proposition is proved by rewriting it to `true').

A Gentzen-type \emph{sequent calculus}\index{sequent calculus}
(see, for example, \cite{Gallier})
proves statements of the form
$\Gamma \vdash \Delta$, called sequents,
where $\Gamma$ and $\Delta$ denote comma-separated lists of formulae.
The comma is not a logical symbol but a ``structural'' one,
whose meaning depends on the side on which it appears.
The $\vdash$ (``turnstile'') symbol has a meaning analogous to deducibility.
%but it is part of the system, in that the axioms and rules all
%involve sequents (which contain $\vdash$).
%A Gentzen sequent is of the form 
For example,
$P,Q \vdash R,S$
means 
``($P$~and~$Q$) entails ($R$ or $S$)''.
A sequent rule is of the form
$$\frac {\mathcal P_1 \quad \mathcal P_2 \quad \ldots \quad \mathcal P_k }
{\mathcal C }$$
where each $\mathcal P_i$ is a sequent, called a premise, and 
$\mathcal C$ is also a sequent, called the conclusion.
The sequent calculus rules are primarily ``introduction rules'',
where the conclusion contains a connective not present in the premise(s).
For example, the rules for introducing the connective ``$\land$'' (``and'')
into the left and right sides of a sequent are
$$ \frac {\Gamma, A, B \vdash \Delta} {\Gamma, A \land B \vdash \Delta}
(\land \vdash) \quad \textrm{ and } \quad 
\frac {\Gamma \vdash \Delta, A \qquad \Gamma \vdash \Delta, B}
{\Gamma \vdash \Delta, A \land B}(\vdash \land )$$ 
The rule ($\land \vdash$), for example, may be understood informally as that 
if, from $\Gamma$ and $A$ and $B$, some $\delta$ in $\Delta$ follows,
then some $\delta$ in $\Delta$ follows from $\Gamma$ and $A \land B$.

The rules for introducing the connective ``$\supset$'' (`implies') into the
left and right sides of a sequent are
$$ \frac {\Gamma \vdash \Delta, A \qquad \Gamma, B \vdash \Delta}
{\Gamma, A \supset B \vdash \Delta} (\supset \vdash) \quad \textrm{ and } \quad 
\frac {\Gamma, A \vdash \Delta, B}
{\Gamma \vdash \Delta, A \supset B}(\vdash \supset )$$ 
There is also one rule (Id) with no premises, which says $p \vdash p\:$;
it is found at the leaves of a proof.
We use these rules to give an example proof.
\vpf
\begin{center}
\bpf
\A "$p  \vdash p$" 
\U "$p  \vdash p, q$" "($\vdash M $)"
\A "$q \vdash q$" 
\U "$p , q \vdash q$" "($M \vdash $)"
\B [2.0em] "$p , (p \supset q) \vdash q$" "($\supset \vdash $)"
\U "$p \land (p \supset q) \vdash q$" "($\land \vdash $)"
\epf
\end{center}
The end-sequent intuitively says ``$p$ and (if $p$ then $q$) entail $q$''.
Now suppose that we are asked to prove the goal
$p \land (p \supset q) \vdash q$.
%When such a proof is constructed backwards, ie, starting from the goal
Then the only rule which can produce this sequent as its conclusion is 
($\land \vdash $), as 
$\land$ is the `outside' connective of the formula on the left of the $\vdash$.
Thus we know that the last step, and the second last sequent, 
must be as shown above.
Next, since 
the `outside' connective of the formula $p \supset q $ is $\supset $,
the ($\supset \vdash $) rule is applied.
(When there are no further connectives to work on, the monotonicity rules
($\vdash M $) and ($M \vdash $) enable the sequents to be reduced to the form
of the identity rule $p \vdash p$).
Generally the choice of rules to apply
is dictated by the `outside' connectives of the formulae present.

Therefore, if a proof could be done using only rules like these,
and it were done backwards from the intended goal,
there would be a very restricted choice of rules to be applied at each step.
This is a great advantage in automated proof.

%In fact, if a sequent can be
%proved, without any premises, then it can be proved using these rules.
%On the other hand, one may wish to prove a sequent from premises,
%eg, a result of the form
%$$ \frac {\Gamma' \vdash \Delta'} {\Gamma \vdash \Delta} $$
%(Such a result may be wanted as a lemma for proving a further result).
%To prove such a result
%will normally require the cut rule, ie
%$$ \frac {\Gamma' \vdash \Delta',A \qquad \Gamma,A \vdash \Delta} 
%{\Gamma',\Gamma \vdash \Delta',\Delta} $$
%In doing backwards proof, using the cut rule involves guessing
%the arbitrary formula $A$, which is a disadvantage in doing automated proof.

Display Logic \cite{Bel} is a Gentzen-type sequent calculus.
In Display Logic a distinction is made between 
\emph{structures} and
\emph{formulae}.
A structure is the analogue of (in Gentzen) a sequence of formulae, separated
by commas; a structure is defined as a formula, or a combination of 
structures formed using the structural operators (which include comma).
There are rules (``display postulates")
for manipulating sequents to enable any chosen structure
to be ``displayed", ie, appear by itself, on one side of the turnstile.
These rules are bi-directional (ie, equivalences).
Display Logic is slightly different from 
sequent calculus in that the formula being
introduced by a rule such as
($\land \vdash $) or ($\vdash \land $)
will be by itself on its side of the turnstile.
However the display postulates allow a sequent to be put into this form.
%For example, one such rule is 
%$$ \frac {P,Q \vdash R} {P \vdash R,*Q} $$
%\begin{center}
%\begin{tabular}{@{}c@{}}
%$ {P,Q \vdash R} $ \\ \hline \hline $ {P \vdash R,*Q} $
%\end{tabular}
%\end{center}
%where ``*" is a structural connective corresponding to ``not".
%The third structural operator for classical propositional logic is ``$I$'',
%which means ``true" or ``false", depending on whether it is in
%an \textbf{antecedent} position (within the scope of an even/odd number
%of `*' symbols when on the left/right) or in a \textbf{succedent} position
%(vice versa).

A number of different logical systems can be formulated using the
method, or style, of Display Logic.
These include several normal modal logics, and intuitionistic logic
\cite{ILR}.
(Indeed, we also produced an Isabelle theory for the Display Logic system
{\boldmath $\delta$}\textbf{Kt} for tense logic).
Much of the behaviour of these logics is captured by the
display rules and some other simple ``structural rules",
rules which apply at the structural level, ie, they are expressed in terms of
arbitrary structures rather than formulae.
There are then the ``logical rules", which are rules for introducing the
various logical connectives.
There is also a rule, (cut),
%$$ \frac {\Gamma' \vdash \Delta',A \qquad \Gamma,A \vdash \Delta} 
%{\Gamma',\Gamma \vdash \Delta',\Delta} $$
$$ \frac {\Gamma \vdash A \qquad A \vdash \Delta} 
{\Gamma \vdash \Delta} $$
which captures the transitivity of deduction.
It creates a difficulty (for backwards proof search)
since we must choose some $A$.
However the cut-elimination theorem of sequent calculus 
says that if a sequent is provable, it is provable without using the cut rule.
It is therefore advantageous if the cut-elimination theorem applies.
In fact, it applies in all Display Logics,
provided that the rules satisfy certain conditions,
which are relatively easily checked (\cite{Bel}).

There is a definition for relation algebras in terms of Display
Logic \cite{dRA}, called \dRA\index{dRA@\dRA|textbf}.
It is this that we have implemented in Isabelle (see below).

%It uses six structural operators, three for classical logic, or Boolean algebra
%(as above), and three operators 
%``;", ``$\bullet$", and ``E",
%which, in an antecedent position,
%represent composition and converse
%of relations, and the identity relation.
%The Display Rules bear a striking similarity to the display rules for the 
%Boolean operators, for example
%\begin{center}
%\begin{tabular}{@{}c@{}}
%$ {P;Q \vdash R} $ \\ \hline \hline $ {P \vdash R;*\bullet Q} $
%\end{tabular}
%\end{center}

\section{Automating Display Logic in Isabelle}

Isabelle is an interactive computer-based proof system,
described in \cite{Int} and \cite{Ref}.
Isabelle is written in Standard ML;
the user can use the full facility of ML to write 
complicated and powerful proof procedures (``tactics'').

Isabelle has the unusual feature that it can be applied to
the user's own choice of logical system, for very little is built in.
The built-in logic is described as intuitionistic higher-order logic,
with three operators producing propositions: 
\texttt{==>} (implication, or deducibility),
\texttt{==} (equality, or substitutibility), and
\texttt{!!} (universal quantification).
These operators have only the properties necessary to support their
intended meanings.
For example, since
\mbox{\texttt{[| P; Q |] ==> R}}
means that \texttt R can be deduced from \texttt P and \texttt Q,
it necessarily follows that
\mbox{\texttt{[| P ==> Q; P |] ==> Q},} and that
{\par\noindent\centering
\texttt{([| P; Q |] ==> R) ==> ([| Q; P |] ==> R)}.
\par\noindent}
Likewise, since 
\texttt{P == Q}
means that \texttt P can be replaced by \texttt Q, or vice versa,
\texttt{==} is reflexive, symmetric and transitive.
We note that the Isabelle operator \texttt{==>} 
corresponds to the horizontal bar in the statement of an inference rule.

The user then has to define an object-level logic,
though several common ones (such as one for sequent calculus)
are packaged with the Isabelle distribution \cite{OL}.
Thus a logical system such as 
\dRA\ can be implemented in Isabelle so that the only
proof rules available are those of \dRA.
This contrasts with an implementation in other proof systems, where
the \dRA\ rules would be added to the rules of the logic on which 
that proof system was based.
(Of course, systems can be implemented in this way in Isabelle too;
thus RALL \cite{RALL}
is implemented on top of the \texttt{HOL} and \texttt{Lattice}
Isabelle theories).
The implementation of \dRA\ in Isabelle is described in \S\ref{impl}.

Isabelle can be used to show the equivalence of \dRA\ to the various
other systems; in most cases this serves only to confirm proofs appearing in the
literature, and in many cases it can only provide proofs of 
the various inductive steps of a proof that requires structural induction.
However it shows that \dRA\ is effective for proving the various
axioms or rules of the other systems.

Some \dRA\ proofs are, however, far from straightforward,
particularly if they are to be done without using (cut).
A proof which is done by hand, and then translated into \dRA\
in a natural way, will often contain cuts; the cut-elimination theorem shows
how to eliminate them, but this can produce a longer proof.

By virtue of the cut-elimination theorem 
we can search for a cut-free backwards proof;
there is then, at each step,
a limited choice of rules from which one must be chosen.
In the Display Logic formulation of classical propositional logic
(which is decidable), 
a straightforward implementation of these rules leads naturally to a decision
procedure, described in \S\ref{decproc}.

The procedure used to eliminate occurrences of the cut rule,
to produce a cut-free proof from a proof involving the cut rule,
was also automated in ML, see Chapter~\ref{ch:cut}. 
The functions described in that Chapter also enable manipulation of
proof trees and 
can produce a record of the elementary rules used when a complex
tactic is applied.

A decision procedure, or any systematic automated procedure, 
based on sequent calculus
needs to have methods of doing cut-free proofs.
As the theory of relation algebras is not decidable (see \cite{Schonfeld}), 
no simple application of such rules will guarantee success;
there is a need to find strategies that are useful
for producing cut-free proofs in 
\dRA, with a view to automating them.

\section{Outline of thesis}

In Chapter~\ref{ch:RA} we describe how the \dRA\ theory for relation algebras
is implemented in Isabelle.
A number of tactics and other functions are provided to facilitate proofs,
and to transform between structural and logical forms of sequents.
Some useful derived rules are proved, as are
some theorems from the literature.
We show how the \dRA\ system can justify inferences in the (axiomatic) theory
of relation algebras.

In Chapter~\ref{ch:gord} we describe an equational theory for doing
proofs in relation algebras, and show how its inferences can be justified in
\dRA.

In Chapter~\ref{ch:mad} we describe a different sequent calculus system,
\M, for relation algebras, and show how its sequents and proofs 
correspond to sequents and proofs in \dRA.
As a sequent in \M\ can correspond to any of several sequents in \dRA,
we have to show that these several sequents are equivalent in \dRA.
To show the correspondence of proofs, we have to 
rearrange a proof in \M\ extensively before we can turn it into a
proof in \dRA.

In Chapter~\ref{ch:cut} we describe the procedure for eliminating any
occurrence of the cut rule from a proof in \dRA, and the
ML functions which perform this procedure on an Isabelle proof in \dRA.
Thus a user can construct proofs using cuts, knowing that a
cut-free proof can be obtained automatically.
For example, when proofs were constructed in \dRA\ 
based on proofs in \cite{CT}, these proofs relied significantly on
lemmata proved separately. 
As incorporating a lemma uses the cut rule, 
the cut-elimination procedure is then needed for producing a cut-free proof.