Background Metalogic
In this chapter, our goal is to prove results about first order logic rather than in it. Principally, we shall show that the natural deduction calculus and the semantic definition of entailment fit together in that they produce the same set of sequents, though along the way we shall establish other important facts about the proof system and its semantics. For the purposes of reasoning at this level, there are a few techniques which occur on every page and which need to be understood. If you are already familiar with mathematical induction, and the use of sets to reason about least fixed points, you will probably find little new in this section; if you are not, please read on.
Sets
Basics
A set is the simplest form of collection of objects. The great virtue of sets is that a set is a single object which we can refer to and reason about, regardless of how many objects are in the set or of whether we can even know for certain which objects they are. Some sets contain just a few things: the set {Bob, Carol, Ted, Alice}, for example, contains just four people. The order in which we list them between the curly brackets is irrelevant, as sets are identical if they have the same members, regardless of how the sets or their members are specified. Other sets may have infinitely many members, and this is where set-based reasoning really becomes important. The set of formulae of propositional logic, for example, is just one well-defined object, so we can use it to reason about all formulae although it is not even in principle possible to list them all in a finite time.
As a special case, for any object a there is the set with a as its sole member. This set {a} is called the unit set or the singleton of a. An even more special case is the set with no members at all. This is the empty set, ∅, also called the null set, which is surprisingly useful in a lot of set-based constructions.
This is not the place for an exposition of set theory, but some of its vocabulary and concepts are needed for present purposes and should be noted. The basic relation involving sets is membership, and for this we follow standard practice by using a stylised epsilon as the symbol, writing
a ∈ s
to mean that object a is in set s. Note that a may itself be a set, since sets of sets are important in many applications.
The last basic concept is that of set inclusion. If every member of set a is also a member of set b, we say that a is a subset of b, or that b is a superset of a, or that b includes a, and write
a ⊆ b
to symbolise this relationship between a and b. Note that every set is a subset of itself on this definition, and that if a ⊆ b and b ⊆ a, then a = b because in that case they have exactly the same members.
Intersections, unions and differences
The intersection a ∩ b of two sets a and b is the set of things that are in both a and b, while their union a ∪ b is the set of things that are in at least one of them. The set difference a ∖ b is a without the things in b:
∀x(x ∈ a ∩ b ↔ (x ∈ a ∧ x ∈ b))
∀x(x ∈ a ∪ b ↔ (x ∈ a ∨ x ∈ b))
∀x(x ∈ a ∖ b ↔ (x ∈ a ∧ x ∉ b))
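These definitions can be tried out directly in a language with a built-in set type. The following sketch uses Python's set operators (& for ∩, | for ∪, - for ∖) on two small example sets chosen for illustration:

```python
# Two small example sets.
a = {1, 2, 3, 4}
b = {3, 4, 5}

print(a & b)   # intersection a ∩ b: {3, 4}
print(a | b)   # union a ∪ b: {1, 2, 3, 4, 5}
print(a - b)   # difference a ∖ b: {1, 2}
```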
Intersections and unions of sets (or "families") of sets also exist: where S is a set of sets, ⋂S and ⋃S are the intersection and union of S respectively:
∀x(x ∈ ⋂S ↔ ∀(y: y ∈ S) x ∈ y)
∀x(x ∈ ⋃S ↔ ∃(y: y ∈ S) x ∈ y)
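The generalised operations can also be computed for a finite family of sets. In Python, `set.intersection` and `set.union` accept any number of arguments, so ⋂S and ⋃S each come out as a single call (the example family here is made up for illustration):

```python
# A small family S of sets.
S = [{1, 2, 3}, {2, 3, 4}, {2, 5}]

big_intersection = set.intersection(*S)  # ⋂S: things in every member of S
big_union = set.union(*S)                # ⋃S: things in at least one member

print(big_intersection)  # {2}
print(big_union)         # {1, 2, 3, 4, 5}
```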
Where S is defined as a list of sets S1, ..., Sn, we sometimes write
⋃_{i=1}^{n} Si
for instance, to refer to the union of all the sets Si for i in the range from 1 to n. This notation is especially useful when the list of sets is infinite and indexed by the positive integers: in that case, the infinite union over all of them is
⋃_{i=1}^{∞} Si
This is the set of all objects that are in at least one of the (infinitely many) sets S1, S2, etc.
Sequences, products and powers
Sometimes it is convenient to consider a list of objects, where the order in which they occur is significant. In that case we are interested not just in the set of objects, but in the sequence of them. Sequences are somewhat like sets except that:
- order matters, so <a,b> is not the same sequence as <b,a> although {a,b} is the same set as {b,a};
- repetitions matter, so <a,b> is not the same sequence as <a,a,b> although {a,b} is the same set as {a,a,b}.
Finite sequences of length n are often referred to as n-tuples. Infinite sequences are also important. Sometimes these can be seen as indexed by the natural numbers, so that we can speak of the n-th term of the sequence, but there are other possibilities. In any case, a sequence of members of a set s can be construed as a set of ordered pairs <a,i> where a is a member of s and i is a member of an index set I, there being one such pair for every i in I. I need not be the natural numbers, though it must share their essential feature of being well ordered, meaning that it is totally ordered in such a way that every subset of it has a first member in that order.
Where S1 and S2 are any sets, their cartesian product
S1 × S2
is the set of ordered pairs <a,b> with a ∈ S1 and b ∈ S2. In the same way,
S1 × S2 × S3
is the set of ordered triples <a,b,c> with a ∈ S1, b ∈ S2, and c ∈ S3, and so forth.
For any set S, the set S × S contains all the pairs of things from S. By analogy with arithmetical notation, we usually refer to it as S² (pronounced "S squared"). More generally, Sⁿ is the set of all the n-tuples of members of S. As a limiting case, we may simplify a little by identifying S¹ with S, since it generally does no harm to identify a sequence <a> of length 1 with the object a itself. The even more limiting case S⁰ is simply the singleton {∅}, since the empty set ∅ is the only sequence of length 0. In the other direction, Sω is the set of all infinite sequences of members of S indexed by the natural numbers, since "ω" is the standard notation for the limit of the order 1, 2, 3, etc.
Powers of sets in this sense have already been used in these notes, in defining the semantics of first order logic, where a relation is taken to be a subset of a power of the domain, and the domain is just a non-empty set.
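For finite sets, cartesian products and finite powers can be computed with Python's `itertools.product`. The sketch below (an illustrative example, with made-up sets) checks the expected sizes: |S1 × S2| = |S1|·|S2|, and the power Sⁿ has |S|ⁿ members.

```python
from itertools import product

S1 = {0, 1}
S2 = {'a', 'b', 'c'}

pairs = set(product(S1, S2))        # S1 × S2: all ordered pairs
cubes = set(product(S1, repeat=3))  # the power S1³: all triples from S1

print(len(pairs))  # 6, i.e. 2 * 3
print(len(cubes))  # 8, i.e. 2 ** 3
```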
Mathematical Induction
Induction over numbers
Suppose we want to show rigorously that some property holds of all natural numbers. For instance, we may wish to prove that the square root of 2 is irrational, following roughly the reasoning of the Pythagoreans. The problem reduces to one of natural number theory, since it is a matter of showing that there are no positive integers a and b such that (a⁄b)² = 2, which is to say the equation a² = 2b² has no (positive) integer solution. That is, we need to prove ∀x∀y((x+1)² ≠ 2y²) as a theorem about the natural numbers. How should we proceed?
We look at the possible values of a (or equivalently, we could look at b, or at a+b or some similarly sensible measure of the size of the instance). We observe firstly that the smallest positive number 1 cannot be the value of a. That establishes that there is at least an initial segment of the natural numbers in which there is no solution. Then we show that however big that initial segment may be, it can always be extended to an even bigger one without allowing any numbers which satisfy the equation. That is, we suppose that no number smaller than an arbitrary n can serve as a in the equation, and show on that supposition that a cannot be n either. Recall the essential property of the natural numbers that every non-empty set of them has a smallest member. Since no number n can be the smallest value of a (greater than 0) such that for some b, a² = 2b², the set of solutions to the equation must be empty.
To complete the reasoning, we need to establish the conditional that if there is no solution with a < n then a ≠ n. This is easy: assume n² = 2x². Then n² is even, so n is even, meaning there is a y such that n = 2y. But in that case 4y² = 2x², so x² = 2y², with x < n, contrary to the hypothesis that there is no solution less than n.
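The induction establishes that there is no solution at all. As a finite sanity check (not a proof, since it only inspects a bounded range) one can search for solutions to a² = 2b² by brute force:

```python
# Search for positive integers a, b below a bound with a^2 = 2*b^2.
# The induction argument guarantees this list is empty for any bound.
bound = 1000
solutions = [(a, b)
             for a in range(1, bound)
             for b in range(1, bound)
             if a * a == 2 * b * b]
print(solutions)  # []
```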
The important point, for present purposes, is the structure of the proof, known as proof by induction. Such a proof establishes a property P of every number x in a few steps:
- The base case: prove the result P(0).
- Formulate the induction hypothesis that P(k) holds, or perhaps that P(x) holds for all x ≤ k.
- The induction case: derive from that assumption that P(k+1) holds.
- Conclude ∀xP(x).
Clearly, if we know that P is true of 0 and is hereditary in the sense that it is preserved by the successor function, then for any individual number n we can show that P is true of it, just by iterating the induction step n times. The principle of mathematical induction allows us instead to conclude the stronger result ∀xP(x) in one step, without the need to go through infinitely many inferences. It amounts to the fact that the set of natural numbers is the intersection of all sets which contain 0 and are closed under the successor function (the operation of adding one).
Induction is a powerful tool, as it allows results to be proved about infinite sets, provided they can be enumerated, or put into one-to-one correspondence with the natural numbers.
Induction in logic
In reasoning about logic, we frequently need to establish results about easily enumerated sets, so induction is very common. For example, consider the following proof that every truth function is expressible by means of a formula in negation normal form.
Proof is by induction on the arity of the function (i.e. the number of argument places it has). This clearly corresponds to the number of atoms in the formula. We could take the base case to be that of arity zero, meaning just a truth value, but it is a little less messy to ignore that case, so the base case for the induction is that of arity 1. By inspection, we see that there are four possible truth functions of one argument, all of which are expressible. The induction hypothesis is that every truth function of arity k is expressed by a formula in NNF with k atoms. Let φ be a function of k+1 boolean arguments b1, ..., bk, bk+1. Let ψ0 and ψ1 be defined:

ψ0(b1, ..., bk) = φ(b1, ..., bk, 0)
ψ1(b1, ..., bk) = φ(b1, ..., bk, 1)

Then for any inputs b1, ..., bk+1 the value of φ(b1, ..., bk+1) is either ψ0(b1, ..., bk) or ψ1(b1, ..., bk), depending on the value of bk+1. Note that ψ1 and ψ0 are k-ary truth functions, and are therefore represented by NNF formulae A and B respectively, each containing k atoms. The formula A ∧ pk+1 has the same value as A when pk+1 is true, and is false otherwise. Similarly, B ∧ ¬pk+1 is false when pk+1 is true, and agrees with B when pk+1 is false. The disjunction (A ∧ pk+1) ∨ (B ∧ ¬pk+1) therefore has the same truth value as A if pk+1 is true, and the same value as B if pk+1 is false. That is, it expresses φ, and is in NNF as required. This concludes the proof that all truth functions are expressible.
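The induction is constructive, so it translates directly into a program. The sketch below (an illustrative rendering, not from the text) represents a truth function as a Python function of n booleans and builds an NNF formula string by exactly the ψ0/ψ1 decomposition: the base case hard-codes the four unary truth functions, and the induction step forms (A ∧ pn) ∨ (B ∧ ¬pn).

```python
def to_nnf(f, n):
    """Return an NNF formula (a string over atoms p1..pn) expressing
    the n-ary truth function f, by the inductive construction."""
    if n == 1:
        # Base case: the four truth functions of one argument,
        # keyed by the pair of values (f(False), f(True)).
        table = {(False, False): "(p1 ∧ ¬p1)",  # constant false
                 (False, True):  "p1",          # identity
                 (True,  False): "¬p1",         # negation
                 (True,  True):  "(p1 ∨ ¬p1)"}  # constant true
        return table[(f(False), f(True))]
    # Induction step: split on the last argument.
    A = to_nnf(lambda *bs: f(*bs, True),  n - 1)  # expresses ψ1
    B = to_nnf(lambda *bs: f(*bs, False), n - 1)  # expresses ψ0
    return f"(({A} ∧ p{n}) ∨ ({B} ∧ ¬p{n}))"

# Exclusive or comes out as (¬p1 ∧ p2) ∨ (p1 ∧ ¬p2), as expected.
print(to_nnf(lambda a, b: a != b, 2))
```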
Many sets of structures are constructed step by step. Just as the natural numbers are built up from 0 by applying the successor function, so the formulae are built up from atoms by applying connectives and quantifiers. In the same way, proofs are built up from assumptions by applying rules of inference, and certain classes of models can be built up from a minimal starting point by an operation of extending the interpretation function to the next symbol in a language. Induction works in a similar way in all such cases: show that the desired property holds at the start; show that it is preserved by the construction; conclude that it holds of everything.
For the purposes of reasoning about formal systems such as the natural deduction calculus, the commonest uses of induction are: (i) induction on the structure of formulae, and (ii) induction on the construction of proofs. A classic example of (i) is the proof that inter-derivable formulae can replace each other anywhere as subformulae of a formula without upsetting natural deduction proofs. A good example of (ii) is the proof of soundness: the rule of assumptions is sound (base case) and the other rules preserve validity (induction cases) so every provable sequent is valid. A slightly less obvious example is the reasoning implicit in the proof that every sequent provable using the primitive negation rules ¬E and ¬I can be proved using RAA instead.
It is always possible to re-cast induction over the structure of formulae or of proofs as induction over natural numbers, by counting things. Thus induction on formula construction can be construed as induction on the number of connectives or quantifiers in a formula, while induction on proof structure can be seen as induction on the number of inference steps a proof involves. However, doing this is a little artificial, and not really necessary, as it brings us nothing new.
Incremental construction
Definition
It is common in mathematics that we specify a set by defining a construction which builds it one member at a time. That is, we start with some base set (often the empty set) and go through some list (usually infinite) of items deciding in each case whether that item is in the collection or not. Sometimes, at each step, we consider a well-defined set of objects and choose one of them as the next one to add. The point of such a construction is that it is possible to run inductions over it, thus establishing properties of the final collection or of all its members.
Of course, the intuitive picture of making a collection bigger by adding items to it is not strictly accurate, because when the set gains more members, it is no longer the same set. The correct description is that the construction results in a sequence of sets, each one having the same members as its predecessor except for the added item (if any). Typically, the list of items to be considered for membership
a1, ..., an, ... etc
is infinite, and consequently the sequence of sets constructed
S0, S1, ..., Sn, ... etc
is also infinite, so the construction cannot be completed in practice. However, this does not matter, because the goal is not to carry out the operations in question, but rather to note that the sequence of them can be defined in principle, so there is such a thing as the (infinite) sequence of larger and larger sets. It follows that there is such a thing as their union Σ:
Σ = ⋃_{i≥0} Si
We then typically go on to show that Σ has some desirable properties, appealing to the construction to establish these. The clause saying whether the next item ai should be in Si (and therefore in Σ) typically specifies a relationship between ai and Si-1. In many cases, this relationship is something we can compute, but it does not have to be. Even where it is undecidable in principle, it either holds or it does not, so the next set Si is what it is.
Examples
As a non-logical example of iterative construction, consider the set 𝒩 of "novel" integers. We say that an integer is novel (I just made this up) iff it is positive and not the sum of any two distinct novel integers. So 1 and 2 are novel, because no two distinct smaller positive integers add up to them, but 3 is not novel because it is the sum of 1 and 2. 4 is novel, 5 and 6 are not, 7 is, and so forth. The obvious incremental construction is:
Let N0, N1, N2, etc. be an infinite sequence of sets constructed incrementally as follows:
N0 = ∅
For each number i ≥ 0, if ∃(x: x ∈ Ni) ∃(y: y ∈ Ni) (x ≠ y ∧ x + y = i+1) then Ni+1 = Ni. Otherwise Ni+1 = Ni ∪ {i+1}.
𝒩 = ⋃_{i≥0} Ni
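The construction is computable, so an initial segment of 𝒩 can be generated directly. The following sketch folds the sequence N0, N1, ... into a single growing set:

```python
def novel_up_to(m):
    """Return the set of novel integers <= m, built incrementally:
    i is added unless it is the sum of two distinct members already
    in the set."""
    N = set()
    for i in range(1, m + 1):
        # Is i the sum of two distinct novel numbers found so far?
        if not any(i - x in N and i - x != x for x in N):
            N.add(i)
    return N

print(sorted(novel_up_to(16)))  # [1, 2, 4, 7, 10, 13, 16]
```

As the text says: 1, 2 and 4 are novel, 3 (= 1+2), 5 (= 1+4) and 6 (= 2+4) are not, 7 is, and so on.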
Another very clear example is found in the proof of completeness via the literal basis lemma. We wish to define a set which contains exactly one of p or its negation ¬p for every atom p in the language. The construction is as follows:
Let the atoms of our propositional language be p1, p2, p3, ..., etc. We build up the required interpretation incrementally by defining a sequence Y0, Y1, Y2, ... of sets of literals, each one extending the previous one by adding the next atom or its negation:
Y0 = ∅
For all i ≥ 0, if X, Yi, pi+1 ⊢ A then Yi+1 = Yi ∪ {¬pi+1}. Otherwise Yi+1 = Yi ∪ {pi+1}.
Y = ⋃_{i≥0} Yi
Clearly, by construction, any literal pi or ¬pi is in Y iff it is in Yi, and iff it is in Yj for every j ≥ i.
This example closely follows the recipe for constructing the set Y incrementally, starting with the empty set and adding atoms or their negations to it one by one. At each stage, the next set is one of two possibilities, the choice depending on whether a derivation exists. We do not have to search for such derivations in order to use the construction: it is enough that derivations either exist or else they do not. Similarly, we do not have to wait for ever for the construction to terminate before we can reason about the set Y: it is enough that there is such a thing as the defined sequence and that Y is its union.
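For propositional logic, derivability is decidable, so this kind of construction can even be run mechanically. The sketch below stands in for ⊢ with a truth-table entailment test over formulas written as Python boolean expressions; this toy representation, the names `entails` and `literal_basis`, and the example formulas are illustrative assumptions, not from the text.

```python
from itertools import product

def entails(premises, conclusion, atoms):
    """Truth-table test: every assignment making all premises true
    makes the conclusion true. Formulas are Python boolean
    expressions over the atom names."""
    for values in product([False, True], repeat=len(atoms)):
        env = dict(zip(atoms, values))
        if all(eval(f, {}, env) for f in premises) and not eval(conclusion, {}, env):
            return False
    return True

def literal_basis(atoms, X, A):
    """Build Y one literal at a time: add the negation of the next
    atom if X, Y and that atom together would yield A, and the atom
    itself otherwise."""
    Y = set()
    for p in atoms:
        if entails(X | Y | {p}, A, atoms):
            Y.add(f"not {p}")
        else:
            Y.add(p)
    return Y

# With X = {¬p1} and A an absurdity, the construction is forced to
# pick ¬p1 for the first atom, and is free to pick p2 for the second.
print(literal_basis(["p1", "p2"], {"not p1"}, "p1 and not p1"))
```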
As a third example, this time not taken from the proofs in these notes, suppose we have a set X of formulae, and we wish (for some purpose or other) to define a minimal subset of X with exactly the same logical consequences as X. This time, instead of building up the set incrementally, we incrementally cut it down by removing formulae from it. The construction goes as follows:
Let the formulae in X be enumerated A1, A2, etc. We define a corresponding sequence of sets S0, S1, S2, etc:
S0 = X
If Si ∖ {Ai+1} ⊢ Ai+1 then Si+1 = Si ∖ {Ai+1}. Otherwise Si+1 = Si.
Σ = ⋂_{i≥0} Si
This time Σ is the intersection of the (possibly infinite) family of sets, rather than their union, but the principle is the same. If we wished to make the incremental construction follow the recipe more closely, we could think of it as building up the set of formulae to be removed, rather than as cutting down the set they are removed from; obviously, the effect is the same.
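As with the previous construction, the cutting-down can be run mechanically in the propositional case. The sketch below (the names and the toy truth-table test are illustrative assumptions, not from the text) removes each formula in turn whenever the rest of what remains already yields it:

```python
from itertools import product

def entails(premises, conclusion, atoms):
    """Truth-table entailment test for formulas written as Python
    boolean expressions over the atom names."""
    for values in product([False, True], repeat=len(atoms)):
        env = dict(zip(atoms, values))
        if all(eval(f, {}, env) for f in premises) and not eval(conclusion, {}, env):
            return False
    return True

def minimal_subset(formulas, atoms):
    """Remove each formula in turn when the remaining formulas
    entail it, mirroring the cutting-down construction."""
    S = list(formulas)
    for f in formulas:
        rest = [g for g in S if g != f]
        if entails(set(rest), f, atoms):
            S = rest
    return S

# p1 and p2 are each redundant given their conjunction, which survives.
print(minimal_subset(["p1", "p2", "p1 and p2"], ["p1", "p2"]))
```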