Cut Elimination More about first order logic

This is a technical aside to the Logic Notes, containing a proof of a fact asserted in the body of the notes. It may be skipped by anyone willing to take that assertion on trust, as it assumes rather more mathematical background than is required to follow the general exposition of logic.

The result is the admissibility of "cut" for the sequent calculus formulation of first-order logic. Those details which pertain to formulae containing quantifiers and terms may simply be omitted to yield a proof for propositional logic.

Before proceeding to the proof, we need to establish a few lemmas. These are not hard to observe, but need to be stated for purposes of the proof.

Lemma 1: Let  XY  be a sequent with a cut-free proof in the sequent calculus formulation of first order logic presented in Chapter 8 of these notes. Let m be a name occurring in the sequent. Then there is a name n not occurring in the sequent, such that the result of replacing every occurrence of m in  XY  by n also has a cut-free proof, no longer than the proof of  XY 

Proof: Let π be a cut-free proof of  XY , let n be a name not occurring anywhere in any sequent in π and let π' be the result of replacing every occurrence of m in π by n. If some of the sequents in π contain infinitely many formulae, they may contain every name in the language of the proof, in which case it is necessary to extend that language by adding a new name to it. Clearly n occurs in π' exactly where m occurs in π, so the difference between the two proofs is merely one of spelling, which is irrelevant to the operation of the rules.

Lemma 2: The rule of weakening (monotonicity) is admissible in the sequent calculus formulation of classical first order logic. That is, if any sequent  X  ⊢  Y  has a cut-free proof containing n inference steps, then so does  X, X'  ⊢  Y, Y'  for any sets X' and Y' of additional formulae.

Proof: by induction on the length of the proof of  X  ⊢  Y . In the base case,  X  ⊢  Y  is an axiom; but then so is  X, X'  ⊢  Y, Y' . For the induction step,  X  ⊢  Y  comes by one of the 12 logical rules from other sequents which have shorter proofs. By the induction hypothesis, their weakenings with X' and Y' also have shorter cut-free proofs, and by inspection of the rules we may note that nothing prevents the application of the same rule to those to produce  X, X'  ⊢  Y, Y'  as required, except in the cases of  ⊢ALL and SOME⊢, where there is a side condition to consider. The side condition could be violated if the name being generalised upon occurs in X' or Y'. However, by lemma 1, there is also a proof of  X  ⊢  Y  with the name m replaced by some n not in X, Y, X' or Y, allowing the weakened conclusion to be derived after all.

Lemma 3: Let a sequent  X'  ⊢  Y'  be the result of substituting a term t for a name m throughout another sequent  X  ⊢  Y . Then if  X  ⊢  Y  is provable, so is  X'  ⊢  Y' , without increase in the length of the proof.

Proof: again by induction on the length of the proof of  X  ⊢  Y . In the base case there is nothing to prove, since if  X  ⊢  Y  is an axiom then so is  X'  ⊢  Y' . If  X  ⊢  Y  comes from sequents with shorter proofs by one of the rules for propositional connectives, again it is easy to observe that the same move will work to give a proof of  X  ⊢  Y . That leaves the case where the last inference in the proof of  X  ⊢  Y  is one of the rules introducing a quantifier on the left or on the right. The rules for the univeral quantifier on the left and the existential on the right pose no problems, as the presence of m or t in the side formulae X and Y is irrelevant to the inference, while the result of making the same substitution in the upper (input) sequent as in the lower (output) one is still an instance of the rule.1 If the last inference introduces a universal quantifier on the right, however, or an existential one on the left, there can be a problem: the name n being generalised may occur inside the term t, blocking the analogous inference under substitution. In that case, it is necessary to rewrite the whole proof using another name o, foreign to the entire context, in place of n. Lemma 1 assures us that this can be done. Then the generalisation step can take place and all will be well.

Theorem: The rule "Cut" is admissible in the given sequent calculus formulation of classical first order logic.

Proof: Suppose Cut were added as a rule to the system, and let π be a proof in which it occurs. We shall show how to eliminate a "top" cut—i.e. a cut with no other cut above it—from π, leaving a cut-free proof of the sequent immediately produced by the cut. Clearly, by iterating this removal process, cuts can be removed one by one from π until the whole derivation is cut-free.

Briefly, the technique will be to show that whatever rules are used immediately above the cut, the inferences could have been done in the reverse order, performing the cut first and then the other moves to reach the same conclusion. By iterating this inversion of inference order, we push the cut up above the preceding steps to the top of the proof tree, where it disappears. In fact, the argument is very case-ridden, as there are many rules which could have been used to get one or other of the inputs to the cut, and each possibility is a separate case. Most of the cases are quite easy, but the proliferation of them makes cut elimination rather intricate.

Recall that a cut look like this:

X1, A  ⊢  Y1 X2  ⊢  A, Y2 Cut
X1, X2  ⊢  Y1, Y2

Formally, what we have to establish is that if the two inputs, or upper sequents, X1, A  ⊢  Y1 and X2  ⊢  Y2, A have cut-free proofs, then there exists a cut-free proof of X1, X2  ⊢  Y1, Y2. Proof is by induction on two measures of the complexity of the cut. We define the degree of a cut as the number of connectives and quantifiers in the cut formula (A in the example above), and we define the left rank and right rank as the lengths of the shortest paths up from the cut to the point at which A was introduced on the left and on the right respectively. The rank of the cut is then the sum of its left and right ranks. Now cut C1 is simpler than C2 if it is of lower degree, or if their degrees are the same and the rank of C1 is lower than that of C2. We show that the given top cut can be replaced by one or more simpler cuts, and hence by induction that the complexity can be reduced eventually to zero, in which case the cut is redundant and disappears. The induction hypothesis, then, is that every sequent provable using cuts of complexity less than the given top cut also has a cut-free proof.

Assume first that the left rank is greater than 0. Then the left upper sequent X1, A  ⊢  Y1 does not result by introduction of A, but rather by introduction of another formula in X1 or in Y1. It could, for instance, have come by the rule ⊢ NOT, introducing a formula NOTB into Y1:

X1, A, B  ⊢  Y1 ⊢ NOT
X1, A  ⊢  NOTB, Y1 X2  ⊢  A, Y2 Cut
X1, X2  ⊢  NOTB, Y1, Y2

The sequent X1, A, B  ⊢  Y1 also has A on the left, so in fact it could have been cut with X2  ⊢  A, Y2 before the introduction of NOTB, thus:

X1, A, B  ⊢  Y1 X2  ⊢  A, Y2 Cut
X1, X2, B  ⊢  Y1, Y2 ⊢ NOT
X1, X2  ⊢  NOTB, Y1, Y2

Note that the degree and right rank of this new cut are unchanged from those of the original, but the left rank has been reduced by one. The case in which NOTB is introduced on the left of the sequent by NOT ⊢ is similar, mutatis mutandis.

If the left upper sequent came by introducing a conjunction B AND C on the left by means of the rule AND ⊢, then again we may invert the rule order similarly, going from this:

X1, A, B, C  ⊢  Y1 AND ⊢
X1, A, B AND C  ⊢  Y1 X2  ⊢  A, Y2 Cut
X1, X2, B AND C  ⊢  Y1, Y2

to this:

X1, A, B, C  ⊢  Y1 X2  ⊢  A, Y2 Cut
X1, X2, B, C  ⊢  Y1, Y2 AND ⊢
X1, X2, B AND C  ⊢  Y1, Y2

If the left upper sequent comes by introducing conjunction on the right, however, things are a little more complicated, because there are two inputs to that rule, and A may occur on the left in either or both of them. That is, the inference just above the cut could be any one of these three:

X1.1, A  ⊢  B, Y1.1 X1.2  ⊢  C, Y1.2 ⊢ AND
X1, A  ⊢  B AND C, Y1
X1.1  ⊢  B, Y1.1 X1.2, A  ⊢  C, Y1.2 ⊢ AND
X1, A  ⊢  B AND C, Y1
X1.1, A  ⊢  B, Y1.1 X1.2, A  ⊢  C, Y1.2 ⊢ AND
X1, A  ⊢  B AND C, Y1

where of course X1 and Y1 are  X1.1X1.2  and  Y1.1Y1.2  respectively. In the first and second of these cases, the cut simply moves above the ⊢ AND step as before. In the third case, however, we need two cuts, to remove the two occurrences of the cut formula A:

X1.1, A  ⊢  B, Y1.1 X2  ⊢  A, Y2 Cut X1.2, A  ⊢  C, Y1.2 X2  ⊢  A, Y2 Cut
X1.1, X2  ⊢  B, Y1.1, Y2 X1.2, X2  ⊢  C, Y1.2, Y2 ⊢ AND
X1, X2  ⊢  B AND C, Y1, Y2

Since each of the new cuts is of lower rank than the original, this is fine for the purposes of the cut elimination proof. However, note that the sub-proof of the right upper sequent   X2  ⊢  A, Y2   is now duplicated. This means that the proof after the cuts are eliminated may be longer than it was before—in fact, its size may blow up exponentially.

The rules for introducing disjunctions are exactly dual to those for conjunction, so the argument in those cases is similar to the above. Implication is also more of the same, though with the slight added twist that the subformulae cross the turnstile, as in the case of negation, as well as the left introduction rule having the potential to duplicate the right upper subproof as in the case of disjunction.

The quantifiers introduce another slight complication, in that some rewriting of subproofs may be necessary in order to meet the side conditions on the rules. Suppose, for example, that the left upper sequent comes by introduction of a universal quantifier on the right:

X1, A  ⊢  B, Y1 ⊢ ALL
X1, A  ⊢  ALLvBvm, Y1 X2  ⊢  A, Y2 Cut
X1, X2  ⊢  ALLvBvm, Y1, Y2

This is subject to the usual condition that the name m does not occur in X1 or Y1. However, nothing prevents m from occurring in X2 or Y2, so after inversion with respect to the cut

X1, A  ⊢  B, Y1 X2  ⊢  A, Y2 Cut
X1, X2  ⊢  B, Y1, Y2 ⊢ ALL
X1, X2  ⊢  ALLvBvm, Y1, Y2

the side condition on the rule  ⊢ ALL  may be violated. The solution is to rewrite the (sub)-proof of the left upper sequent using a different name n in place of m throughout, choosing n so that it does not occur anywhere in X1, X2, Y1 or Y2. This can be done, by Lemma 1. A similar step of rewriting may be required for the case of  SOME ⊢  since that is also subject to the side condition. The other quantifier rules  ALL ⊢  and  ⊢ SOME  pose no such problem.

The symmetry of the classical sequent calculus is such that all of the arguments concerning left rank reduction apply analogously to right rank reduction, so we may reasonably declare the rank reduction cases done, meaning that if the rank of a cut is non-zero, it can always be reduced without increasing the degree. Now we must deal with the cases in which the rank is zero. These are of two sorts: one or more of the upper sequents may be an axiom, or else the left upper comes by introduction of the cut formula on the left while the right upper comes by introduction of the same formula on the right.

Suppose the left upper sequent is an axiom. That is, the sets X1A and Y1 have a formula in common. There are two sub-cases: the common formula is A, or it is something else. If it is something else, then the result of the cut  X1, X2  ⊢  Y1, Y2  is also an axiom. If the common formula is A, then the result of the cut is  X1, X2  ⊢  A, Y1, Y2  which is a weakening of the right upper sequent and therefore has a cut-free proof by Lemma 2. Either way, the cut is not necessary and may simply be omitted. By parity of reasoning, the same argument applies if the right upper is an axiom.

Finally, then, we must consider the case in which the last inference in the proof of the left upper sequent introduces A on the left, and that of the right upper introduces A on the right. There are six sub-cases, depending on the form of A.

Case 1: suppose A is of the form NOTB. Then the proof just above the cut looks like this:

X1  ⊢  B, Y1 NOT⊢ X2, B  ⊢  Y2 ⊢NOT
Cut X1, NOTB  ⊢  Y1 X2  ⊢  NOTB, Y2
X1, X2  ⊢  Y1, Y2

Clearly we could omit all mention of NOTB, since it is introduced only to be cut away again immediately, leaving this:

X1  ⊢  B, Y1 X2, B  ⊢  Y2 Cut
X1, X2  ⊢  Y1, Y2

The new cut will normally have positive rank—that is, its rank is greater than that of the cut it replaces—but this does not matter since the cut formula B is of smaller degree than NOTB, so the new cut is simpler than the old one despite its greater rank.

Case 2: A is a conjunction B AND C. Then the cut look like this:

X1, B, C  ⊢  Y1 AND⊢ X2  ⊢  B, Y2 X2  ⊢  C, Y2 ⊢AND
X1, B AND C  ⊢  Y1 X2  ⊢  B AND C, Y2 Cut
X1, X2  ⊢  Y1, Y2

After transformation to remove the sequents containing the conjunction, we have:

X1, B, C  ⊢  Y1 X2  ⊢  B, Y2 Cut
X1, X2, C  ⊢  Y1, Y2 X2  ⊢  C, Y2 Cut
X1, X2  ⊢  Y1, Y2

Now there are two new cuts, but they are of lower degree than the original. The first one, with cut formula B, is a top cut and can therefore be eliminated. The second, with cut formula C, is not a top cut, because there is the other cut just above it. However, the form of the induction hypothesis is such that we are assured there exists a cut-free proof of X1, X2, C  ⊢  Y1, Y2, so there is a proof in which the cut on C is a top cut, so it too can be eliminated.

Case 3: A is a disjunction B OR C. Similar to case 2.

Case 4: A is an implication B IMP C. Again similar.

Case 5: A is a universal generalisation ALLvBvm. Then the proof before treatment is:

X1, Btm  ⊢  Y1 ALL⊢ X2  ⊢  B, Y2 ⊢ALL
Cut X1, ALLvBvm  ⊢  Y1 X2  ⊢  ALLvBvm, Y2
X1, X2  ⊢  Y1, Y2

where, because of the formation rules defining well-formed formulae, we know that the name m occurs in B but the variable v does not. For what it is worth, we also know that m does not occur in X2 or Y2. As things stand, the cut cannot be pushed above the universal quantifier introduction steps, because  B  and  Btm are not the same formula. Lemma 3 comes to the rescue, however, ensuring that there exists a cut-free proof of the sequent  X2  ⊢  Btm, Y2 Cut can be applied to this and  X1, Btm  ⊢  Y1 yielding a proof of  X1, X2  ⊢  Y1, Y2  with a lower-degree cut as required.

Case 6: A is an existential generalisation SOMEvBvm. Dual to case 5.

There being no more cases, the result stands proved.

1   It is important here that our definition of a formula does not allow free variables. If terms were allowed to contain variables, the definition of substitution would have to be made more complicated in order to avoid terms being "captured" by quantifiers as a side effect. The proof of Lemma 3 would then have to step carefully around this problem.