First order language Expressing generality

First we must be able to name things. In logic a proper name is a simple expression (that is, one we do not further analyse) serving to pick out an individual thing. Examples of proper names include 'Australia', 'Forty-two', 'Real Madrid' and 'René Descartes'. These name respectively a country, a number, a football club and, a dead Frenchman. Note incidentally the use of quotation to form the names of linguistic items. This is a simple enough convention but confusion on the point is common. You should never confuse a name with the thing it names: Australia is not a proper name; it is a country. 'Australia' is the name of that country, and ' 'Australia' ' is the first word of this sentence. So 'Australia' is English, though Australia is certainly not. In our formal notation we use lower case letters from the start of the alphabet, 'a', 'b', 'c'..., for names. So we can name things. Next we need to be able to describe them. For this we use predicates. A predicate is an expression which yields a sentence when appropriately many names are inserted in it. The following are ordinary language examples:

... is larger than New Zealand
... is a multiple of 10
... is larger than ...
... is a multiple of ...
... is between ... and ...

The first two of these require one name to make a sentence, so they are monadic. The next two are dyadic, requiring two names each. The last example is triadic. In logic we use upper case letters to stand for arbitrary predicates in formulas. Traditionally the letters from 'F', 'G', etc. are used for monadic predicates, and 'R', 'S' etc. for dyadic ones, but any upper case letter will do. Each of these predicate symbols is specified as being monadic, dyadic or whatever. This fixed adicity is also a slight idealisation of ordinary usage where predicates like 'are friends' have no special adicity.

With this apparatus and connectives we can already form simple sentences like

Melbourne is a city Cm
Melbourne and Sydney are cities Cm AND Cs
Bob loves Alice Lba
Alice and Bob love each other Lab AND Lba
Alice and Bob love themselves Laa AND Lbb
Alice's love for Bob is not reciprocated Lab AND NOTLba

Note that predicate symbols precede names. So far, however, we cannot generalise. We have no adequate formalisation of 'Everybody loves Lucy' or 'Someone loves Melbourne'. For such cases we need variables and quantifiers.

It would be a mistake to treat expressions like 'someone' and 'everyone' as if they were proper names. Recall Lewis Carroll's fun with 'nobody':

    'Who did you pass on the road?' the King went on, holding out his hand to
the Messenger for some more hay.

    'Nobody,' said the Messenger.

    'Quite right,' said the King: 'this young lady saw him too. So of course
nobody walks slower than you.'

    'I do my best,' the Messenger said in a sullen tone. 'I'm sure nobody
walks much faster than I do!'

    'He can't do that,' said the King, or else he'd have been here first...'

Rather we should think of such expressions as the means of stating what quantity of things of some kind have some property: some of them, all of them, most of them, exactly three of them, none of them, quite a few of them, etc.

In order to express the universal claim that, say, all Vulcans are logicians, we first paraphrase it to the form

Take any Vulcan you please: it is a logician

and then, re-writing the word 'it' as a variable,

Take any Vulcan you please; call it x: x is a logician

or more briefly

For any Vulcan x, x is a logician.

Now a Vulcan x is a thing, x, such that x is a Vulcan. Following our usual notational conventions, we shall write 'x is a Vulcan' and 'x is a logician' as 'Vx' and 'Lx' respectively. So a convenient way of writing the statement that every (or any) Vulcan is a logician is

EVERY(x: Vx) Lx

That is: for EVERY thing x such that Vx, Lx. Here the quantifier is the part up to and including the parentheses, and has three components. First there is a quantity indicator ('EVERY' in the example). Then there is the variable which the quantifier is said to bind. Finally there is the sort indicator, which picks out the set of things over which the variable ranges. The sort indicator 'Vx' is the result of replacing a name in a formula ('Va') by the variable. We shall call such a construction an open formula. Following the quantifier is another open formula using the same variable giving some description of the stated quantity of things of the stated sort. In the example it says of the arbitrary Vulcan x that x is a logician. Naturally, different quantity and sort indicators give different quantifiers:

SOME (x: Lx) Vx Some logicians are Vulcans
MOST (x: Vx) Lx Most Vulcans are logicians
FEW (x: Vx) Ex Few Vulcans are emotional
NO (x: Ex AND Vx) Lx No emotional Vulcans are logicians
TEN (x: Lx) NOT(Vx OR Ex) Ten logicians are neither Vulcans nor emotional

It is convenient to invent short notations for quantity indicators. For instance:

SOME(x: Fx) Gx Some Fs are Gs
ALL(x: Fx) Gx All Fs are Gs
∅(x: Fx) Gx No Fs are Gs
4(x: Fx) Gx Exactly four Fs are Gs
δ(x: Fx) Gx A few Fs are Gs

In what follows we shall be interested mainly in the first two of these, the existential quantifier and the universal quantifier.1 Certain other quantifiers will be introduced later when we have the formal devices necessary to deal with them.

The definition of a formula of first-order or quantificational logic is more elaborate than that given earlier for the propositional calculus because the language is so much richer. First the alphabet must be extended by adding symbols for names, variables and predicates. We define a name to be a lower case letter followed by zero or more "primes"

a, b, c, a' b', c', a'', etc.

The definition of a variable is exactly the same—a lower case letter followed by zero or more primes. Whether the letter is a name or a variable is determined by its position in the formula: if it is bound by a quantifier, then it is a variable; if not, it is a name.

In practice, we shall standardly use letters from the start of the alphabet (a, b, c, ...) as names, and letters from the end of the alphabetbet (x, y, z, ...) as variables, to avoid possible confusion. In principle, however, any letter can function in either role.

A predicate symbol shall be any upper case letter, in principle adorned with a subscript to indicate its adicity (the number of names it needs to make a sentence) and with primes if needed. We generally avoid using the first few letters A,...,E as predicate letters because of our convention of using them as schematic letters to stand for arbitrary formulae, and similarly we avoid the last few X, Y and Z because we use them to stand for sets of formulae. Conventionally we tend to write monadic predicate symbols as F, G, H, etc. and dyadic ones as R, S, T. Nothing turns on this, however, and we may sometimes use other letters for clarity or for mnemonic purposes. We also tend to omit the adicity subscript, as the number of argument places is always clear from the context.

The monadic predicate symbols, then, are

F1, G1, H1, F1', G1', ...

while the dyadic ones are

F2, G2, H2, F2', G2', ...

and so forth. There are even nulladic predicate symbols

F0, G0, H0, F0', G0', ...

which we have already met in propositional logic, where we called them sentence letters and spelled them 'p', 'q', etc.


An important piece of notation which must be introduced at this point is that for representing the result of substituting one expression (a name or a variable, for instance) for another throughout a formula. Where A is a formula containing the name m, we write Avm for the result of putting the variable v in place of all occurrences of m in A. So, e.g., if A is the formula Fa IMP Ga, then Axa is Fx IMP Gx. In general, where e and e' are any two expressions (not necessarily names), and A any larger expression (not necessarily a formula), Ae'e is just like A except that every occurrence of e in it has been replaced by e'. Analogously, where e1, ..., en is a list of n different expressions, and e'1, ..., e'n is a list of n expressions, we may write

Ae'1, ..., e'ne1, ..., en

for the result of simultaneously replacing e1 by e'1 and e2 by e'2 and so on.

This notion of substitution is very simple, but we should note one subtle point: the direction of substitution matters. Suppose a formula B is the result of substituting b for a in another formula A. That is, B = Abain the technical notation. It does not necessarily follow that A is the result of substituting a for b in B. The reason is that one of the two formulae may contain both a and b. Of course, B cannot contain the name a, because all occurrences of a have been turned into b, but it is still possible for A to contain b. For instance, if A is the formula Fa IMP Fb, then B is Fb IMP Fb, which means that Bab is Fa IMP Fa, which is different from A. This point, which may seem a bit pedantic at first, will actually become important when we consider the natural deduction rules for quantifiers. It will matter whether we are substituting a variable for a name or a name for a variable.

Now an atom is just any n-adic predicate symbol followed by n names. The recursive clauses for constructing formulas state that every atom is a formula, that formulas may be compounded under connectives exactly as in the propositional case, and finally that where A and B are formulas containing a name m, and where v is any variable not occurring in either formula, both

ALL (v: Avm) Bvm


SOME (v: Avm) Bvm

are formulas. This clause looks a little complicated, but all it really says is that you can use quantifiers to make compound formulas as in the examples already given, putting variables in place of names as required.

It is worth noting some consequences of the way the clause is worded:

  • Firstly, since v is not allowed to occur in A or in B, no quantifier may occur inside the scope of another quantifier which binds the same variable. Hence no confusion of variable binding is possible. That is, for example, while it is perfectly in order to write   ALL(x: Fx) SOME(y: Gy) Rxy there is no such formula as   ALL(x: Fx) SOME(x: Gx) Rxx.
  • Secondly, the only way of introducing a variable into a formula is by binding it with a quantifier. Hence our formal language has no "free variables". So the only way   Fx   can be a formula is for x to be a name, not a variable.
  • Thirdly, since m is required to occur in both formulas, every quantifier succeeds in binding its variable. Hence we have no "vacuous quantifiers". There is no such formula as   SOME(x: Fa) Gb.

All of these are features, not bugs, in the definition.

To obtain the language in full generality, it is convenient to go a step further and allow several variables to be bound by a quantifier. For example, suppose we want a formula to say that 'smaller' is the opposite of 'larger', we want to say that for any x and y such that x is smaller than y, y is larger than x. Both 'is smaller than' and 'is larger than' are binary relations, so the obvious way to represent them is with two dyadic predicate symbols S and L. That is, we want to write something like

ALL (x,y : Sxy) Lyx

The formation rule providing for this sort of thing is a bit messy, but not really hard to understand: Where A and B are formulas containing names m1 ,...,mn, and where v1,...,vn are any n variables not occurring in either formula, both

ALL (v1, ..., vn:   Av1, ..., vnm1, ..., mn) Bv1, ..., vnm1, ..., mn


SOME (v1, ..., vn:   Av1, ..., vnm1, ..., mn) Bv1, ..., vnm1, ..., mn

are formulas.

1   The notation for these two is standard, 'ALL' and 'SOME' being used by almost everyone. The other symbols ('δ', '4', etc) are not: they were just invented for these notes, to make the point that the theory of quantifiers is more general than the limited set of them normally found in logic textbooks.