7 Cardinality

Now that we know about functions, we return to sets, in particular “measuring their size”. One reason sets were introduced was to “tame” and understand infinity. Indeed, as we will see at the end of course, the study of sets leads us to realise that there are different kinds of infinities.

Bijective functions allowed us to construct a “dictionary” from one set to another: every element in one set is in one-to-one correspondence with the elements in another bijective set. This brings the idea that there’s a fundamental property shared by these two sets. This is the motivation of this section, but first some notation.

Notation:

Let \(A\) be a set, we denoted by \(|A|\) the cardinality of \(A\).

The above notation doesn’t actually define what we mean by the cardinality of a set.

Definition 7.1:

We build our notion of the cardinality of different sets as follows.

  • We define \(|\emptyset| = 0\).

  • For \(n\in\mathbb{Z}_+\) we define \(|\{1,2,\dots,n\}| = n\).

  • If \(A\) and \(B\) are two non-empty sets, we say \(|A| = |B|\) if and only if there exists a bijective \(f:A\to B\).

Again, for an individual infinite set \(A\) this does not define what we mean by \(|A|\), but it does define what we mean by \(|A|=|B|\) for two infinite sets \(A\) and \(B\).

Remarks 7.2:

We need to ensure that our last definition makes sense by checking the following:

  • For any non-empty set \(A\), we expect \(|A|=|A|\). We verify this is true by noting that \(f:A\to A\) defined by \(f(a)=a\) is bijective.

  • For any non-empty sets \(A\), \(B\), we expect that if \(|A|=|B|\) then \(|B|=|A|\). Let us verify this. Suppose \(|A|=|B|\), then there exists a bijective \(f:A\to B\). Since \(f\) is bijective it is invertible, so there exists \(f^{-1}:B\to A\). Since \(f^{-1}\) is invertible, it is bijective, hence \(|B|=|A|\) by definition.

  • For any non-empty sets \(A\), \(B\), \(C\), we expect that if \(|A|=|B|\) and \(|B|=|C|\) then \(|A|=|C|\). Let us verify this. Suppose \(|A|=|B|\) and \(|B|=|C|\), then there exists bijective \(f:A\to B\) and \(g:B\to C\). Consider \((g\circ f):A\to C\), which is bijective by Corollary 6.19, hence \(|A|=|C|\) as required.

Looking forward, we will revisit the three properties above when we look at equivalence relations.

By combining the last two definitions together, we see that if \(f:\{1,2,\dots, n\}\to A\) is bijective, then \(|A|=n\) and we say \(A\) has \(n\) elements. Note that in this case we can enumerate the elements of \(A\) as \(a_1,\dots,a_n\) (where \(a_i=f(i)\)).

Definition 7.3:

We have two cases

  • When \(|A| \in \mathbb{Z}_{\geq 0}\) then we say \(A\) is a finite set.

  • If \(A\) is not a finite set, we say \(A\) is an infinite set.

We take as a fact that \(\mathbb{Z}_+\) is infinite (which makes sense since by the Archimedean Property, \(\mathbb{Z}_+\) is not bounded above).

Etymology:

Finite comes from the Latin finis meaning “end”. A finite set is one that ends. While infinite has the prefix in- meaning “not”, so an infinite set is one that has no end.

Cardinality comes from the Latin cardin meaning “hinges” (which are used to pivot doors). This was used to talk about the “Cardinal virtues”, i.e., the pivotal virtues, the most important one. From there, the word developed to mean “principal”. This was used by mathematician to name the set of whole numbers \(\{0,1,2,3,\dots\}\) to be the Cardinals, as they are the principal numbers (compared to fractions, reals etc.). From there, the cardinality of a set referred to the whole number/the cardinal associated to the set.

Lemma 7.4:

Suppose \(f:X\to Y\) is injective and \(A\subseteq X\). Then \(|A|=|f[A]|\).

Proof.

Let \(B=f[A]\). Restrict \(f\) to \(A\) by defining \(g:A\to B\) via \(g(a)=f(a)\), for all \(a\in A\). By the definition of \(B\), we have that \(B=f[A]=g[A]\), so \(g\) is surjective. Now, suppose that \(a_1,a_2\in A\) are such that \(g(a_1)=g(a_2)\). Then \(f(a_1)=f(a_2)\), and since \(f\) is injective, this means that \(a_1=a_2\). Hence, \(g\) is injective. It follows that \(g\) is bijective, so \(|A|=|g[A]|\). We also know that \(g[A]=B=f[A]\), so \(|A|=|g[A]|=|f[A]|\).

Theorem 7.5:

Let \(A\), \(B\) be two finite sets with \(|A| = n\) and \(|B| = m\). If there exists an injective \(f:A\to B\) then \(n\leq m\).

Proof.

Let \(C = f[A]\), then by the previous lemma we have \(|C|=|A|=n\) and we know \(C\subseteq B\). Hence \(B\) has at least \(n\) elements, so \(m=|B|\geq n\).

Remark:

The contrapositive of the theorem is “If \(|A|>|B|\) then for any \(f:A\to B\) there exist distinct \(a_1,a_2\in A\) such that \(f(a_1)=f(a_2)\)”. Theorem 7.5 is often known as the Pigeonhole Principle as it can be understood as “if you have more pigeons than pigeonholes, then two pigeons will need to share a pigeonhole”.

We extend the above principle to all (finite or infinite) sets.

Definition 7.6:
For any two sets \(A\) and \(B\), if there exists injective \(f:A\to B\) then we say \(|A| \leq |B|\). We say \(|A|<|B|\) if \(|A| \leq |B|\) and \(|A| \neq |B|\).

Note that in particular, for any two sets \(A\) and \(B\) such that \(B\subseteq A\), since we can define an injective function \(f:B\to A\) via \(f(b)=b\) for all \(b\in \mathbb{B}\), we have \(|B|\leq |A|\).

Lemma 7.7:

Let \(A\) and \(B\) be two sets such that \(B\subseteq A\). Then

  1. If \(A\) is finite then \(B\) is finite.

  2. If \(B\) is infinite then \(A\) is infinite.

Proof.

Assume \(|A|=n\). We have \(|B| \leq |A|=n\), i.e. \(|B| = m\) for some \(0\leq m \leq n\). By definition, \(B\) is finite.

Note that part b.) is the contrapositive of part a.)

Theorem 7.8:

If \(X,Y\) are sets with \(|X|\le |Y|\) and \(|Y|\le |X|\) then \(|X|=|Y|\).

In particular, if there exists injective \(f:X\to Y\) and injective \(g:Y\to X\) then there exists bijective \(h:X\to Y\). (Note: we are not claiming \(f\) or \(g\) are bijective).

Proof.

Non-examinable. It is beyond the scope of this course.

Remark:
While we have the tools to prove the theorem in the case when \(X\) or \(Y\) is finite, the proof for when both \(X\) and \(Y\) are infinite is beyond the scope of this course (but see Set Theory in third year). There is an interesting proof of the theorem in the book Algebra by Pierre Grillet, which is available as an electronic book from the University of Bristol library (and in which the theorem is called the Cantor-Bernstein Theorem).
History:

Theorem 7.8 has many names: Cantor-Schröder-Bernstein Theorem; Schröder–Bernstein theorem or (as above) the Cantor-Bernstein theorem. Cantor himself suggested that it should be called the equivalence theorem.

The reasons for so many names is that Georg Cantor (Russian mathematician, 1845-1912) first stated the theorem in 1887, but did not prove it. In 1897, Sergei Bernstein (Ukranian mathematician, 1880-1968) and Ernst Schröder (German mathematician, 1841-1902) independently published different proofs. However, five years later, in 1902, Alwin Korselt (German mathematician, 1864-1947) found a flaw in Schröder’s proof (which was confirmed by Schröder just before he died).

Proposition 7.9:

Let \(X\) be a set and let \(A,B\subseteq X\) be disjoint (\(A\cap B=\emptyset\)) finite sets. Then \(|A\cup B|=|A|+|B|\).

Proof.

Let \(m,n\in\mathbb{Z}_+\) such that \(|A|=m\) and \(|B|=n\). Then \[ A=\{a_1,a_2,\ldots,a_m\}\] where \(a_i=a_j\) only if \(i=j\), for integers \(1\le i,j\le m\). Similarly, we have \[ B=\{b_1,b_2,\ldots,b_n\} \] where \(b_i=b_j\) only if \(i=j\), for integers \(1\le i,j\le n\). Since \(A\cap B=\emptyset\), we know that for any integers \(i,j\) with \(1\le i\le m\) and \(1\le j\le n\), we have \(a_i\not=b_j\). Now, define \(f:\{1,2,\ldots,m+n\}\to A\cup B\) by \[f(k)=\begin{cases} a_k&\text{if $1\le k\le m$,}\\ b_{k-m}&\text{if $m<k\le m+n$.}\end{cases}\] We leave it as an exercise that \(f\) is bijective.

Corollary 7.10:

Let \(X\) be a set and let \(A,B\subseteq X\) be finite sets. Then \(|A\cup B|=|A|+|B|-|A\cap B|\)

Proof.

Exercise.

We will return to the idea of cardinality in Chapter 11, where we see there are different infinite cardinalities.