Chapter 1: Set Theory

Mathematics is fundamentally about studying patterns, structures, quantities, and logical reasoning. In this context, set theory is a part of the foundational language of mathematics, providing an important framework for clearly describing and discussing collections of objects. Understanding sets and their notation is crucial as they form the basis for more complex mathematical structures and reasoning.

Set Basics

Definition of a Set

Definition: A Set

A diagram illustrating a set as a collection of distinct elements ( $x$ , $y$ , $z$ ) within a labeled region, along with a single element ( $w$ ) not in the set.

A set is a well-defined collection of distinct objects, called elements.

Notation: Sets are typically written using curly braces $\{\}$ and denoted by capital letters from the Latin alphabet, such as $A = \{x, y, z\}$.
Well-defined: The objects inside a set, i.e., its elements, must be well-defined, meaning it is always clear whether something belongs to the set or not.
Distinct Elements: A set contains only distinct elements; no duplicates are allowed.
Order Independence: The order of elements in a set does not matter. For example, $\{x, y, z\}$ and $\{z, x, y\}$ represent the same set.

The elements $x$ , $y$ , and $z$ are placeholders and can represent anything — numbers, symbols, objects, or even abstract concepts — as long as they are clearly identifiable.

To better solidify the concept of a set, we now present a few illustrative examples.

Example: A Set

Consider the set of vowels in the English alphabet:

$A = \{a, e, i, o, u\}$

This set clearly lists all the vowels, and it is easy to determine whether a given letter is a vowel or not.

Example: Well-defined

For a set to be meaningful, it must be well-defined. This means it must be clear whether any given object is an element of the set or not. For example, the set of vowels in the word "radio" is well-defined and can be written as:

$A = \{a, i, o\}$

Similarly, the "set of all days last year with temperatures below $0^{\circ}$ C" is well-defined because it is based on objective, measurable data. However, the "set of all cold days last year" is not well-defined because the term "cold" is subjective and can vary from person to person.

Example: Distinct Elements

The set of vowels in the English alphabet is:

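$A = \{a, e, i, o, u\}$

Note that the following is not a correctly written set because it contains a duplicate entry:

$A = \{a, a, e, i, o, u\}$

In sets, each element must be distinct, so duplicates are not allowed. An element either belongs to a set or it does not, so listing $a$ a second time adds nothing.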

Example: Order Independence

Consider two sets containing the vowels in the English alphabet:

$A = \{a, e, i, o, u\} \text{ and } B = \{u, o, i, e, a\}$

These two sets are identical because they contain the same elements, regardless of the order in which the elements are listed. Thus, we can write:

$A = B$

This example illustrates the concept of order independence in sets, where the arrangement of elements does not define the uniqueness of a set.
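The properties above can be tried out with Python's built-in `set` type. This is only an illustrative sketch: the variable names are made up for this example, and Python does not reject a repeated entry but silently collapses it into one.

```python
# Python's built-in set mirrors the properties listed in the definition.
vowels = {"a", "e", "i", "o", "u"}

# Distinct elements: a repeated entry is silently collapsed into one.
with_duplicate = {"a", "a", "e", "i", "o", "u"}

# Order independence: the listing order does not affect equality.
reordered = {"u", "o", "i", "e", "a"}

print(with_duplicate == vowels)  # True
print(reordered == vowels)       # True
print(len(with_duplicate))       # 5
```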

Definition: The Empty Set

The empty set is the unique set that contains no elements. It is denoted by $\emptyset$ or simply $\{\}$.

Even though it has no elements, it plays a key role in set theory, similar to how $0$ plays a key role in arithmetic.

Representing Sets

Sets can be described using various notations, each suited to different contexts. The appropriate notation depends on the nature of the set (finite vs. infinite, discrete vs. continuous) and on how precise the description needs to be. Below, we discuss the most common methods of representing sets, along with their advantages and typical use cases.

Verbal Description

A verbal description uses ordinary language to define a set by explaining its elements or properties. This approach is particularly useful for introducing abstract or unfamiliar sets in an intuitive way or for providing context before formalizing the set with mathematical notation.

Examples: Describing Sets Verbally

"The set of vowels in the English alphabet."
"The set of whole numbers."
"The set of whole numbers strictly smaller than 6."

Roster Form

Roster form explicitly lists all elements of a set enclosed in curly braces $\{\}$. This notation is particularly useful for finite sets or infinite sets with clear, recognizable patterns.

Examples: Describing Sets in Roster Form

The set of vowels in the English alphabet: $\{a, e, i, o, u\}$
The set of whole numbers: $\{0, 1, 2, 3, 4, \dots\}$
The set of whole numbers strictly smaller than 6: $\{0, 1, 2, 3, 4, 5\}$

Note: The ellipsis ( $\dots$ ) indicates that the pattern continues indefinitely.

Set-Builder Notation

Set-builder notation provides a precise and compact way to define a set by specifying the properties that its elements must satisfy. The notation takes one of two equivalent forms:

$\{x \mid \text{condition on } x\} \text{ or } \{x : \text{condition on } x\}$

Both forms are read as "the set of all $x$ such that the given condition holds".

Here, the symbol $x$ represents a generic element of the set, i.e., it does not refer to any particular element but serves as a placeholder for all possible elements that satisfy the condition. The vertical bar ( $\mid$ ) or colon ( $:$ ) functions as a divider between the variable and the rule that determines which elements belong to the set.

For example, the condition might express a numerical restriction such as $x > 0$ (meaning $x$ is strictly greater than zero), a combined relationship like $0 < x < 10$ (meaning $x$ lies strictly between zero and ten), or a membership rule such as $x \in A$ (meaning $x$ is an element of the set $A$ ). In each case, the notation highlights the defining property rather than listing individual elements.

Because of this, set-builder notation is preferred when working with infinite sets, continuous intervals, or sets defined by more complex conditions.

Examples: Describing Sets Using Set-Builder Notation

The set of vowels in the English alphabet: $A = \{x \mid x \text{ is a vowel in the English alphabet}\}$
The set of whole numbers: $B = \{x \mid x \text{ is a whole number}\}$
The set of whole numbers strictly smaller than 6: $C = \{x \mid x \text{ is a whole number and } x < 6\}$
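Set-builder notation has a close analogue in Python's set comprehensions. The sketch below is illustrative only: because a program cannot enumerate an infinite set, it assumes a finite stand-in (`range(100)`) for the whole numbers, and the names `letters`, `A`, and `C` are chosen just for this example.

```python
import string

# Assumed finite source of candidate elements: the 26 lowercase letters.
letters = set(string.ascii_lowercase)

# A = {x | x is a vowel in the English alphabet}
A = {x for x in letters if x in "aeiou"}

# C = {x | x is a whole number and x < 6}
# range(100) is a finite stand-in for the (infinite) set of whole numbers.
C = {x for x in range(100) if x < 6}

print(sorted(A))  # ['a', 'e', 'i', 'o', 'u']
print(sorted(C))  # [0, 1, 2, 3, 4, 5]
```

The part before `for` plays the role of the generic element $x$, and the `if` clause plays the role of the condition after the vertical bar.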

Interval Notation

Certain sets appear so frequently in mathematics that they are assigned dedicated symbols. One of the most fundamental is the set of real numbers, denoted by $\mathbb{R}$. This set, often just called the "reals", includes virtually any number we can think of, such as $2$, $3$, $44993$, $\frac{1}{2}$, $\frac{2}{3}$, $\pi$, $e$, and $\sqrt{2}$.

Geometrically, $\mathbb{R}$ can be visualized as an infinite line, where each point corresponds to a real number. Intervals are contiguous segments of this line, representing subsets of $\mathbb{R}$.

An illustration of the real number line extending from $- \infty$ to $\infty$ , with the origin $0$ at the center. Positive numbers lie to the right of the origin, while negative numbers lie to the left. Every point on the line corresponds to a unique number.

To describe intervals concisely, we use interval notation, which employs square brackets $[\,]$ to indicate inclusive bounds and parentheses $(\,)$ to indicate exclusive bounds.

Below is a summary of how interval notation corresponds to sets of real numbers together with their corresponding set-builder notation.

Set	Interval Notation	Set-Builder Notation
All real numbers	$(- \infty, \infty)$	$\{x \mid x \in \mathbb{R}\}$
Open interval	$(a, b)$	$\{x \mid a < x < b\}$
Closed interval	$[a, b]$	$\{x \mid a \leq x \leq b\}$
Infinite to the right (closed at $a$)	$[a, \infty)$	$\{x \mid x \geq a\}$
Infinite to the right (open at $a$)	$(a, \infty)$	$\{x \mid x > a\}$
Infinite to the left (closed at $b$)	$(- \infty, b]$	$\{x \mid x \leq b\}$
Infinite to the left (open at $b$)	$(- \infty, b)$	$\{x \mid x < b\}$
Half-open (left open)	$(a, b]$	$\{x \mid a < x \leq b\}$
Half-open (right open)	$[a, b)$	$\{x \mid a \leq x < b\}$

Understanding how elements relate to sets is fundamental — not only when defining a single set, but also when comparing and working with multiple sets. In the next section, we explore these relationships in more detail and introduce their formal notation.

Examples: Describing Sets Using Intervals

Real numbers strictly between $0$ and $1$: $(0, 1) = \{x \in \mathbb{R} \mid 0 < x < 1\}$
Real numbers between $2$ and $5$, including both endpoints: $[2, 5] = \{x \in \mathbb{R} \mid 2 \leq x \leq 5\}$
Real numbers greater than $3$: $(3, \infty) = \{x \in \mathbb{R} \mid x > 3\}$
Real numbers less than or equal to $0$: $(- \infty, 0] = \{x \in \mathbb{R} \mid x \leq 0\}$
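Since an interval contains infinitely many real numbers, it cannot be stored element by element, but its defining condition can be checked directly. The following is a minimal sketch; the helper `in_interval` and its parameters are hypothetical names introduced for this example, and `math.inf` stands in for $\infty$.

```python
import math

def in_interval(x, a, b, left_closed, right_closed):
    """Return True if x lies in the interval from a to b.

    left_closed / right_closed say whether the bound is inclusive
    (square bracket) or exclusive (parenthesis).
    """
    above = x >= a if left_closed else x > a
    below = x <= b if right_closed else x < b
    return above and below

print(in_interval(0.5, 0, 1, False, False))         # True:  0.5 is in (0, 1)
print(in_interval(1, 0, 1, False, False))           # False: 1 is not in (0, 1)
print(in_interval(5, 2, 5, True, True))             # True:  5 is in [2, 5]
print(in_interval(3, 3, math.inf, False, False))    # False: 3 is not in (3, inf)
print(in_interval(-7, -math.inf, 0, False, True))   # True:  -7 is in (-inf, 0]
```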

Set Membership

Set membership describes the fundamental relationship between elements and a set. This relationship is crucial for defining and understanding the contents of sets.

Definition: Set Membership

A diagram showing a set $A$ containing an element $x$ but not an element $y$ , illustrating the membership relation.

Let $x$ be an element and $A$ a set. We say that $x$ is an element (or member) of $A$, written $x \in A$, if and only if $x$ belongs to the collection of elements that make up $A$. If an element $w$ is not in $A$, we write $w \notin A$.

Examples: Set Membership

The following examples illustrate how we use the symbols $\in$ (is an element of) and $\notin$ (is not an element of) to describe whether a value belongs to a particular set.

The element $3$ belongs to the set because it appears among its members: $3 \in \{1, 2, 3, 4, 5\}$
The element $6$ does not belong to the set because it is not included among its elements: $6 \notin \{1, 2, 3, 4, 5\}$
The number $\pi$ is a real number, so it belongs to the set of all real numbers: $\pi \in \mathbb{R}$
In this case, the elements of the outer set are themselves sets, so $\{1\}$ is one of its members: $\{1\} \in \{\{1\}, \{2\}, \{3\}\}$
The number $1$ alone is not a member, because the set only contains sets as elements: $1 \notin \{\{1\}, \{2\}, \{3\}\}$
The fraction $\frac{7}{4}$ (equal to $1.75$) is in the interval because $1 \leq \frac{7}{4} < 2$: $\frac{7}{4} \in [1, 2)$
The number $-5$ is not in this interval because it is not positive: $-5 \notin (0, \infty)$

This binary relationship, where each element either belongs to a set or does not, precisely defines a set’s contents and forms the basis for defining equality and more advanced set relations.
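In Python, the membership relation corresponds to the `in` and `not in` operators. The sketch below is illustrative; note one language-specific detail that is not part of the mathematics: a Python set can only contain hashable objects, so a set of sets is written with `frozenset` for the inner sets.

```python
A = {1, 2, 3, 4, 5}

print(3 in A)      # True:  3 is an element of A
print(6 not in A)  # True:  6 is not an element of A

# A set whose elements are themselves sets.
# Inner sets must be frozenset because ordinary sets are not hashable.
nested = {frozenset({1}), frozenset({2}), frozenset({3})}

print(frozenset({1}) in nested)  # True:  the set {1} is a member
print(1 in nested)               # False: the number 1 is not a member
```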

Definition: Equality of Sets

Two sets $A$ and $B$ are equal, denoted $A = B$ , if they contain exactly the same elements. This means every element of $A$ is in $B$ , and every element of $B$ is in $A$ . More formally, in predicate logic, we can write this as:

$A = B \iff \forall x \, (x \in A \iff x \in B)$

Or in plain words: Two sets $A$ and $B$ are equal if and only if every element $x$ belongs to $A$ exactly when it belongs to $B$ .
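For finite sets, the definition can be checked literally by testing membership in both directions. This is only a sketch of the definition; `sets_equal` is a hypothetical helper, and in practice Python's built-in `==` performs the same comparison.

```python
def sets_equal(A, B):
    """Check A = B via the definition: every x in A is in B, and vice versa."""
    return all(x in B for x in A) and all(x in A for x in B)

print(sets_equal({1, 2, 3}, {3, 2, 1}))  # True
print(sets_equal({1, 2}, {1, 2, 3}))     # False: 3 is in B but not in A
print({1, 2, 3} == {3, 2, 1})            # True: the built-in comparison agrees
```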

Cardinality

Definition: Cardinality

The cardinality of a set $A$, written $|A|$, indicates the number of elements in $A$. In other words, the cardinality is the size of $A$.

Examples: Cardinality

If $A = \{a, e, i, o, u\}$, then $|A| = 5$. This means that $A$ is a finite set and has five distinct elements.
If $B = \{1, 2, 3, 4, 5, 6\}$, then $|B| = 6$. This means that $B$ is a finite set and has six distinct elements.
If $C = \{1, 2, 3, \dots\}$, then $|C| = \aleph_{0}$ (aleph-null, the cardinality of any countably infinite set). This means that $C$ is countably infinite, i.e., its elements can be listed one by one in an endless sequence (first 1, then 2, then 3, and so on).
If $D = \mathbb{R}$, then $|D| = \mathfrak{c}$ (the cardinality of the continuum). The set of real numbers is uncountably infinite: it is so large that its elements cannot be listed in any sequence, not even an endless one. In addition, between any two real numbers there are infinitely many others.
If $E = \emptyset$, then $|E| = 0$. The empty set contains no elements, so its cardinality is zero.
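For finite sets, cardinality corresponds to Python's built-in `len`. The short sketch below covers only the finite examples; infinite sets such as $\{1, 2, 3, \dots\}$ cannot be stored as a Python `set`, so their cardinality has no `len` counterpart.

```python
A = {"a", "e", "i", "o", "u"}
B = {1, 2, 3, 4, 5, 6}
E = set()  # the empty set; note that {} alone would create an empty dict

print(len(A), len(B), len(E))  # 5 6 0
```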

Equivalence of Sets

Definition: Equivalence of Sets

Two sets $A$ and $B$ are said to be equivalent, often written $A \sim B$ , if they have the same number of elements.

Formally, $A$ and $B$ are equivalent if there exists a one-to-one correspondence (a bijection) between their elements, i.e., each element of $A$ can be paired with exactly one element of $B$ , and every element of $B$ is matched with exactly one element of $A$ .

In terms of size, this means their cardinalities are equal: $|A| = |B|$

Examples: Equivalence of Sets

Consider the two sets of numbers: $A = \{1, 2, 3\}, \quad B = \{3, 2, 1\}.$

These sets are equal ( $A = B$ ) because they contain exactly the same elements. Equal sets are automatically equivalent as well.

Now consider:

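$C = \{a, b, c\}, \quad D = \{x, y, z\}.$

These sets are not equal, since their contents differ, but they are equivalent ( $C \sim D$ ) because each element of $C$ can be matched with exactly one element of $D$, with no element of $D$ left over (for example, $a \leftrightarrow x$, $b \leftrightarrow y$, $c \leftrightarrow z$).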
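The difference between equality and equivalence can be sketched in Python for finite sets. The helper `equivalent` is a hypothetical name for this example; it compares sizes, which for finite sets is the same as asking whether a bijection exists. The dictionary `pairing` shows one such bijection explicitly (sorting is used only to make the pairing reproducible).

```python
C = {"a", "b", "c"}
D = {"x", "y", "z"}

def equivalent(A, B):
    """Finite sets are equivalent exactly when they have the same size."""
    return len(A) == len(B)

# One explicit one-to-one correspondence between C and D.
pairing = dict(zip(sorted(C), sorted(D)))

print(C == D)            # False: the sets are not equal
print(equivalent(C, D))  # True:  but they are equivalent
print(pairing)           # {'a': 'x', 'b': 'y', 'c': 'z'}
```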

Note: Equivalence in Other Areas of Mathematics

In most cases, we care about equality when we want to check whether two things are exactly the same. But sometimes we only care whether two things share a certain property, for example, having the same size, and that is where equivalence becomes useful.

The same idea appears in many areas of mathematics. Some simple examples are:

Two fractions such as $\frac{1}{2}$ and $\frac{2}{4}$ are equivalent because they represent the same value. Evaluating both gives $0.5$ , but the fractions themselves are written differently.
Two equations like $x + 3 = 5$ and $x = 2$ are equivalent because they have the same solution. The value $x = 2$ satisfies both equations, even though their forms differ.
Two angles are equivalent if they have the same measure, even if they open in different directions.

Two angles with equal measure are equivalent, even if they face different directions.

In geometry, shapes that are the same in size and form, though placed differently, are also considered equivalent through congruence.

Congruent shapes are equivalent because they have the same size and form.

Subsets & Proper Subsets

Subsets and proper subsets describe the relationship between sets in terms of their elements.

Definition: Subset

Diagram illustrating the subset relationship $A \subseteq B$ , where every element of $A$ is also an element of $B$ (including the case $A = B$ ).

Let $A$ and $B$ be sets. We say that $A$ is a subset of $B$ if and only if every element of $A$ is an element of $B$ . We write $A \subseteq B$ to denote the fact that $A$ is a subset of $B$ .

Examples: Subset

Let $S = \{3, 5, 8\}$ and $T = \{5, 3, 8\}$. Since both sets contain the same elements, we have that: $S \subseteq T \text{ and } T \subseteq S$ Therefore, $S$ and $T$ are equal sets, and each is a subset of the other.
Let $S = \emptyset$ and $T = \{5, 3, 8\}$. The empty set contains no elements, so it is a subset of every set: $S \subseteq T$
Let $S = \{\text{red}, \text{blue}\}$ and $T = \{\text{red}, \text{blue}, \text{green}\}$. Every element of $S$ is in $T$, so: $S \subseteq T$ Since $T$ also contains an element that is not in $S$, $S$ is also a proper subset, but we can first identify it as a subset.

Definition: Proper Subset

Diagram illustrating the proper subset relationship $A \subset B$, where $A$ is contained within $B$ but $A \neq B$.

If $A$ is a subset of $B$, but $A$ is not equal to $B$ ( $A \neq B$, meaning $B$ contains at least one element that is not in $A$ ), then $A$ is called a proper subset of $B$, which we denote by $A \subset B$.

If a subset contains all the elements of the original set, it is still considered a subset, but not a proper one.

Examples: Proper Subsets

Let $S = \{\text{red}, \text{blue}\}$ and $T = \{\text{red}, \text{blue}, \text{green}\}$. Every element of $S$ is in $T$, but $T$ has one additional element. Therefore: $S \subset T$ That is, $S$ is a proper subset of $T$.
Similarly, if we let $S = \{3, 5, 8\}$ and $T = \{5, 8, 3, 2, 6\}$, then all elements of $S$ are contained in $T$, but $T$ has additional elements ( $2$ and $6$ ). Hence: $S \subset T$ That is, again, $S$ is a proper subset of $T$.

These examples show that a proper subset always leaves out at least one element of the larger set. For finite sets, this means the proper subset has strictly fewer elements.
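Python's set type supports both relations through comparison operators: `<=` tests for a subset and `<` for a proper subset. The sketch below reuses the colour example; the variable names are chosen for illustration.

```python
S = {"red", "blue"}
T = {"red", "blue", "green"}

print(S <= T)      # True:  S is a subset of T
print(S < T)       # True:  S is a proper subset of T
print(T <= T)      # True:  every set is a subset of itself
print(T < T)       # False: but no set is a proper subset of itself
print(set() <= T)  # True:  the empty set is a subset of every set
```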

The Universe and Set Complement

The universe and set complement relate to what is not contained in a given set. Before exploring complements, we must first understand the universe that defines the context for all sets.

Definition: The Universe

A diagram showing the universe $U$ as a large rectangular region containing all relevant elements, with a set $A$ represented as a subset inside it.

The universe, often denoted as $U$ , refers to a set that contains all the objects or elements relevant to a particular discussion or problem. It serves as the context within which all other sets are defined and interpreted.

Examples: The Universe

Let the universe be the set of all lowercase English letters: $U = \{a, b, c, \dots, z\}.$ Then we can define:
- $A = \{x \mid x \text{ is a vowel}\}$, the set of vowels.
- $B = \{x \mid x \text{ is a consonant}\}$, the set of consonants.
Here, $U$ provides a clear context: $A$ and $B$ together cover all letters in the alphabet.
Let the universe be the set of all real numbers: $U = \mathbb{R}.$ Then we can define:
- $A = (0, 1)$. The set of all real numbers strictly between $0$ and $1$.
- $B = [2, 5]$. The set of all real numbers between $2$ and $5$, including the endpoints.
- $C = (3, \infty)$. The set of all real numbers greater than $3$.
In this case, $U$ defines the entire number line, and each of these sets represents a subset of it.

Definition: Set Difference

Diagram showing two overlapping sets $A$ and $B$ , with $A ∖ B$ highlighted to indicate elements in $A$ that are not in $B$ .

The set difference of two sets $A$ and $B$ , denoted as $A ∖ B$ , is the set of all elements that are in $A$ but not in $B$ .

$A ∖ B = \{x \in A \mid x \notin B\}$

In other words, it removes from $A$ all elements that also belong to $B$ .

Examples: Set Difference

Let $A = \{a, e, i, o, u\}$ and $B = \{a, b, c, i\}$. Then: $A ∖ B = \{e, o, u\}.$ These are the vowels that are not in the set $B$.
Let $A = \{1, 2, 3\}$ and $B = \{2, 4\}$. The elements in $A$ that are not in $B$ are: $A ∖ B = \{1, 3\}.$
Let $A = \{\text{apple}, \text{banana}, \text{pear}\}$ and $B = \{\text{pear}, \text{peach}\}$. The fruits in $A$ that are not in $B$ are: $A ∖ B = \{\text{apple}, \text{banana}\}.$
Let $A = \{1, 2\}$ and $B = \{1, 2\}$. Since $A$ and $B$ contain the same elements, we have that: $A ∖ B = \emptyset.$ That is, the difference is the empty set because there is nothing in $A$ that is not in $B$.
Let $A = [0, 1]$ and $B = (0, 1)$. Then: $A ∖ B = \{0, 1\}.$ The difference consists of the endpoints of the closed interval $[0, 1]$ that are not part of the open interval $(0, 1)$.

These examples show that the set difference identifies what belongs only to the first set and not to the second.
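For finite sets, the set difference corresponds to Python's `-` operator (equivalently, the `difference` method). The sketch below reproduces some of the finite examples; `sorted` is used only so that the printed order is predictable.

```python
A = {"a", "e", "i", "o", "u"}
B = {"a", "b", "c", "i"}
print(sorted(A - B))                  # ['e', 'o', 'u']

print(sorted({1, 2, 3} - {2, 4}))     # [1, 3]

# Removing a set from itself leaves the empty set.
print({1, 2} - {1, 2} == set())       # True

# The same operation written as a method call.
print(sorted(A.difference(B)))        # ['e', 'o', 'u']
```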

Now that we understand how to subtract one set from another using set difference, we can define the complement of a set as a special case, subtracting from the universe.

Definition: Set Complement

Diagram illustrating the complement $A^{'}$ of a set $A$ within the universe $U$ , highlighting all elements in $U$ that are not in $A$ .

The complement of a set $A$ , denoted as $A^{'}$ , consists of all elements in the universe $U$ that are not in $A$ . In other words:

$A^{'} = U ∖ A$

The complement provides a way to discuss what is not included in a set within the context of a given universe.

Examples: Set Complement

Let $U = \{1, 2, 3, 4, 5\}$ and $A = \{1, 2, 3\}$. The complement of $A$ is: $A^{'} = U ∖ A = \{4, 5\}$ Here, $A^{'}$ contains the elements of $U$ that are not in $A$.
Let $A = \{a, e, i, o, u\}$ be the set of vowels in the English alphabet. If the universe $U$ is the set of all lowercase letters, then the complement $A^{'}$ is: $A^{'} = U ∖ A = \{\text{all consonants in the English alphabet}\}$ Here, the complement is expressed verbally to save space, but we could list all consonants explicitly if desired.
Let $U$ be a standard deck of playing cards, and let $A$ be the set of all spades. The complement $A^{'}$ is: $A^{'} = U ∖ A = \{\text{all hearts, diamonds, and clubs}\}.$ In this context, $A^{'}$ represents every card that is not a spade.
Let $U = \mathbb{R}$ and $A = \{x \in \mathbb{R} \mid x > 10\}$. The complement of $A$ is: $A^{'} = \{x \in \mathbb{R} \mid x \leq 10\}$ This means $A^{'}$ contains all real numbers less than or equal to 10.

These examples illustrate how the complement operation identifies everything outside a given set, relative to a specified universe $U$ .
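Python has no built-in complement operator, because a complement only makes sense relative to a universe. A minimal sketch, assuming a finite universe passed in explicitly (the function name `complement` is hypothetical), is:

```python
import string

def complement(A, U):
    """Return the complement of A relative to the universe U (that is, U minus A)."""
    if not A <= U:
        raise ValueError("A must be a subset of the universe U")
    return U - A

U = {1, 2, 3, 4, 5}
A = {1, 2, 3}
print(sorted(complement(A, U)))  # [4, 5]

# Universe of lowercase letters: the complement of the vowels is the consonants.
letters = set(string.ascii_lowercase)
vowels = {"a", "e", "i", "o", "u"}
consonants = complement(vowels, letters)
print(len(consonants))           # 21
```

Requiring the universe as an argument makes the dependence on $U$ explicit: the same set $A$ has a different complement in a different universe.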

Special Number Sets

As mentioned earlier, certain sets of numbers frequently appear and are often represented by special symbols. These sets range from the most basic counting numbers to the broadest set of real numbers.

Below is a summary of some of the fundamental number sets you will encounter, along with their common notations, definitions, and examples.

Fundamental Number Sets

Symbol	Name	Definition / Description
$\mathbb{N}$	Natural numbers	$\{1, 2, 3, \dots\}$ The counting numbers.
$\mathbb{N}_{0}$	Natural numbers with zero	$\{0, 1, 2, 3, \dots\}$ The natural numbers including $0$.
$\mathbb{Z}$	Integers	$\{\dots, -2, -1, 0, 1, 2, \dots\}$ The natural numbers, their negatives, and zero.
$\mathbb{Q}$	Rational numbers	$\{\frac{a}{b} \mid a, b \in \mathbb{Z}, b \neq 0\}$ The numbers that can be written as fractions of integers.
$\mathbb{I}$	Irrational numbers	$\{x \in \mathbb{R} \mid x \notin \mathbb{Q}\}$ The real numbers that cannot be written as fractions of integers (e.g., $\pi$, $\sqrt{2}$, $e$).
$\mathbb{R}$	Real numbers	All numbers on the continuous number line. Includes both $\mathbb{Q}$ and $\mathbb{I}$.
$\mathbb{C}$	Complex numbers	$\{a + bi \mid a, b \in \mathbb{R}, i^{2} = -1\}$ Numbers with a real part $a$ and an imaginary part $b$. Includes all real numbers as a subset.

These number sets are not isolated; rather, they form a natural hierarchy where smaller sets are contained within larger ones. The diagram below shows how these sets nest inside one another: all natural numbers are integers, all integers are rational numbers, and all rational numbers are real numbers.

Although irrational numbers are not shown explicitly, they form the part of the real numbers that lies outside the rationals.

Diagram illustrating the nested hierarchy of fundamental number sets, with examples of elements in each set.

The hierarchy can also be expressed symbolically as:

$\mathbb{N} \subset \mathbb{N}_{0} \subset \mathbb{Z} \subset \mathbb{Q} \subset \mathbb{R} \subset \mathbb{C}$

Understanding this hierarchy helps clarify how different number systems extend one another, expanding the kinds of quantities we can represent and reason about.

Additional Set Operations

A few other foundational set operations are commonly used in mathematics and, by extension, in data science. While we will not cover these in detail in this course, the table below provides a brief overview. You will most likely encounter and work with these operations throughout your data science degree.

Symbol	Operation	Description
$A \cap B$	Intersection of $A$ and $B$	The set of all elements that are in both $A$ and $B$
$A \cup B$	Union of $A$ and $B$	The set of all elements that are in $A$ , in $B$ , or in both
$A \times B$	Cartesian product of $A$ and $B$	The set of all ordered pairs $(a, b)$ where $a \in A$ and $b \in B$
$\mathcal{P}(A)$	Power set of $A$	The set of all subsets of $A$, including the empty set ( $\emptyset$ ) and $A$ itself
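For small finite sets, all four operations in the table can be tried out in Python. The sketch below uses the built-in `&` and `|` operators and the standard-library `itertools` module; `power_set` is a hypothetical helper written for this example, and the sets `A` and `B` are arbitrary illustrative values.

```python
from itertools import combinations, product

A = {1, 2, 3}
B = {3, 4}

print(sorted(A & B))  # [3]           intersection
print(sorted(A | B))  # [1, 2, 3, 4]  union

# Cartesian product: all ordered pairs (a, b) with a in A and b in B.
cartesian = set(product(A, B))
print(len(cartesian))       # 6  (3 choices for a times 2 choices for b)
print((1, 3) in cartesian)  # True

def power_set(s):
    """Return the set of all subsets of s, each stored as a frozenset."""
    items = list(s)
    return {
        frozenset(subset)
        for size in range(len(items) + 1)
        for subset in combinations(items, size)
    }

print(len(power_set(A)))    # 8  (a set with 3 elements has 2**3 subsets)
```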

Chapter Exercises

Exercise Set: 1

Write the following sets in roster form (if possible). If it is not possible, explain why.

The set of the first five positive even whole numbers.
The real numbers in the interval $(2, 3)$
The set of whole numbers less than 6.
The set of whole numbers that are considered very small.
The set of letters in the word "banana".
The set of even numbers that are typically interesting.
The real numbers in the interval $[2, 2]$

Use set-builder notation to describe the following sets:

$\{1, 2, 3, 4, 5, 6, 7\}$
$\{1, 10, 100, 1000, 10000\}$
$\{1, \frac{1}{2}, \frac{1}{3}, \frac{1}{4}, \frac{1}{5}, \dots\}$
$[7, 7]$ , $(7, 7)$ , $(7, 7]$ and $[7, 7)$

Use interval notation to describe the following sets:

The set of all real numbers between 2 and 5, including both endpoints.
The set of all real numbers strictly greater than $- 1$ .
The set of all real numbers less than or equal to $4$ .
The set of all real numbers greater than $0$ and less than or equal to $10$ .
The empty set (in terms of intervals).

Exercise Set: 2

For each pair of sets below, determine whether they are (a) equal, (b) equivalent but not equal, (c) neither equal nor equivalent.

Let $P = \{1, 2, 3, 4\}$ and $Q = \{3, 2, 1, 4\}$
Let $P = \{1, 2, 3\}$ and $Q = \{a, b, c\}$
Let $P = \{\{1, 2\}, \{1, 3\}\}$ and $Q = \{\{1, 2\}, \{3, 1\}\}$
Let $P = \{\}$ and $Q = \{\emptyset\}$
Let $P = \{\{1, 2\}\}$ and $Q = \{\{1, 2\}, \{2, 1\}\}$

Let $X = \{0, 1, 2\}$, $Y = \{1, 2\}$, and $Z = \{1, 2, 3\}$. State whether each is true, false, or meaningless:

$1 \in X$
$\{1\} \in X$
$Y \subseteq X$
$X \subseteq Z$
$\emptyset \in Z$
$\emptyset \subseteq Z$

Exercise Set: 3

List all elements of the following sets:

$\{x \in \mathbb{N}_{0} \mid x \leq 5\}$
$\{x \in \mathbb{R} \mid x^{2} = 16\}$
$\{x \in \mathbb{Z} \mid -2 \leq x \leq 2\}$
$\{x \in \mathbb{Z} \mid x = x + 1\}$

Let $U = \{1, 2, 3, 4, 5, 6, 7, 8\}$, $A = \{2, 4, 6\}$, and $B = \{1, 2, 3, 4\}$. Find each of the following sets:

$A ∖ B$
$B ∖ A$
$A^{'}$
$B^{'}$
$(A ∖ B) ∖ A^{'}$

Mathematics Brush-up for Data Science