1 Language and Logic
1.1 Mathematics Is a Language
Mathematics is the language of natural-scientific model building. For example, the expression \[\begin{equation} F = ma \end{equation}\] corresponds, in the sense of Newton’s second axiom, to a theory of the motion of objects under the action of forces (Newton (1687)). Likewise, the expression \[\begin{equation} \max_{q(z)} \int q(z) \ln \left(\frac{p(y,z)}{q(z)}\right)\,dz \end{equation}\] corresponds, in the sense of variational inference, to contemporary theory about how the brain works (Friston (2005)). Mathematical symbolism serves in particular to communicate scientific insights precisely and aims to describe complex matters exactly and efficiently. As in reflective engagement with any form of language, the question “What is this supposed to mean?” therefore always remains the guiding question when working with mathematical content and symbolism.
As a language system, mathematics has a number of special features. First, its contents are often abstract. This is because mathematics strives for the broadest possible general comprehensibility and applicability. Mathematical approaches to the phenomena of the world are interested in making insights as easy as possible to transfer to other contexts. To make this possible, mathematics tries to work as precisely and understandably as possible, that is, in terms of precise concepts. It proceeds in a strictly hierarchical way, so that concepts introduced later often presuppose a good understanding of the underlying concepts introduced earlier.
The precision of mathematical language implies a high density of information. It is therefore rather sober and leaves out what is superfluous, so that, in the best case, everything in a mathematical text is relevant for communicating an idea. As recipients of mathematical texts, we perceive their information density through the high cognitive energy required when reading such a text. This high energy expenditure calls in particular for calm and slowness when reading with the aim of genuine understanding. The following quotation may serve as a guiding principle for working with mathematical texts: “A mathematical text cannot be read like a novel; one has to work through it” (Unger (2000)). After reading a short mathematical text, one should always ask critically whether one has really understood what has been read or whether further sources should be consulted to clarify the matter. It is also helpful, in the spirit of Richard Feynman’s famous quotation “What I cannot create, I do not understand”, to make one’s own notes and to reconstruct mathematical language systems oneself.
If one wants to gain access to the world of natural-scientific model building, then it is helpful, when dealing with its mathematical expressions and symbolism, to use the same strategies as when learning a foreign language. In addition to immersing oneself in the corresponding language space, that is, constant exposure to mathematical modes of expression, this certainly includes first memorizing concepts and translating texts into everyday language. A deep and secure understanding of mathematical model building then arises in particular through the application of mathematical ways of thinking in written and spoken form.
1.2 Basic Building Blocks
In the following, we introduce three basic building blocks of mathematical communication that will accompany us throughout: the concepts of a definition, a theorem, and a proof.
Definition
A definition is a cornerstone of a mathematical system that is neither justified nor deductively derived within that system. Definitions can only be evaluated according to their usefulness within a mathematical system. A definition is best memorized first and questioned only once one has understood its use in application or is not convinced by it. Some relaxation and calm are generally helpful when dealing with definitions that appear complex at first sight. To indicate that we define a symbol as something, we use the notation “\(:=\)”. For example, the expression “\(a := 2\)” defines the symbol \(a\) as the number two. In this text, definitions always end with the symbol \(\bullet\).
Theorem
A theorem is a mathematical statement that can be recognized as correct (true) by means of a proof. That is, a theorem is always derived from definitions and/or other theorems. In this sense, theorems are the empirical results of mathematics. In applied, data-analytic mathematics, theorems are often helpful for calculations. It is therefore worth memorizing them, because they usually form the basis for data analysis and data interpretation. Equations often appear in theorems. These equations arise from the assumptions of the theorem. To mark equations, we use the equals sign “\(=\)”. For example, in a given context, the expression “\(a = 2\)” says that, because of certain assumptions, the variable \(a\) has the value two. In this text, theorems always end with the symbol \(\circ\).
Proof
A proof is a chain of argumentation that draws on known definitions and theorems to establish the correctness (truth) of a theorem. Short proofs often contribute to understanding a theorem; long proofs tend to do so less. Proofs are therefore, in particular, answers to the question why a mathematical statement holds (“Why is this so?”). Proofs are not memorized. When proofs are short, it is sensible to work through them, because they usually repeat content that is assumed to be known. When they are long, it is more sensible to skip them at first so as not to get lost in details and lose the main path through the mathematical structure. In this text, proofs always end with the symbol \(\Box\).
In addition to definitions, theorems, and proofs, mathematical texts contain other typical building blocks, such as axioms, lemmas, corollaries, and conjectures. We will not use these terms and therefore give only a brief overview of them.
- Axioms are unproved fundamental assumptions in the sense that they serve as starting points for building mathematical systems. The transition between definitions and axioms is often fluid. Because we will not work at a particularly deep mathematical level, we prefer the term definition in almost all cases.
- A lemma is an “auxiliary theorem”, that is, a mathematical statement that is proved but is not as important as a theorem. Because, on the one hand, we focus on important content and, on the other hand, we do not want to discriminate among mathematical statements, we avoid this term and instead use the term theorem throughout.
- A corollary is a mathematical statement that follows from a theorem by a simple proof. Because the “simplicity” of mathematical proofs is a relative property, we avoid this term and instead also use the term theorem throughout.
- Conjectures are mathematical statements for which it is unknown whether they are provable or refutable. Because we work in the area of applied mathematics, we will not encounter conjectures.
1.3 Propositional Logic
Having now become acquainted with some basic building blocks of mathematical language systems, we turn to propositional logic, a simple system that allows relationships between mathematical statements to be established and formalized. Propositional logic plays a central role, for example, in the definition of set operations, in optimization conditions for functions, and in many proofs. In data-analytic applications, propositional logic is the basis of Boolean logic in programming. In mathematical psychology, propositional logic is the basis of the representational theory of measurement.
We begin with the definition of the concept of a mathematical statement.
Definition 1.1 (Statement) A statement is a sentence to which the property true or false can be assigned unambiguously.
The adjective true can also be understood as correct. We abbreviate true by “t” and false by “f”. In the field of real numbers, for example, the statement \(1 + 1 = 2\) is true and the statement \(1 + 1 = 3\) is false. Note that the binarity of the truth value of statements is a basic assumption of propositional logic and therefore to be understood formally and scientifically, not empirically. Truth values do not refer to definitions; definitions fix meanings.
A first way of working with statements is to negate them. This leads to the following definition.
Definition 1.2 (Negation) Let \(A\) be a statement. Then the negation of \(A\) is the statement that is false when \(A\) is true and true when \(A\) is false. The negation of \(A\) is denoted by \(\neg A\), read as “not \(A\)”.
For example, the negation of the statement “The sun is shining” is the statement “The sun is not shining”. The negation of the statement \(1 + 1 = 2\) is the statement \(1 + 1 \neq 2\), and the negation of the statement \(x > 1\) is the statement \(x \le 1\). The definition of the negation of a statement \(A\) is represented in tabular form as follows:
| \(A\) | \(\neg A\) |
|---|---|
| t | f |
| f | t |
Tables of this form are called truth tables. They are a popular tool in propositional logic, and we will use them often in the following.
If one wants to connect two statements logically, the concepts of conjunction and disjunction are available.
Definition 1.3 (Conjunction) Let \(A\) and \(B\) be statements. Then the conjunction of \(A\) and \(B\) is the statement that is true if and only if both \(A\) and \(B\) are true. The conjunction of \(A\) and \(B\) is denoted by \(A \land B\), read as “\(A\) and \(B\)”.
The definition of conjunction implies the following truth table:
| \(A\) | \(B\) | \(A \land B\) |
|---|---|---|
| t | t | t |
| t | f | f |
| f | t | f |
| f | f | f |
As an example, let \(A\) be the statement \(2 \ge 1\) and let \(B\) be the statement \(2 > 1\). Since both \(A\) and \(B\) are true, the statement \((2 \ge 1) \land (2 > 1)\) is also true. As another example, let \(A\) be the statement \(1 \ge 1\) and let \(B\) be the statement \(1 > 1\). Here, only \(A\) is true and \(B\) is false. Thus the statement \((1 \ge 1) \land (1 > 1)\) is false.
Definition 1.4 (Disjunction) Let \(A\) and \(B\) be statements. Then the disjunction of \(A\) and \(B\) is the statement that is true if and only if at least one of the two statements \(A\) and \(B\) is true. The disjunction of \(A\) and \(B\) is denoted by \(A \lor B\), read as “\(A\) or \(B\)”.
The definition of disjunction implies the following truth table:
| \(A\) | \(B\) | \(A \lor B\) |
|---|---|---|
| t | t | t |
| t | f | t |
| f | t | t |
| f | f | f |
\(A \lor B\) is therefore also true when both \(A\) and \(B\) are true. The “or” considered here is thus more precisely an “and/or”. For this reason, disjunction is also called a “non-exclusive or”. As an example, let \(A\) be the statement \(2 \ge 1\) and let \(B\) be the statement \(2 > 1\). \(A\) is true and \(B\) is true. Thus the statement \((2 \ge 1) \lor (2 > 1)\) is true. Now again let \(A\) be the statement \(1 \ge 1\) and let \(B\) be the statement \(1 > 1\). Then \(A\) is true and \(B\) is false. Thus the statement \((1 \ge 1) \lor (1 > 1)\) is true.
One way of placing statements in a mechanical logical relationship is implication. It is defined as follows.
Definition 1.5 (Implication) Let \(A\) and \(B\) be statements. Then the implication, denoted by \(A \Rightarrow B\), is the statement that is false if and only if \(A\) is true and \(B\) is false. \(A\) is called the assumption (premise) and \(B\) the conclusion of the implication. \(A \Rightarrow B\) is read as “from \(A\) follows \(B\)”, “\(A\) implies \(B\)”, or “if \(A\), then \(B\)”.
The definition of implication can be represented by the following truth table:
| \(A\) | \(B\) | \(A \Rightarrow B\) |
|---|---|---|
| t | t | t |
| t | f | f |
| f | t | t |
| f | f | t |
An intuitive understanding of the definition of implication in the sense of the truth table above is obtained most readily by reading it as an attempt to capture and formalize the intuitive idea of a consequence in the context of propositional logic. To understand this, it is best to read the truth states in Table 1.4 in the order truth state of \(A\), truth state of \(A \Rightarrow B\), and finally truth state of \(B\). Read in this way, the truth table shows that if \(A\) is true and \(A \Rightarrow B\) is true, then \(B\) is true. Thus, if one constructs a true implication based on a true statement, for example by transforming equations, it follows that \(B\) is also true. If this is not possible, that is, if \(A\) is true and \(A \Rightarrow B\) is always false, then \(B\) is also false. This is how statements may be refuted. Finally, one sees that if \(A\) is false and \(A \Rightarrow B\) is true, then \(B\) can be either true or false. Only from a true premise does a true implication always yield a true conclusion. In particular, the definition of implication therefore satisfies the principle “anything follows from falsehood (ex falso sequitur quodlibet)”. Thus one cannot infer anything meaningful from false statements by means of implication.
In the context of implication, the concepts of sufficient and necessary conditions arise: If \(A \Rightarrow B\) is true, one says that “\(A\) is sufficient for \(B\)” and that “\(B\) is necessary for \(A\)”. This terminology is explained in the context of implication as follows: If \(A \Rightarrow B\) is true, then if \(A\) is true, \(B\) is also true. The truth of \(A\) is therefore enough for the truth of \(B\). Thus \(A\) is sufficient for \(B\). Furthermore, if \(A \Rightarrow B\) is true, then if \(B\) is false, \(A\) is also false. The truth of \(B\) is therefore necessary for the truth of \(A\).
Finally, to illustrate Table 1.4, we give a concrete everyday example. Let \(A\) be the statement “The bridge is damaged”, let \(B\) be the statement “The bridge is closed”, and let \(A \Rightarrow B\) be the implication “If the bridge is damaged, then the bridge is closed”. We discuss the four cases of Table 1.4.
- \(A\) true, \(B\) true, \(A \Rightarrow B\) true. If \(A\) is true, then the bridge is damaged, and if \(B\) is also true, then the bridge is also closed. In this case, the implication “If the bridge is damaged, then the bridge is closed” is therefore obviously true.
- \(A\) true, \(B\) false, \(A \Rightarrow B\) false. If \(A\) is true, then the bridge is damaged, and if \(B\) is false, then the bridge is not closed. In this case, the implication “If the bridge is damaged, then the bridge is closed” is therefore obviously false.
- \(A\) false, \(B\) true, \(A \Rightarrow B\) true. If \(A\) is false, then the bridge is not damaged, and if \(B\) is true, then the bridge is closed. Since the bridge is not damaged, however, no conclusion can be drawn regarding the implication “If the bridge is damaged, then the bridge is closed”; in particular, one cannot show that it would be false. Thus one assumes that the statement “If the bridge is damaged, then the bridge is closed” is true.
- \(A\) false, \(B\) false, \(A \Rightarrow B\) true. If \(A\) is false, then the bridge is not damaged, and if \(B\) is false, then the bridge is not closed. Since the bridge is not damaged, however, no conclusion can be drawn regarding the implication “If the bridge is damaged, then the bridge is closed”; in particular, one cannot show that it would be false. Thus one again assumes that the statement “If the bridge is damaged, then the bridge is closed” is true.
A very frequently occurring relation between two statements is their equivalence.
Definition 1.6 (Equivalence) Let \(A\) and \(B\) be statements. The equivalence of \(A\) and \(B\) is the statement that is true if and only if \(A\) and \(B\) are both true or if \(A\) and \(B\) are both false. The equivalence of \(A\) and \(B\) is denoted by \(A \Leftrightarrow B\) and read as “\(A\) if and only if \(B\)” or “\(A\) is equivalent to \(B\)”.
The definition of equivalence implies the following truth table:
| \(A\) | \(B\) | \(A \Leftrightarrow B\) |
|---|---|---|
| t | t | t |
| t | f | f |
| f | t | f |
| f | f | t |
The definition of the concept of logical equivalence allows, among other things, the equivalence of two statements to be established by means of implications.
Definition 1.7 (Logical Equivalence) Two statements are called logically equivalent if their truth tables are the same.
As examples of logical equivalences that are frequently used in proof arguments, we show the statements of the following theorem.
Theorem 1.1 (Logical Equivalences) Let \(A\) and \(B\) be two statements. Then the following statements are logically equivalent:
- \(A \Leftrightarrow B\) and \((A \Rightarrow B) \land (B \Rightarrow A)\)
- \(A \Rightarrow B\) and \((\neg B) \Rightarrow (\neg A)\)
Proof. By the definition of logical equivalence, we have to show that the truth tables of the statements under consideration are the same. We show first (1), then (2).
(1) We recall the truth table of \(A \Leftrightarrow B\):
| \(A\) | \(B\) | \(A \Leftrightarrow B\) |
|---|---|---|
| t | t | t |
| t | f | f |
| f | t | f |
| f | f | t |
We further consider the truth table of \((A \Rightarrow B) \land (B \Rightarrow A)\):
| \(A\) | \(B\) | \(A \Rightarrow B\) | \(B \Rightarrow A\) | \((A \Rightarrow B) \land (B \Rightarrow A)\) |
|---|---|---|---|---|
| t | t | t | t | t |
| t | f | f | t | f |
| f | t | t | f | f |
| f | f | t | t | t |
Comparing the truth table of \(A \Leftrightarrow B\) with the first two and the last column of the truth table of \((A \Rightarrow B) \land (B \Rightarrow A)\) shows their equality.
(2) We recall the truth table of \(A \Rightarrow B\):
| \(A\) | \(B\) | \(A \Rightarrow B\) |
|---|---|---|
| t | t | t |
| t | f | f |
| f | t | t |
| f | f | t |
We further consider the truth table of \((\neg B) \Rightarrow (\neg A)\):
| \(A\) | \(B\) | \(\neg B\) | \(\neg A\) | \((\neg B) \Rightarrow (\neg A)\) |
|---|---|---|---|---|
| t | t | f | f | t |
| t | f | t | f | f |
| f | t | f | t | t |
| f | f | t | t | t |
Comparing the truth table of \(A \Rightarrow B\) with the first two and the last column of the truth table of \((\neg B) \Rightarrow (\neg A)\) shows their equality.
The first statement of Theorem 1.1 says that the statement “\(A\) and \(B\) are equivalent” is logically equivalent to the statement “from \(A\) follows \(B\) and from \(B\) follows \(A\)”. This is the basis for many so-called direct proofs using equivalence transformations. The second statement of Theorem 1.1 says that the statement “from \(A\) follows \(B\)” is logically equivalent to the statement “from not \(B\) follows not \(A\)”. This is the basis for the technique of indirect proof. We consider these proof techniques more closely in the following section. First, we summarize the meanings of the symbols introduced in this section once more in the table below.
| Symbol | Meaning | Note |
|---|---|---|
| \(\neg A\) | Not \(A\) | True when \(A\) is false, and vice versa |
| \(A \land B\) | \(A\) and \(B\) | True only when both \(A\) and \(B\) are true |
| \(A \lor B\) | \(A\) and/or \(B\) | True when at least one of the statements is true |
| \(A \Rightarrow B\) | From \(A\) follows \(B\) | \(B\) is necessary for \(A\), \(A\) is sufficient for \(B\) |
| \(A \Leftrightarrow B\) | \(A\) is equivalent to \(B\) | Both \(A \Rightarrow B\) and \(B \Rightarrow A\) hold |
1.4 Equivalence Transformations
Mathematical problems often lead to the case in which information about an unknown variable is represented implicitly by means of an equation or an inequality. To represent the information about the corresponding variable explicitly, that is, to determine its value in the case of equations or to determine the range of numbers in which the value of the variable lies in the case of inequalities, one uses equivalence transformations. Here, we consider only the equivalence transformations of equations and inequalities with respect to real variables that are familiar from school mathematics. Equivalence transformations of equations and inequalities have the property that the truth value of the statement formulated by an equation or inequality does not change when the corresponding transformation is applied. Both sides of the equation or inequality are always transformed. For the application of a mathematical operation that transforms an equation or inequality statement \(A\) into an equation or inequality statement \(B\) to be an equivalence transformation of the form \(A \Leftrightarrow B\), both \(A \Rightarrow B\) and \(B \Rightarrow A\) must hold, as is well known. This implies that equivalence transformations are reversible (invertible) operations.
For equations, admissible equivalence transformations include in particular
- adding a number to both sides of the equation,
- subtracting a number from both sides of the equation,
- multiplying both sides of the equation by a nonzero number,
- dividing both sides of the equation by a nonzero number, and
- applying an invertible function to both sides of the equation.
Example
Anticipating Section 4.3, we consider the statement \[\begin{equation} 2 \exp(x) - 2 = 0. \end{equation}\] Then \[\begin{align} \begin{split} 2 \exp(x) - 2 & = 0 \\ \Leftrightarrow 2 \exp(x) - 2 + 2 & = + 2 \\ \Leftrightarrow 2 \exp(x) & = 2 \\ \Leftrightarrow \frac{1}{2} \cdot 2 \exp(x) & = \frac{1}{2} \cdot 2 \\ \Leftrightarrow \exp(x) & = 1 \\ \Leftrightarrow \ln(\exp(x)) & = \ln(1) \\ \Leftrightarrow x & = 0. \\ \end{split} \end{align}\] In summary, therefore, \[\begin{equation} 2 \exp(x) - 2 = 0 \Leftrightarrow x = 0. \end{equation}\]
For inequalities, admissible equivalence transformations include in particular
- adding a number to both sides of the inequality,
- subtracting a number from both sides of the inequality,
- multiplying both sides of the inequality by a nonzero number, with multiplication by a negative number reversing the inequality sign,
- dividing both sides of the inequality by a nonzero number, with division by a negative number reversing the inequality sign, and
- applying an invertible monotonically increasing function to both sides of the inequality; for monotonically decreasing functions, the inequality sign is reversed.
Example
We consider the statement \[\begin{equation} -5 x - 2 \ge 8. \end{equation}\] Then \[\begin{align} \begin{split} -5x - 2 & \ge 8 \\ \Leftrightarrow -5x - 2 + 2 & \ge 8 + 2 \\ \Leftrightarrow -5x & \ge 10 \\ \Leftrightarrow -\frac{1}{5} \cdot 5x & \ge \frac{1}{5} \cdot 10 \\ \Leftrightarrow -x & \ge 2 \\ \Leftrightarrow x & \le -2. \\ \end{split} \end{align}\] In summary, therefore, \[\begin{equation} -5x - 2 \ge 8 \Leftrightarrow x \le -2. \end{equation}\]
1.5 Proof Techniques
In this section, we outline three fundamental proof techniques: direct and indirect proofs, and proof by contradiction. In what follows, especially the first technique will be used repeatedly to justify theorems.
We have:
- Direct proofs use equivalence transformations to show \(A \Rightarrow B\).
- Indirect proofs use the logical equivalence of \(A \Rightarrow B\) and \((\neg B) \Rightarrow (\neg A)\).
- Proofs by contradiction show that \((\neg B) \land A\) is false.
Equipped with this, we now prove the following theorem by means of a direct proof, an indirect proof, and a proof by contradiction (cf. Arens et al. (2018)).
Theorem 1.2 (Squares of Positive Numbers) Let \(a\) and \(b\) be two positive numbers. Then \(a^2 < b^2 \Rightarrow a < b\).
Proof. We first give a direct proof. Let \(a^2 < b^2\) be the statement \(A\) and let \(a < b\) be the statement \(B\). Then \[\begin{equation} a^2 < b^2 \Leftrightarrow 0 < b^2 - a^2 \Leftrightarrow 0 < (b+a)(b-a) \Leftrightarrow 0 < (b-a) \Leftrightarrow a < b. \end{equation}\] We now give an indirect proof. Let \(a^2 \ge b^2\) be the statement \(\neg A\). Furthermore, let \(a \ge b\) be the statement \(\neg B\). Then \[\begin{equation} a \ge b \Leftrightarrow a^2 \ge ab \land ab \ge b^2 \Leftrightarrow a^2 \ge b^2. \end{equation}\] Finally, we give a proof by contradiction. To do so, we show that the assumption \((\neg B) \land A\) leads to a false statement. We have \[\begin{equation} a \ge b \land a^2 < b^2 \Leftrightarrow a^2 \ge ab \land a^2 < b^2 \Leftrightarrow ab \le a^2 < b^2. \end{equation}\] Furthermore, \[\begin{equation} a \ge b \land a^2 < b^2 \Leftrightarrow ab \ge b^2 \land a^2 < b^2 \Leftrightarrow a^2 < b^2 \le ab. \end{equation}\] Overall, this yields the false statement \[\begin{equation} ab \le a^2 < b^2 \le ab \Leftrightarrow ab < ab. \end{equation}\]