Small edits Chapter 2
FabsOliveira committed Jan 22, 2025
1 parent 8ca38c0 commit d2b368f
Showing 2 changed files with 15 additions and 15 deletions.
src/chapters/chapter_2/chapter_1-2.tex: 15 additions & 15 deletions
\section{Basics of linear problems}
Let $A$ be a square $n \times n$ matrix. $A^{-1}$ is the inverse matrix of $A$ if it exists and $AA^{-1} = I$, where $I$ is the ($n \times n$) identity matrix.
\end{definition}
%
Matrix inversion is the ``kingpin'' of linear (and nonlinear) optimisation. As we will see later on, performing efficient matrix inversion operations (in reality, operations that are equivalent to matrix inversion but that can exploit the matrix structure to be made more efficient from a computational standpoint) is of utmost importance for developing a linear optimisation solver.
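%
For concreteness, consider the $2 \times 2$ case. If
%
\begin{equation*}
A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \quad \text{and} \quad ad - bc \neq 0, \quad \text{then} \quad A^{-1} = \frac{1}{ad - bc}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix},
\end{equation*}
%
which can be verified by checking that $AA^{-1} = I$. If $ad - bc = 0$, the inverse does not exist.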

Another important concept is the notion of \emph{linear independence}. We formally state when a collection of vectors is said to be linearly independent (or dependent) in Definition \ref{p1c2:def:linear_independence}.

%
\begin{definition}[Linearly independent vectors] \label{p1c2:def:linear_independence}
The vectors $\braces{x_i}_{i=1}^k$ with $x_i \in \reals^n, ~\forall i \in [k]$, are linearly dependent if there exist real numbers $\braces{a_i}_{i=1}^k$ with $a_i \neq 0$ for at least one $i \in [k]$ such that
$$
\sum_{i=1}^k a_i x_i= 0;
$$
\end{enumerate}
\end{theorem}
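%
As a quick illustration of Definition \ref{p1c2:def:linear_independence}, the vectors $x_1 = (1,2)$ and $x_2 = (2,4)$ in $\reals^2$ are linearly dependent, since $2x_1 - x_2 = 0$ with $a_1 = 2 \neq 0$. In contrast, $x_1 = (1,0)$ and $x_2 = (0,1)$ are linearly independent, as $a_1x_1 + a_2x_2 = (a_1, a_2) = 0$ forces $a_1 = a_2 = 0$.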
%
Notice that Theorem \ref{p1c2:thm:fundamental_linear_algebra} establishes important relationships between the geometry of the matrix $A$ (its rows and columns) and the consequences for our ability to calculate its inverse $A^{-1}$ and, consequently, to solve the system $Ax = b$, whose solution is obtained as $x = A^{-1}b$. As we will see, solving linear systems of equations is the most important operation in the simplex method.
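%
As a small worked example of this operation, consider the system
%
\begin{equation*}
A = \begin{pmatrix} 2 & 1 \\ 1 & 3 \end{pmatrix}, \quad b = \begin{pmatrix} 5 \\ 10 \end{pmatrix} \Rightarrow A^{-1} = \frac{1}{5}\begin{pmatrix} 3 & -1 \\ -1 & 2 \end{pmatrix} \text{ and } x = A^{-1}b = \begin{pmatrix} 1 \\ 3 \end{pmatrix},
\end{equation*}
%
which can be checked by substitution: $2(1) + 1(3) = 5$ and $1(1) + 3(3) = 10$. Here, the rows (and columns) of $A$ are linearly independent, so $A^{-1}$ exists and the solution is unique.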


\subsection{Subspaces and bases}
A related concept is the notion of a \emph{span}. The span of a collection of vectors $\braces{x_i}_{i=1}^k$, with $x_i \in \reals^n, ~\forall i \in [k]$, is the subspace of $\reals^n$ formed by all linear combinations of such vectors, i.e.,
%
\begin{equation*}
\spans(x_1,\dots, x_k) = \braces{y = \sum_{i=1}^k a_ix_i : a_i \in \reals, i \in [k]}.
\end{equation*}
%
Notice how the two concepts are related: the span of a collection of vectors forms a subspace. Therefore, a subspace can be characterised by the collection of vectors whose span forms it. In other words, the span of a set of vectors is the subspace formed by all points we can represent by some linear combination of these vectors.
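For example, in $\reals^3$, letting $e_1 = (1,0,0)$ and $e_2 = (0,1,0)$, we have $\spans(e_1, e_2) = \braces{(a_1, a_2, 0) : a_1, a_2 \in \reals}$, that is, the plane spanned by the first two coordinate axes. Adding the vector $(1,1,0)$ to the collection would not enlarge this span, since $(1,1,0) = e_1 + e_2$ is already a linear combination of the vectors in it.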

\begin{enumerate}
\item All bases of a given subspace $S$ have the same number of vectors. Any extra vector would be linearly dependent on the vectors that span $S$. In that case, we say that the subspace has size (or dimension) $k$, the number of linearly independent vectors forming a basis of the subspace. We can overload the notation $\dim(S)$ to represent the dimension of the subspace $S$.
\item If the subspace $S \subset \reals^n$ is formed by a basis of size $m < n$, we say that $S$ is a proper subspace with $\dim(S)=m$, because it is not the whole $\reals^n$ itself, but is contained within $\reals^n$. For example, two linearly independent vectors form (i.e., span) a hyperplane in $\reals^3$; this hyperplane is a proper subspace since $\dim(S)=m=2 < 3=n$.
\item If a proper subspace has dimension $m < n$, then it means that there are $n-m$ directions in $\reals^n$ that are perpendicular to the subspace and to each other. That is, there are nonzero vectors $a_i$ that are orthogonal to each other and to $S$ or, equivalently, such that $a_i^\top x = 0$ for all $x \in S$ and $i = m+1, \dots, n$. Referring to $\reals^3$, if $m=2$, then there is a third direction that is perpendicular to (i.e., not in) $S$. Figure \ref{p1c2:fig:proper_subpaces} can be used to illustrate this idea: notice how one can find a vector, say $x_3$, that is perpendicular to $S$. This is because the whole space is $\reals^3$, but $S$ has dimension $m=2$ (or $\dim(S)=2$); see also the example right after this list.
\end{enumerate}
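Returning to the span example above, take $S = \spans(e_1, e_2) \subset \reals^3$. All bases of $S$ have two vectors, so $\dim(S) = 2 < 3 = n$ and $S$ is a proper subspace. Moreover, there is $n - m = 1$ remaining direction: the vector $a = (0,0,1)$ satisfies $a^\top x = 0$ for all $x \in S$, that is, it is perpendicular to $S$.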

\subsection{Affine subspaces}
S = \braces{x \in \reals^n : Ax = b}.
\end{equation}
%
As we will see, the feasible set of any linear programming problem can be represented as an equality-constrained equivalent of the form of \eqref{p1c2:eq:equality_constraint_feasible_set} by adding slack variables to the inequality constraints, meaning that we will always end up with $m < n$. Now, assume that $x_0 \in \reals^n$ is such that $Ax_0 = b$. Then, we have that
%
\begin{equation*}
Ax = Ax_0 = b \Rightarrow A(x - x_0) = 0.
\end{equation*}
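%
As a brief illustration of the slack-variable conversion mentioned above, an inequality constraint such as $2x_1 + x_2 \leq 4$ becomes the equality constraint $2x_1 + x_2 + s = 4$ once we introduce the additional (slack) variable $s \geq 0$, increasing $n$ by one while keeping $m$ unchanged.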
\subsection{Hyperplanes, half-spaces and polyhedral sets}
One important thing to notice is that polyhedral sets, as defined in Definition \ref{p1c2:def:polyhedral_sets}, are formed by the intersection of multiple half-spaces. Specifically, let $\braces{a_i}_{i=1}^m$ be the rows of $A$. Then, the set $S$ can be described as
%
\begin{equation}
S = \braces{x \in \reals^n : a_i^\top x \geq b_i, ~\forall i \in [m]},
\end{equation}
%
which represents exactly the intersection of the half-spaces $a_i^\top x \geq b_i$. Furthermore, notice that the hyperplanes $a_i^\top x = b_i$, $\forall i \in [m]$, are the boundaries of the corresponding half-spaces, and thus each can describe one of the facets of the polyhedral set. Figure \ref{p1c2:fig:hyperplanes_and_polyhedral_set} illustrates a hyperplane forming two half-spaces (also polyhedral sets) and how the intersection of five half-spaces forms a (bounded) polyhedral set.
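For instance, in $\reals^2$, the set $S = \braces{x \in \reals^2 : x_1 \geq 0, ~x_2 \geq 0, ~-x_1 - x_2 \geq -1}$ is the triangle with vertices $(0,0)$, $(1,0)$, and $(0,1)$: the intersection of three half-spaces, with facets lying on the hyperplanes (here, lines) $x_1 = 0$, $x_2 = 0$, and $x_1 + x_2 = 1$.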


You might find authors referring to bounded polyhedral sets as polytopes. However, this is not used consistently across references, sometimes with switched meanings (for example, using polytope to refer to a set defined as in Definition \ref{p1c2:def:polyhedral_sets} and using polyhedron to refer to a bounded version of $S$). In this text, we will only use the term polyhedral set to refer to sets defined as in Definition \ref{p1c2:def:polyhedral_sets} and use the term bounded whenever applicable.

Also, it may be useful to formally define some elements in polyhedral sets. For that, let us consider a hyperplane $H = \braces{x \in \reals^{n} : a^\top x = b}$, with $a \in \reals^n$ and $b \in \reals$. Now assume that $H$ is such that the set $F = H \cap S$ is not empty, and it only contains points on the boundary of $S$. This set is known as a \emph{face} of a polyhedral set. If the face $F$ has dimension zero, then $F$ is called a vertex. Analogously, if $\dim(F)=1$, then $F$ is called an edge. Finally, if $\dim(F) = \dim(S)-1$, then $F$ is called a facet. Notice that in $\reals^3$, facets and faces are the same whenever the face is not an edge or a vertex.
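To make these terms concrete, consider the unit cube $S = \braces{x \in \reals^3 : 0 \leq x_i \leq 1, ~i = 1,2,3}$, for which $\dim(S) = 3$. Its eight corner points are vertices ($\dim(F) = 0$), the twelve segments joining adjacent corners are edges ($\dim(F) = 1$), and its six square sides are facets ($\dim(F) = 2 = \dim(S) - 1$).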


\subsection{Convexity of polyhedral sets}

Convexity plays a crucial role in optimisation, being the ``watershed'' between easy and hard optimisation problems. One of the main reasons why we can solve challenging linear programming problems is due to the inherent convexity of polyhedral sets.

Let us first define the notion of convexity for sets, which is stated in Definition \ref{p1c2:def:convex_set}.


\item Let $a \in \reals^n$ and $b \in \reals$. Let $x,y \in \reals^n$, such that $a^\top x \geq b$ and $a^\top y \geq b$. Let $\lambda \in [0,1]$. Then $a^\top (\lambda x + (1-\lambda)y) \geq \lambda b + (1-\lambda)b = b$, showing that half-spaces are convex. The result follows from combining this with (1).

\item By induction. Let $S$ be a convex set and assume that the convex combination of $x_1, \dots, x_k \in S$ also belongs to $S$. Consider $k+1$ elements $x_1, \dots, x_{k+1} \in S$ and $\lambda_1, \dots, \lambda_{k+1}$ with $\lambda_i \in [0,1]$ for $i \in [k+1]$ and $\sum_{i=1}^{k+1}\lambda_i = 1$ and $\lambda_{k+1} \neq 1$ (without loss of generality). Then

\begin{equation}
\sum_{i=1}^{k+1}\lambda_i x_i = \lambda_{k+1}x_{k+1} + (1 - \lambda_{k+1}) \sum_{i=1}^k \frac{\lambda_i}{1 - \lambda_{k+1}}x_i. \label{p1c2:eq:induction}
\end{equation}
\begin{figure}[h]
% (tikzpicture omitted in this excerpt)
\caption{Illustration of statement 1 (left), 2 (centre), and 3 and 4 (right)} \label{p1c2:fig:convexity_theorem_examples}
\end{figure}

We will halt our discussion about convexity for now, as we have covered the key facts we will need to prove that the simplex method converges to an optimal point. The last missing piece is a simple yet very powerful result, which states that the presence of convexity is what allows us to conclude that a locally optimal solution returned by an optimisation algorithm applied to a linear programming problem is indeed globally optimal for the problem at hand. It so turns out that, in the context of linear programming, convexity is a given, since linear functions are convex by definition and the feasible set of a linear programming problem is also convex (as we have just shown in Theorem \ref{p1c2:thm:convexity}).
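Indeed, for a linear function $f(x) = c^\top x$, the convexity inequality in Theorem \ref{p1c2:thm:convexity_and_optimality} below holds with equality, since $c^\top(\lambda x_1 + (1-\lambda)x_2) = \lambda c^\top x_1 + (1-\lambda)c^\top x_2$ for any $\lambda \in [0,1]$ and $x_1, x_2 \in \reals^n$.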

\begin{theorem}[Global optimality for convex problems] \label{p1c2:thm:convexity_and_optimality}
Let $f: \reals^n \to \reals$ be a convex function, that is, $f(\lambda x_1 + (1-\lambda)x_2) \le \lambda f(x_1) + (1-\lambda)f(x_2)$ for all $x_1, x_2 \in \reals^n$ and $\lambda \in [0,1]$, and let $S \subset \reals^n$ be a convex set. Let $x^*$ be an element of $S$. Suppose that $x^*$ is a local optimum for the problem of minimising $f(x)$ over $S$; that is, there exists some $\epsilon > 0$ such that $f(x^*) \leq f(x)$ for all $x \in S$ for which $\|x - x^*\| \leq \epsilon$. Then, $x^*$ is globally optimal, meaning that $f(x^*) \leq f(x)$ for all $x \in S$.
\end{theorem}

\section{Extreme points, vertices, and basic feasible solutions}

We focus on the algebraic representation of the most relevant geometric elements in the optimisation of linear programming problems. As we have seen in the graphical example in the previous chapter, the optimum of linear programming problems is generally located at a vertex of the feasible set. Furthermore, such vertices are formed by the intersection of $n$ constraints (in an $n$-dimensional space) that are active, that is, satisfied at the boundary of the half-space of said constraints.
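For example, in $\reals^2$ (so $n = 2$), the point $(0,1)$ of the triangle $\braces{x \in \reals^2 : x_1 \geq 0, ~x_2 \geq 0, ~x_1 + x_2 \leq 1}$ is the intersection of the two active constraints $x_1 \geq 0$ and $x_1 + x_2 \leq 1$, both satisfied at the boundary of their half-spaces: $x_1 = 0$ and $x_1 + x_2 = 1$.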

First, let us formally define the notions of vertex and extreme point. Although in general these can refer to different objects, we will see that in the case of linear programming problems, if a point is a vertex, then it is an extreme point as well, the converse also being true.

\begin{figure}[h]
% (tikzpicture omitted in this excerpt)
\caption{Representation of a vertex (left) and an extreme point (right)} \label{p1c2:fig:vertex_and_extreme_point}
\end{figure}

Definition \ref{p1c2:def:vertex} also hints at an important consequence for linear programming problems. As we can see from Theorem \ref{p1c2:thm:convexity}, $P$ is convex, which guarantees that $P$ is contained in the half-space $c^\top y > c^\top x$. This implies that $c^\top x \leq c^\top y$, $\forall y \in P$, which is precisely the condition that $x$ must satisfy to be the minimum for the problem $\mathop{\mini}_x\braces{c^\top x :x \in P}$.
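As a simple illustration, take again the triangle $P = \braces{x \in \reals^2 : x_1 \geq 0, ~x_2 \geq 0, ~x_1 + x_2 \leq 1}$ and the point $x = (0,0)$. Choosing $c = (1,1)$ gives $c^\top x = 0 < y_1 + y_2 = c^\top y$ for all $y \in P$ with $y \neq x$, so $x$ is a vertex of $P$; accordingly, $x$ solves $\mathop{\mini}_x\braces{c^\top x : x \in P}$ for this choice of $c$.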

Now we focus on the description of active constraints from an algebraic standpoint. For that, let us first generalise our setting by considering all possible types of linear constraints. That is, let us consider the convex polyhedral set $P \subset \reals^n$, formed by the set of inequalities and equalities:
%
If a vector $\overline{x}$ satisfies $a_i^\top \overline{x} = b_i$ for some $i \in M_1, M_2$, or $M_3$, we say that the corresponding constraints are active (or binding).
\end{definition}

Definition \ref{p1c2:fig:active_constraint} formalises the notion of active constraints. This is illustrated in Figure \ref{p1c2:fig:active_constraints}, where the polyhedral set $P = \braces{x \in \reals^3 : x_1 + x_2 + x_3 = 1, ~x_i \geq 0, ~i =1,2,3}$ is represented. Notice that, while points $A$, $B$, $C$ and $D$ have 3 active constraints, $E$ only has 2 active constraints ($x_2 = 0$ and $x_1 + x_2 + x_3 = 1$).

Binary file modified src/linopt-notes.pdf
