\( \newcommand{\bm}[1]{\boldsymbol{#1}} \newcommand{\textm}[1]{\textsf{#1}} \def\T{{\mkern-2mu\raise-1mu\mathsf{T}}} \newcommand{\R}{\mathbb{R}} % real numbers \newcommand{\E}{{\rm I\kern-.2em E}} \newcommand{\w}{\bm{w}} % bold w \newcommand{\bmu}{\bm{\mu}} % bold mu \newcommand{\bSigma}{\bm{\Sigma}} % bold mu \newcommand{\bigO}{O} %\mathcal{O} \renewcommand{\d}[1]{\operatorname{d}\!{#1}} \)

A.2 Convex sets

We now provide a concise introduction to convex sets. For more detailed information, the reader is referred to (S. P. Boyd and Vandenberghe, 2004, Chapter 2).

A.2.1 Definitions

We start with some basic definitions. The equation describing a line passing through two points \(\bm{x},\bm{y} \in \R^n\) is \(\theta \bm{x}+(1-\theta)\bm{y}\), where \(\theta \in \R\). If \(\theta\) is constrained to be between 0 and 1, then \(\theta \bm{x}+(1-\theta)\bm{y}\) describes the line segment between \(\bm{x}\) and \(\bm{y}\).

Definition A.1 (Convex set) A set \(\mathcal{C}\in\R^n\) is convex if the line segment between any two points in \(\mathcal{C}\) lies in \(\mathcal{C}\), i.e., if for any \(\bm{x},\bm{y} \in \mathcal{C}\) and \(\theta\) with \(0\leq\theta\leq1\), we have \[\begin{equation} \theta \bm{x}+\left(1-\theta\right)\bm{y} \in \mathcal{C}. \tag{A.2} \end{equation}\]

A convex combination of the points \(\bm{x}_{1},\dots,\bm{x}_{k}\) is of the form \(\theta_{1}\bm{x}_1+\theta_{2}\bm{x}_2+\cdots+\theta_{k}\bm{x}_k\), where \(\theta_{1}+\cdots+\theta_{k}=1\) and \(\theta_{i}\geq0,\,i=1,\dots,k,\) . It can be shown that a set is convex if and only if it contains every convex combination of its points.

The convex hull of a set \(\mathcal{C}\) is the set of all convex combinations of points in \(\mathcal{C}\). As the name suggests, the convex hull of \(\mathcal{C}\) is always convex. In fact, it is the smallest convex set that contains \(\mathcal{C}\).

A set \(\mathcal{C}\) is called a cone if for every \(\bm{x} \in \mathcal{C}\) and \(\theta \ge 0\) we have \(\theta \bm{x} \in \mathcal{C}\). A set \(\mathcal{C}\) is a convex cone if it is convex and a cone, i.e., for any \(\bm{x}_1,\bm{x}_2 \in \mathcal{C}\) and \(\theta_1,\theta_2 \ge 0\), we have \[ \theta_1 \bm{x}_1+\theta_2\bm{x}_2 \in \mathcal{C}. \]

A.2.2 Elementary convex sets

We describe next some important simple examples of convex sets.

Hyperplanes and halfspaces

A hyperplane is a set of the form \[ \left\{\bm{x} \mid \bm{a}^\T\bm{x} = b \right\}, \] where \(\bm{a}\in\R^n,\,b\in\R\). Geometrically, it can be interpreted as the set of points orthogonal to the vector \(\bm{a}\) with an offset by rewriting it as \(\left\{ \bm{x} \mid \bm{a}^\T(\bm{x} - \bm{x}_0) = 0 \right\}\).

A hyperplane divides the space \(\R^n\) into two halfspaces. A (closed) halfspace is a set of the form \[ \left\{ \bm{x} \mid \bm{a}^\T\bm{x}\leq b\right\}. \]

Polyhedra

A polyhedron is defined as the solution set of a finite number of linear equalities and inequalities: \[ \mathcal{P} = \left\{\bm{x} \mid \bm{A}\bm{x} \leq \bm{b},\, \bm{C}\bm{x} = \bm{d}\right\}, \] where \(A\in\R^{m\times n},\,C\in\R^{p\times n},\,b\in\R^{m},\,d\in\R^{p}\).

Two important examples of polyhedra are the unit simplex, defined as \[ \left\{ \bm{x} \mid \bm{x}\ge\bm{0},\bm{1}^\T \bm{x} \le 1\right\}, \] and the probability simplex, defined as \[ \left\{ \bm{x} \mid \bm{x}\ge\bm{0},\bm{1}^\T \bm{x} = 1\right\}. \]

Balls and ellipsoids

A Euclidean ball (or just a ball) with center \(\bm{x}_{c}\) and radius \(r\) has the form \[ \mathcal{B}(\bm{x}_{c},r) = \left\{\bm{x} \mid \left\Vert \bm{x} - \bm{x}_{c}\right\Vert_2 \leq r\right\} = \left\{\bm{x} \mid (\bm{x} - \bm{x}_{c})^\T (\bm{x} - \bm{x}_{c})\leq r^2\right\}, \] where \(\left\Vert \cdot \right\Vert_2\) denotes the Euclidean norm or \(\ell_2\)-norm. Another common representation for the Euclidean ball is \[ \mathcal{B}(\bm{x}_{c},r) = \left\{\bm{x}_{c } +r\bm{u} \mid \left\Vert \bm{u}\right\Vert _{2}\leq1\right\}. \]

A set related to a ball is an ellipsoid, defined as \[ \begin{aligned} \mathcal{E}(\bm{x}_{c},\bm{P}) = & \left\{\bm{x} \mid (\bm{x} - \bm{x}_{c})^\T\bm{P}^{-1}(\bm{x} - \bm{x}_{c})\leq1\right\} \\ = & \left\{\bm{x}_{c} + \bm{A}\bm{u} \mid \left\Vert \bm{u}\right\Vert _{2}\leq1\right\}, \end{aligned} \] with \(\bm{P} = \bm{P}^\T \in\R^{n\times n}\succ \bm{0}\), i.e., \(\bm{P}\) is symmetric and positive definite, and \(\bm{A}\) is the square-root matrix \(\bm{P}^{1/2}\) satisfying \(\bm{A}^\T \bm{A} = \bm{P}\). The matrix \(\bm{P}\) (or \(\bm{A}\)) determines how far the ellipsoid extends in every direction from the center \(\bm{x}_c\). A ball is an ellipsoid with the particular choice \(\bm{P} = r^2I\).

Norm balls and norm cones

Suppose \(\left\Vert \cdot \right\Vert\) is any norm on \(\R^n\) (not necessarily the Euclidean norm). A norm ball with center \(\bm{x}_{c}\) and radius \(r\) is defined as \[ \mathcal{B}(\bm{x}_{c},r) = \left\{\bm{x} \mid \left\Vert \bm{x} - \bm{x}_{c}\right\Vert \leq r\right\}. \]

A norm cone \(\mathcal{C}\subseteq\R^{n+1}\) is defined as the convex set \[ \mathcal{C} = \left\{(\bm{x},t) \in \R^{n+1} \mid \left\Vert \bm{x}\right\Vert \leq t\right\}. \]

One particular case of interest is the second-order cone (a.k.a. ice-cream cone): \[ \mathcal{C} = \left\{(\bm{x},t) \in \R^{n+1} \mid \left\Vert\bm{x}\right\Vert_2 \leq t\right\}, \] where the norm is the Euclidean norm \(\left\Vert \cdot \right\Vert_2\).

Positive semidefinite cone

The set of symmetric positive semidefinite matrices \[ \mathbb{S}_{+}^{n}=\left\{\bm{X}\in\R^{n\times n} \mid \bm{X}=\bm{X}^\T\succeq\bm{0}\right\} \] is a convex cone.

A.2.3 Operations that preserve convexity

To establish that a set is convex, one can directly use the definition of convexity in (A.2). However, it can be cumbersome to prove that for any two points in the set the line segment is also contained in the set. A more interesting way to establish convexity in most practical cases is by showing that the set can be obtained from simple convex sets (e.g., hyperplanes, hyperspaces, balls, ellipsoids, cones) by operations that preserve convexity of sets, i.e., a calculus of convex sets.

Some simple operations that preserve convexity of sets include: intersection of sets, composition with affine functions, and the perspective function.

Intersection

Convexity is preserved under intersection: if \(\mathcal{S}_1\) and \(\mathcal{S}_2\) are convex, then \(\mathcal{S}_1 \cap \mathcal{S}_1\) is convex. This property extends to the case of intersection of multiple sets (even an infinite number of sets).

One trivial example is a polyhedron, which is the intersection of halfspaces and hyperplanes and therefore convex.

A more sophisticated example is the set \[ \mathcal{S}=\left\{\bm{x}\in\R^{n} \mid \left|p_{\bm{x}}\left(t\right)\right|\leq1\textm{ for }\left|t\right|\leq\pi/3\right\}, \] where \(p_{\bm{x}}(t) = x_{1}\textm{cos}(t) + x_{2}\textm{cos}(2t) + \cdots + x_{n}\textm{cos}(nt)\). Note that this set is an intersection of an infinite number of sets indexed by \(t\).

Affine composition

A function is affine if it has the form \(f(\bm{x}) = \bm{A}\bm{x} + \bm{b}\), where \(\bm{A}\in\R^{m\times n}\) and \(\bm{b}\in\R^{m}\), i.e., the sum of a linear function and a constant.

Suppose \(\mathcal{S}\subseteq\R^n\) is a convex set and \(f:\R^n\longrightarrow\R\) is an affine function. Then the image of \(\mathcal{S}\) under \(f\), \[ f\left(\mathcal{S}\right)=\left\{f(\bm{x}) \mid \bm{x} \in \mathcal{S}\right\}, \] is convex.

Two trivial examples are scaling and translation. Another simple example is the projection of a convex set onto some of its coordinates: if \(\mathcal{S}\subseteq \R^m \times \R^n\) is convex, then \[ \left\{ \bm{x}_1 \in \R^m \mid (\bm{x}_1, \bm{x}_2) \in \mathcal{S}\textm{ for some }\bm{x}_2\in\R^n \right\} \] is convex.

A useful example is the affine composition of the norm cone \(\left\{(\bm{x},t) \in \R^{n+1} \mid \left\Vert \bm{x}\right\Vert \leq t\right\}\): \[ \left\{\bm{x}\in\R^{n} \mid \left\Vert\bm{A}\bm{x} + \bm{b}\right\Vert \leq \bm{c}^\T\bm{x} + d\right\}. \]

Perspective function

The perspective function scales or normalizes vectors so the last component is one, and then drops the last component.

Mathematically, we define the perspective function \(P:\R^{n+1}\longrightarrow\R^{n}\), with domain \(\textm{dom}\,P=\R^n \times \R_{++}\) as \(P(\bm{x},t)=\bm{x}/t.\)

Images and inverse images of convex sets under perspective functions are convex.

References

Boyd, S. P., and Vandenberghe, L. (2004). Convex optimization. Cambridge University Press.