p.49. Example 2.19: Separation of an affine and a convex set: Suppose \(C\) is convex and \(D\) is affine, i.e., \(D = \{F u + g | u ∈ \mathbb{R}^m\}\), where \(F ∈ \mathbb{R}^{n×m}\). Suppose C and D are disjoint, so by the separating hyperplane theorem there are \(a \neq 0\) and \(b\) such that \(a^\top x ≤ b\) for all \(x ∈ C\) and \(a^\top x ≥ b\) for all \(x ∈ D\). Now \(a^\top x ≥ b\) for all x ∈ D means \(a^\top F u ≥ b − aT g\) for all \(u ∈ \mathbb{R}^m\). But a linear function is bounded below on \(\mathbb{R}^m\) only when it is zero, so we conclude \(a^\top F = 0\) (and hence, \(b ≤ a^\top g\)).
A geometric way to look at this is to think of \(D\) as the range of the linear map \(F\) plus an offset vector \(g\), or the column space of a matrix \(F\) plus \(g\). The only way to draw a hyperplane in space that doesn't intersect \(D\) is for it to be parallel to \(D\), i.e. the normal vector \(a\) needs to be perpendicular to the column space of \(F\), which requires \(a\) to be in the left-null space of \(F\), so \(a^\top F = 0\). This and the requirement that \(a^\top F u ≥ b − a^\top g\) give a constraint on the legal choices of \(b\), i.e. \(b ≤ a^\top g\). Then we also have \(a^\top x \leq b \leq a^\top g\) for all \(x \in C\).