The standard definition of a Lipschitz continuous function, as given on Wikipedia, is the following:

Given two metric spaces \((X, d_X)\) and \((Y, d_Y)\), where \(d_X\) denotes the metric on the set \(X\) and \(d_Y\) is the metric on the set \(Y\), a function \(f : X \to Y\) is called Lipschitz continuous if there exists a real constant \(K \geq 0\) such that, for all \(x_1\) and \(x_2\) in \(X\),

$$ d_Y(f(x_1), f(x_2)) \leq K d_X(x_1, x_2) $$

Any such \(K\) is referred to as a Lipschitz constant for the function \(f\). The smallest constant \(K^*\) is sometimes called the (best) Lipschitz constant...

From the above definition, it's not immediately clear to me that \(K^*\) exists (it's possible that, given any Lipschitz constant \(K\), we can always find another Lipschitz constant \(K_2\) with \(K_2 < K\)). The definition of \(K^*\) in my textbook attempts to avoid this potential problem:

Given a Lipschitz continuous function \(f: X \to Y\), define the set of all Lipschitz constants by

$$ C = \{K > 0 \mid \forall x_1, x_2 \in X, \; d_Y(f(x_1), f(x_2)) \leq K d_X(x_1, x_2) \}, $$

and define the (best) Lipschitz constant by:

$$ K^* = \inf C $$

but stops short of showing that \(K^* \in C\), i.e., \(K^*\) is actually *a* Lipschitz constant.

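Here's a nice way to see that this is indeed the case. Consider the following supremum:

$$ S = \sup_{x_1 \neq x_2} \frac{d_Y(f(x_1), f(x_2))}{d_X(x_1, x_2)} $$

We know that \(S > 0\) provided \(f\) is not constant, and \(S\) is finite because \(f\) is Lipschitz continuous. Moreover,

$$ d_Y(f(x_1), f(x_2)) \leq S \, d_X(x_1, x_2) \quad \text{for all } x_1 \neq x_2 $$

from the definition of \(S\), and the above inequality also holds when \(x_1 = x_2\), so \(S \in C\) is a Lipschitz constant. It's also easy to verify from first principles (every quotient in the supremum is bounded by any \(K \in C\)) that we in fact have

$$ S \leq K \quad \text{for all } K \in C, $$

therefore

$$ S = \min C = \inf C = K^* $$

We see that

$$ K^* = \sup_{x_1 \neq x_2} \frac{d_Y(f(x_1), f(x_2))}{d_X(x_1, x_2)} $$

gives an equivalent (and in some sense, dual) definition of the (best) Lipschitz constant.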
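As a numerical sanity check, the supremum of the quotients \(d_Y(f(x_1), f(x_2)) / d_X(x_1, x_2)\) over pairs \(x_1 \neq x_2\) can be approximated on a finite sample. The following is only a minimal sketch: the helper name `lipschitz_lower_estimate`, the choice \(f = \sin\) on the real line with the absolute-value metric, and the sample grid are all illustrative assumptions. Since only finitely many pairs are checked, the result is a lower bound on \(K^*\), which for \(\sin\) equals 1.

```python
import itertools
import math


def lipschitz_lower_estimate(f, points):
    """Largest quotient |f(a) - f(b)| / |a - b| over distinct sample points.

    Only finitely many pairs are inspected, so this is a lower bound
    on the best Lipschitz constant, not the constant itself.
    """
    best = 0.0
    for a, b in itertools.combinations(points, 2):
        if a != b:  # the supremum ranges over pairs with a != b only
            best = max(best, abs(f(a) - f(b)) / abs(a - b))
    return best


# Hypothetical sample: 201 evenly spaced points on [-pi, pi].
points = [-math.pi + 2 * math.pi * i / 200 for i in range(201)]

est_sin = lipschitz_lower_estimate(math.sin, points)
print(est_sin)  # a value just below 1, the best Lipschitz constant of sin
```

Refining the grid pushes the estimate closer to 1 from below, but no finite sample can certify an upper bound; that still requires an argument like the one above.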

It's sometimes easier to work with this more direct definition of the Lipschitz constant. For example, let \(f, g: X \to Y\) be Lipschitz with (best) Lipschitz constants \(K_1\) and \(K_2\), respectively, where \((X, d_X)\) is a metric space as usual, and \((Y, \| \cdot \|)\) is a normed space. Then it's easy to see that \(f+g\) is also Lipschitz continuous, and if \(K\) is the (best) Lipschitz constant of \(f+g\), we can characterize it in terms of \(K_1\) and \(K_2\):
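The natural bound here is \(K \leq K_1 + K_2\), and the supremum form of the best constant reduces it to the triangle inequality. A sketch of the argument, assuming the metric on \(Y\) is the one induced by the norm, \(d_Y(y_1, y_2) = \| y_1 - y_2 \|\):

$$
\begin{aligned}
K &= \sup_{x_1 \neq x_2} \frac{\| (f+g)(x_1) - (f+g)(x_2) \|}{d_X(x_1, x_2)} \\
  &\leq \sup_{x_1 \neq x_2} \frac{\| f(x_1) - f(x_2) \| + \| g(x_1) - g(x_2) \|}{d_X(x_1, x_2)} \\
  &\leq \sup_{x_1 \neq x_2} \frac{\| f(x_1) - f(x_2) \|}{d_X(x_1, x_2)} + \sup_{x_1 \neq x_2} \frac{\| g(x_1) - g(x_2) \|}{d_X(x_1, x_2)} \\
  &= K_1 + K_2.
\end{aligned}
$$

The inequality can be strict: taking \(g = -f\) gives \(f + g = 0\), so \(K = 0\) while \(K_1 + K_2 = 2 K_1\).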