So, what solution do we pick? A classic example is to find the x1 and x2 that give the best values in a ``least squares'' sense:
This scoring scheme is reasonable because it is optimized for a perfect fit, and the worse the predictions, the worse the score.
Note: You can think of this as being a generalization of the
notion of an average. Recall that