Clustering algorithms typically group points according to some similarity criterion, but they do so without reference to an underlying random process that would make them rigorously predictive. In fact, there exists a probabilistic theory of clustering, set in the context of random labeled point sets, in which clustering error is defined in terms of the underlying process.
In the present paper, given an underlying point process, we develop a general analytic procedure for finding an optimal clustering operator, the Bayes clusterer, which corresponds to the Bayes classifier in classification theory. We provide detailed solutions under Gaussian models. Owing to the computational complexity of the Bayes clusterer, we also develop approximations of it.