Model selection for unsupervised learning
Model selection is a crucial task in unsupervised learning.
Suppose we are given some data, a learning task, and a set of candidate models that aim to solve the task.
We would like a ranking of the candidate models that lets us select the best model for the given task and the given data.
The simplest such selection problem in unsupervised learning (the most intuitive, but not easy to solve!) is selecting the model order.
For instance, one must select the number of clusters in clustering, the number of factors in factor analysis, or the number of principal components in PCA.
We approach the model-order selection problem with the minimum transfer cost principle, a method that enables one to use cross-validation for unsupervised learning.
In [1], we describe this method and apply it to a number of unsupervised learning problems such as k-means, SVD/PCA,
Gaussian mixture models (for GMM, cross-validation is in fact applicable without our method), correlation clustering,
and Boolean matrix factorization (in the context of the role mining problem).
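
As an illustration of how cross-validation can be carried over to k-means, here is a minimal sketch in Python. It is not the reference implementation from [1]. It assumes a single random 50/50 split, a plain Lloyd's k-means written with NumPy, and a nearest-neighbor mapping from held-out points to training points: each held-out point inherits the cluster of its nearest training point, and the transfer cost is the mean squared distance to that cluster's centroid. The function names (kmeans, transfer_cost, select_k) and the toy data are hypothetical.

```python
import numpy as np


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm. Returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Initialize centroids with k distinct data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    # Final assignment that is consistent with the final centroids.
    dist = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, dist.argmin(axis=1)


def transfer_cost(X_train, X_test, k, seed=0):
    """Cost of the training solution when transferred to held-out data."""
    centroids, train_labels = kmeans(X_train, k, seed=seed)
    # Assumed mapping: each held-out point -> its nearest training point.
    dist = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=2)
    nearest = dist.argmin(axis=1)
    # The held-out point inherits the cluster of that training point.
    transferred = train_labels[nearest]
    sq_err = ((X_test - centroids[transferred]) ** 2).sum(axis=1)
    return float(sq_err.mean())


def select_k(X, candidates, seed=0):
    """Return the candidate k with the lowest transfer cost, plus all costs."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(X))
    half = len(X) // 2
    X_train, X_test = X[perm[:half]], X[perm[half:]]
    costs = {k: transfer_cost(X_train, X_test, k, seed=seed) for k in candidates}
    return min(costs, key=costs.get), costs


# Toy data (hypothetical): three Gaussian blobs in the plane.
rng = np.random.default_rng(42)
centers = np.array([[0.0, 0.0], [6.0, 0.0], [0.0, 6.0]])
X = np.vstack([c + rng.normal(size=(50, 2)) for c in centers])

best_k, costs = select_k(X, candidates=range(1, 7))
print("selected k:", best_k)
for k, cost in costs.items():
    print(k, round(cost, 3))
```

In practice one would average the transfer cost over several random splits and several k-means restarts, since a single split and a single initialization make the selected k noisy.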
In [2] and [3], we use the framework of approximation set coding for model selection.
This is an information-theoretic principle for regularized optimization.
In [2], we apply it to SVD; in [3], we apply it to GMM and to Boolean matrix factorization via multi-assignment clustering.
The latter paper is the best starting point if you have not encountered approximation set coding before.
Relevant publications:

[1] Mario Frank, Morteza Haghir Chehreghani and Joachim M. Buhmann
"The Minimum Transfer Cost Principle for Model-Order Selection".
ECML PKDD 2011: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases

[2] Mario Frank and Joachim M. Buhmann
"Selecting the rank of truncated SVD by Maximum Approximation Capacity".
ISIT 2011: IEEE International Symposium on Information Theory

[3] Joachim M. Buhmann, Morteza Haghir Chehreghani, Mario Frank and Andreas P. Streich
"Information Theoretic Model Selection for Pattern Analysis".
JMLR Workshop and Conference Proceedings 7, 18: ICML 2011 Workshop on Unsupervised and Transfer Learning