Advances in Statistical Modeling of High Dimensional Data:
Variable selection, and Challenges in Image Analysis


Fabian Theis, Helmholtzzentrum München

Independent subspace analysis and extraction


Matrix factorization algorithms, in particular nonnegative matrix factorization, principal component analysis, and independent component analysis (ICA), have recently found important applications in the analysis of biological recordings such as microarray data and fluorescence image stacks. Here we focus on separation based on statistical independence. To guarantee algorithm-independent and theoretically valid results, such separation should only be applied to data that follow the generative ICA model. Subspace ICA models generalize the assumption of component independence to independence between groups of components. They are attractive candidates for dimensionality reduction, but are currently limited by the assumption of equal group sizes or by less general semi-parametric models. By introducing the concept of irreducible independent subspaces or components, we present a generalization to a parameter-free mixture model and prove its separability. More generally, we ask how to identify and extract subspaces in data based on statistical properties such as non-Gaussianity or signal color (autocorrelations). In the first part of my talk, I will review some matrix factorization techniques and results with a focus on ICA. I will then turn to subspace extraction for dimension reduction, and finally to independent subspace analysis itself.
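
As a rough sketch of the subspace model described above, the observed data can be written as a linear mixture of mutually independent groups of sources. The notation here (x, A, s, k, n_i) is assumed for illustration and is not taken from the abstract; the formulation used in the talk may differ.

```latex
% Illustrative sketch only: all symbols below are assumed notation.
% x   : observed n-dimensional random vector
% A   : unknown invertible n x n mixing matrix
% s_i : i-th source group, a random vector of dimension n_i, with n_1 + ... + n_k = n
\[
  \mathbf{x} = \mathbf{A}\,\mathbf{s},
  \qquad
  \mathbf{s} = \bigl(\mathbf{s}_1^{\top}, \dots, \mathbf{s}_k^{\top}\bigr)^{\top},
  \qquad
  p(\mathbf{s}) = \prod_{i=1}^{k} p_i(\mathbf{s}_i).
\]
% Ordinary ICA is the special case n_1 = ... = n_k = 1.
% Equal group sizes would mean n_1 = ... = n_k; the generalization drops this constraint.
```

In this reading, a group s_i would be called irreducible if no invertible linear transformation of s_i splits it into two independent sub-vectors, so the group sizes n_i need not be fixed in advance.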
