Advances in Statistical Modeling of High Dimensional Data:
Variable selection, and Challenges in Image Analysis


Julien Gagneur, EMBL Heidelberg

Minimal Gene Set enrichment


Gene group enrichment analyses often return a large number of groups, making their interpretation difficult. Beyond issues of multiple testing, one reason is that these groups share genes, so that if one group turns out to be significant, other groups that have many genes in common with it may also be significant. This is particularly relevant for the Gene Ontology, which consists of nested groups and for which heuristics exploiting this structure have previously been proposed. Here we tackle the problem by reframing the question. Instead of searching for all significantly enriched groups, we search for a minimal set of groups that can explain the data. We model the experimental observations by a set of "active" groups. Our model penalizes the number of active groups, thus naturally providing parsimonious solutions.
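
To make the idea of a penalized set of "active" groups concrete, here is a minimal Python sketch. It is not the model presented in the talk, whose details the abstract does not give. It assumes a simple stand-in: the observation is a set of "hit" genes, a set of active groups predicts the union of its member genes, and the cost is the number of genes on which prediction and observation disagree plus a fixed penalty per active group. The function names, the greedy search, the penalty value, and the toy groups are all hypothetical.

```python
def penalized_cost(active, groups, observed, penalty):
    """Mismatches between predicted and observed genes, plus a penalty per active group."""
    predicted = set().union(*(groups[g] for g in active))
    mismatches = len(predicted ^ observed)  # genes predicted but not observed, or vice versa
    return mismatches + penalty * len(active)


def greedy_minimal_groups(groups, observed, penalty=1.0):
    """Greedily activate the group that lowers the cost most; stop when none does.

    This is a heuristic search chosen for brevity (an assumption of this sketch),
    not an exact optimizer.
    """
    active = []
    best = penalized_cost(active, groups, observed, penalty)
    while True:
        candidates = sorted(g for g in groups if g not in active)
        if not candidates:
            break
        cost, choice = min(
            (penalized_cost(active + [g], groups, observed, penalty), g)
            for g in candidates
        )
        if cost >= best:  # no remaining group improves the penalized cost
            break
        active.append(choice)
        best = cost
    return active


# Toy data (hypothetical): B and C are nested in A, D is unrelated.
groups = {
    "A": {1, 2, 3, 4},
    "B": {1, 2, 3},
    "C": {3, 4},
    "D": {7, 8},
}
observed = {1, 2, 3, 4, 7, 8}

selected = greedy_minimal_groups(groups, observed, penalty=1.0)
print(selected)  # ['A', 'D']
```

In this toy example the nested groups B and C overlap heavily with the observed genes, so a group-by-group test could well flag them, yet the penalized search keeps only A and D because adding B or C explains nothing new.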
