Second Generation: Functional Class Scoring (FCS) Approaches
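Functional Class Scoring (FCS) rests on the hypothesis that, although a large change in a single gene can affect a pathway, weaker but coordinated changes in a set of functionally related genes can have a similar effect. An FCS method therefore takes a gene-level statistic as input (ideally computed from normalized data) and aggregates the gene-level statistics of each pathway into a single pathway-level statistic. Existing pathway-level statistics include the Kolmogorov-Smirnov statistic; the sum, mean, or median of the gene-level statistics; the Wilcoxon rank sum; and the maxmean statistic. Finally, the significance of the pathway-level statistic is assessed. Compared with ORA, FCS addresses three of its limitations; see Khatri et al. (2012) for details.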
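The three FCS steps (gene-level statistic, pathway-level statistic, significance test) can be sketched roughly as follows. This is a minimal illustration rather than any published tool: the function name `fcs_mean_score`, the choice of the mean as the pathway-level statistic, and the gene-sampling null distribution are all assumptions made for the example. Real methods often permute sample labels instead of sampling genes.

```python
import numpy as np

def fcs_mean_score(gene_stats, gene_set, n_perm=1000, seed=0):
    """Toy FCS: mean gene-level statistic of a set, tested against random gene sets.

    gene_stats: dict mapping gene name -> gene-level statistic (e.g. a t-statistic).
    gene_set:   set of gene names defining the pathway.
    Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    genes = list(gene_stats)
    values = np.array([gene_stats[g] for g in genes], dtype=float)
    in_set = np.array([g in gene_set for g in genes])
    k = int(in_set.sum())
    if k == 0:
        raise ValueError("gene set has no genes in the data")

    # Pathway-level statistic: mean of the gene-level statistics in the set.
    observed = values[in_set].mean()

    # Null distribution (assumption): mean statistic of k randomly drawn genes.
    null = np.array([
        rng.choice(values, size=k, replace=False).mean()
        for _ in range(n_perm)
    ])

    # Two-sided empirical p-value with the usual +1 correction.
    p_value = (1 + np.sum(np.abs(null) >= abs(observed))) / (n_perm + 1)
    return observed, p_value
```

Swapping the mean for a median, a sum, or a rank-based statistic changes only the two lines that compute `observed` and `null`.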
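Gene Set Enrichment Analysis (GSEA) evaluates how the genes of a predefined gene set are distributed across a list of genes ranked by their association with a phenotype, and uses that distribution to judge the set's contribution to the phenotype. It takes two inputs: a collection of gene sets with known function (GO annotations, MSigDB annotations, or any other gene set definitions in a compatible format) and an expression matrix. The software ranks the genes from high to low by their association with the phenotype (which can be thought of as the change in expression), then tests whether the genes under each annotation in the collection are concentrated at the top or the bottom of the ranked list. This indicates whether coordinated changes of the genes in that set are associated with the phenotypic change.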
The gene sets are defined based on prior biological knowledge, e.g., published information about biochemical pathways or coexpression in previous experiments. The goal of GSEA is to determine whether members of a gene set S tend to occur toward the top (or bottom) of the list L, in which case the gene set is correlated with the phenotypic class distinction.
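The idea of checking whether members of S cluster toward the top or bottom of L can be illustrated with the running-sum enrichment score described by Subramanian et al. (2005): walk down the ranked list, step up at each gene in S (weighted by the magnitude of its ranking score raised to a power p) and step down at each gene not in S, then report the largest deviation from zero. The sketch below is a simplified illustration, not the GSEA software itself. The function name `enrichment_score` and its arguments are hypothetical, and the permutation-based significance test and score normalization are omitted.

```python
import numpy as np

def enrichment_score(ranked_genes, scores, gene_set, p=1.0):
    """Running-sum enrichment score for one gene set (illustrative sketch).

    ranked_genes: gene names, already sorted from highest to lowest score.
    scores:       ranking scores in the same order as ranked_genes.
    gene_set:     set of gene names (the set S).
    p:            weighting exponent applied to |score| at each hit.
    """
    scores = np.asarray(scores, dtype=float)
    hits = np.array([g in gene_set for g in ranked_genes])
    n_hit = int(hits.sum())
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")

    # Hits step up in proportion to |score|^p; misses step down uniformly.
    weights = np.where(hits, np.abs(scores) ** p, 0.0)
    total = weights.sum()
    if total == 0:
        raise ValueError("all hit weights are zero")
    steps = np.where(hits, weights / total, -1.0 / n_miss)

    # The enrichment score is the running sum's maximum deviation from zero.
    running = np.cumsum(steps)
    return float(running[np.argmax(np.abs(running))])
```

A set concentrated at the top of the list yields a positive score and one concentrated at the bottom a negative score, matching the "top (or bottom)" wording above.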
Khatri, P., Sirota, M., & Butte, A. J. (2012). Ten years of pathway analysis: Current approaches and outstanding challenges. PLoS Computational Biology, 8(2).
Subramanian, A., Tamayo, P., Mootha, V. K., et al. (2005). Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles. Proceedings of the National Academy of Sciences, 102(43), 15545-15550.