Name Mode Size
.github 040000
R 040000
data-raw 040000
data 040000
inst 040000
man 040000
src 040000
tests 040000
vignettes 040000
.Rbuildignore 100644 0 kb
.gitattributes 100644 0 kb
.gitignore 100644 0 kb
DESCRIPTION 100644 1 kb
LICENCE 100644 1 kb
NAMESPACE 100644 1 kb
NEWS 100644 3 kb
README.md 100644 4 kb
fgsea.Rproj 100644 0 kb
test.R 100644 0 kb
README.md
[![R-CMD-check](https://github.com/ctlab/fgsea/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/ctlab/fgsea/actions/workflows/R-CMD-check.yaml)

# fgsea

`fgsea` is an R package for fast preranked gene set enrichment analysis (GSEA). It quickly and accurately calculates arbitrarily low GSEA P-values for a collection of gene sets. P-value estimation is based on an adaptive multi-level split Monte Carlo scheme. See [the preprint](https://www.biorxiv.org/content/10.1101/060012v3) for algorithmic details.

The full vignette can be found here: http://bioconductor.org/packages/devel/bioc/vignettes/fgsea/inst/doc/fgsea-tutorial.html

## Installation

`fgsea` is part of R/Bioconductor and is available on Linux, macOS and Windows. For installation instructions and more details, please refer to https://bioconductor.org/packages/release/bioc/html/fgsea.html

The latest version of `fgsea` can be installed from GitHub using the `devtools` package. Installing all the dependencies can take up to a few minutes:

```{r}
library(devtools)
install_github("ctlab/fgsea")
```

## Quick run

Loading libraries:

```{r}
library(data.table)
library(fgsea)
library(ggplot2)
```

Loading example pathways and gene-level statistics:

```{r}
data(examplePathways)
data(exampleRanks)
```

Running fgsea (should take about 10 seconds):

```{r}
fgseaRes <- fgsea(pathways = examplePathways,
                  stats    = exampleRanks,
                  minSize  = 15,
                  maxSize  = 500)
```

The head of the resulting table, sorted by p-value:

```
                                pathway  pval  padj log2err     ES    NES size
            5990979_Cell_Cycle,_Mitotic 1e-10 4e-09      NA 0.5595 2.7437  317
                     5990980_Cell_Cycle 1e-10 4e-09      NA 0.5388 2.6876  369
                5990981_DNA_Replication 1e-10 4e-09      NA 0.6440 2.6390   82
               5990987_Synthesis_of_DNA 1e-10 4e-09      NA 0.6479 2.6290   78
                        5990988_S_Phase 1e-10 4e-09      NA 0.6013 2.5069   98
                5990990_G1_S_Transition 1e-10 4e-09      NA 0.6233 2.5625   84
         5990991_Mitotic_G1-G1_S_phases 1e-10 4e-09      NA 0.6285 2.6256  101
           5991209_RHO_GTPase_Effectors 1e-10 4e-09      NA 0.5249 2.3712  157
                        5991454_M_Phase 1e-10 4e-09      NA 0.5576 2.5491  173
 5991502_Mitotic_Metaphase_and_Anaphase 1e-10 4e-09      NA 0.6053 2.6331  123
```

As the table shows, `fgsea` uses a default lower bound of `eps=1e-10` when estimating P-values, so smaller P-values are reported as `1e-10`. If you need more accurate estimates of very low P-values, set the `eps` argument of the `fgsea` function to zero:

```{r}
fgseaRes <- fgsea(pathways = examplePathways,
                  stats    = exampleRanks,
                  eps      = 0.0,
                  minSize  = 15,
                  maxSize  = 500)
head(fgseaRes[order(pval), ])
```

```
                                         pathway     pval     padj log2err     ES    NES size
                     5990979_Cell_Cycle,_Mitotic 4.44e-26 1.70e-23  1.3267 0.5595 2.7414  317
                              5990980_Cell_Cycle 5.80e-26 1.70e-23  1.3189 0.5388 2.6747  369
                    5991851_Mitotic_Prometaphase 8.50e-19 1.66e-16  1.1239 0.7253 2.9674   82
 5992217_Resolution_of_Sister_Chromatid_Cohesion 1.50e-17 2.19e-15  1.0769 0.7348 2.9482   74
                                 5991454_M_Phase 1.10e-14 1.29e-12  0.9865 0.5576 2.5436  173
         5991599_Separation_of_Sister_Chromatids 3.01e-14 2.94e-12  0.9653 0.6165 2.6630  116
```

You can make an enrichment plot for a single pathway:

```{r}
plotEnrichment(examplePathways[["5991130_Programmed_Cell_Death"]],
               exampleRanks) + labs(title="Programmed Cell Death")
```

![enrichment.png](https://www.dropbox.com/s/zusn9pju7f608sn/enrichment.png?raw=1)

Or make a table plot for a set of selected pathways:

```{r}
topPathwaysUp <- fgseaRes[ES > 0][head(order(pval), n=10), pathway]
topPathwaysDown <- fgseaRes[ES < 0][head(order(pval), n=10), pathway]
topPathways <- c(topPathwaysUp, rev(topPathwaysDown))
plotGseaTable(examplePathways[topPathways], exampleRanks, fgseaRes,
              gseaParam=0.5)
```

<img src="https://alserglab.wustl.edu/files/fgsea/readme_enrichmentPlot.png">
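
## What the ES column measures (illustrative sketch)

The `ES` column in the result tables is the enrichment score of a pathway in the ranked gene list. The base-R sketch below shows the standard preranked GSEA running-sum statistic on a toy ranking, which may help when reading the tables and plots. It is an illustration only and not `fgsea`'s implementation: the function name `enrichment_score`, its `gsea_param` argument and the toy data are assumptions made up for this example.

```r
# Illustrative only: classic preranked GSEA enrichment score for one gene set.
# `enrichment_score` and `gsea_param` are hypothetical names, not fgsea's API.
enrichment_score <- function(stats, gene_set, gsea_param = 1) {
  stats  <- sort(stats, decreasing = TRUE)      # rank genes by their statistic
  in_set <- names(stats) %in% gene_set          # set membership along the ranking
  stopifnot(any(in_set), !all(in_set))          # need both hits and misses

  weights   <- abs(stats)^gsea_param            # hit weights
  hit_step  <- ifelse(in_set, weights / sum(weights[in_set]), 0)
  miss_step <- ifelse(in_set, 0, 1 / sum(!in_set))

  running <- cumsum(hit_step - miss_step)       # running sum along the ranking
  unname(running[which.max(abs(running))])      # signed maximum deviation from zero
}

# Toy gene-level statistics (made-up values)
toy_stats <- c(g1 = 3, g2 = 2, g3 = 1, g4 = -1, g5 = -2, g6 = -3)

es_top    <- enrichment_score(toy_stats, c("g1", "g2"))  # set at the top of the ranking
es_bottom <- enrichment_score(toy_stats, c("g5", "g6"))  # set at the bottom of the ranking
```

In this toy ranking, a set made of the two top-ranked genes scores 1 and a set made of the two bottom-ranked genes scores -1, which matches the sign convention used by the `ES > 0` and `ES < 0` filters above.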