Name Mode Size
.github 040000
R 040000
inst 040000
man 040000
src 040000
tests 040000
vignettes 040000
.Rbuildignore 100644 0 kb
.gitignore 100644 0 kb
BioQC.Rproj 100644 0 kb
DESCRIPTION 100644 2 kb
LICENSE 100644 34 kb
Makefile 100644 2 kb
NAMESPACE 100644 2 kb
NEWS 100644 4 kb
README.md 100644 5 kb
_pkgdown.yml 100644
codecov.yml 100644 0 kb
README.md
[![R-CMD-check-bioc](https://github.com/Accio/BioQC/workflows/R-CMD-check-bioc/badge.svg)](https://github.com/Accio/BioQC/actions) [![codecov](https://codecov.io/gh/Accio/BioQC/branch/master/graph/badge.svg)](https://codecov.io/gh/Accio/BioQC) [![install with bioconda](https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg?style=flat)](http://bioconda.github.io/recipes/bioconductor-bioqc/README.html) [![license](https://img.shields.io/badge/license-GPLv3-green.svg)](https://github.com/accio/BioQC/blob/master/LICENSE.md) [![docs](https://img.shields.io/badge/docs-pkgdown-blue.svg)](https://accio.github.io/BioQC) BioQC is a is a R/[Bioconductor](https://bioconductor.org/packages/release/bioc/html/BioQC.html) package to detect tissue heterogeneity in gene expression data. Tissue heterogeneity is a consequence of unintended profiling of cells of other origins than the tissue of interest and can have both technical (*e.g.* imperfect disection) or biological (*e.g.* immune infiltration) reasons. We [demonstrated](https://www.biorxiv.org/content/10.1101/2020.12.02.407809v1) that tissue heterogeneity is prevalent in 5-15% of all gene expression studies. Ignoring tissue heterogeneity reduces statistical power of data analysis and can, in the worst case, invalidate the conclusions of a study. Therefore, we propose applying BioQC as a routine step in every gene-expression analysis pipeline. The BioQC method is described in > Zhang, Jitao David, Klas Hatje, Gregor Sturm, Clemens Broger, Martin Ebeling, Martine Burtin, Fabiola Terzi, Silvia Ines Pomposiello, and Laura Badi. > “Detect Tissue Heterogeneity in Gene Expression Data with BioQC.” BMC Genomics 18 (2017): 277. [doi:10.1186/s12864-017-3661-2](https://doi.org/10.1186/s12864-017-3661-2). ## Basic Usage BioQC implements a computationally efficient Wilcoxon-Mann-Whitney test for testing for enrichment of tissue signatures. A database of 150 tissue signatures derived from large-scale transcriptomics studies is shipped with the BioQC package. To apply BioQC to a `genes x samples` gene expression matrix, run: ```R library(BioQC) # load the tissue signatures gmtFile <- system.file("extdata/exp.tissuemark.affy.roche.symbols.gmt", package="BioQC") gmt <- readGmt(gmtFile) # perform BioQC enrichment test on a gene expression matrix bioqc_res = wmwTest(expr_mat, gmt) bioqc_scores = absLog10p(bioqc_res) ``` The following figure shows the BioQC scores from the [kidney example](https://accio.github.io/BioQC/bioqc.html) visualized as heatmap. We note that in samples 23-25 *adipose* and *pancreas* signatures have been detected, hinting at a containation with those tissues. For this dataset, we could validate the contamination with qPCR. ![example heatmap](man/figures/kidney_heatmap.png) For a more detailed example explaining how to use other data structures or custom signatures see * [Introduction to BioQC](https://accio.github.io/BioQC/articles/bioqc-introduction.html). * [Applying BioQC to a real-world kidney dataset](https://accio.github.io/BioQC/articles/BioQC.html). For advanced usages, check out: * [Single sample gene set enrichment analysis with BioQC](https://accio.github.io/BioQC/articles/bioqc-signedGenesets.html) For benchmarks and details about the algorithm, see: * [BioQC-benchmark: Testing Efficiency, Sensitivity and Specificity of BioQC on simulated and real-world data](https://accio.github.io/BioQC/articles/bioqc-simulation.html) * [Comparing the Wilcoxon-Mann-Whitney to alternative statistical tests](https://accio.github.io/BioQC/articles/bioqc-wmw-test-performance.html) * [The BioQC algorithm: speeding up the WMW-test](https://accio.github.io/BioQC/articles/bioqc-efficiency.html) ## Installation ### Bioconductor BioQC is available [from Bioconductor](https://www.bioconductor.org/packages/release/bioc/html/BioQC.html). You can install it using ```R if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager") BiocManager::install("BioQC") ``` ### Bioconda Alternatively, you can use the [conda](https://docs.conda.io/en/latest/miniconda.html) package manager. 1. Make sure you [set-up the Bioconda channel correctly](https://bioconda.github.io/user/install.html#set-up-channels). The order of the channels is important! 2. (Optional) Create and activate an environment for BioQC ```bash conda create -n bioqc conda activate bioqc ``` 3. Install the `bioconductor-bioqc` package in your current environment ```bash conda install bioconductor-bioqc ``` ### From Github The easiest way to install the development version from GitHub is using the `remotes` package: ```R install.packages("remotes") remotes::install_github("accio/BioQC") ``` ## Contact If you have questions regarding BioQC or want to report a bug, please use the [issue tracker](https://github.com/accio/BioQC/issues). Alternatively you can reach out to Jitao David Zhang via [e-mail](jitao_david.zhang@roche.com).