Name Mode Size
.github 040000
R 040000
inst 040000
man 040000
src 040000
tests 040000
vignettes 040000
.Rbuildignore 100644 0 kb
.gitignore 100644 0 kb
BioQC.Rproj 100644 0 kb
DESCRIPTION 100644 2 kb
LICENSE 100644 34 kb
Makefile 100644 2 kb
NAMESPACE 100644 2 kb
NEWS 100644 4 kb 100644 5 kb
_pkgdown.yml 100644
codecov.yml 100644 0 kb
[![R-CMD-check-bioc](]( [![codecov](]( [![install with bioconda](]( [![license](]( [![docs](]( BioQC is a is a R/[Bioconductor]( package to detect tissue heterogeneity in gene expression data. Tissue heterogeneity is a consequence of unintended profiling of cells of other origins than the tissue of interest and can have both technical (*e.g.* imperfect disection) or biological (*e.g.* immune infiltration) reasons. We [demonstrated]( that tissue heterogeneity is prevalent in 5-15% of all gene expression studies. Ignoring tissue heterogeneity reduces statistical power of data analysis and can, in the worst case, invalidate the conclusions of a study. Therefore, we propose applying BioQC as a routine step in every gene-expression analysis pipeline. The BioQC method is described in > Zhang, Jitao David, Klas Hatje, Gregor Sturm, Clemens Broger, Martin Ebeling, Martine Burtin, Fabiola Terzi, Silvia Ines Pomposiello, and Laura Badi. > “Detect Tissue Heterogeneity in Gene Expression Data with BioQC.” BMC Genomics 18 (2017): 277. [doi:10.1186/s12864-017-3661-2]( ## Basic Usage BioQC implements a computationally efficient Wilcoxon-Mann-Whitney test for testing for enrichment of tissue signatures. A database of 150 tissue signatures derived from large-scale transcriptomics studies is shipped with the BioQC package. To apply BioQC to a `genes x samples` gene expression matrix, run: ```R library(BioQC) # load the tissue signatures gmtFile <- system.file("extdata/exp.tissuemark.affy.roche.symbols.gmt", package="BioQC") gmt <- readGmt(gmtFile) # perform BioQC enrichment test on a gene expression matrix bioqc_res = wmwTest(expr_mat, gmt) bioqc_scores = absLog10p(bioqc_res) ``` The following figure shows the BioQC scores from the [kidney example]( visualized as heatmap. We note that in samples 23-25 *adipose* and *pancreas* signatures have been detected, hinting at a containation with those tissues. For this dataset, we could validate the contamination with qPCR. ![example heatmap](man/figures/kidney_heatmap.png) For a more detailed example explaining how to use other data structures or custom signatures see * [Introduction to BioQC]( * [Applying BioQC to a real-world kidney dataset]( For advanced usages, check out: * [Single sample gene set enrichment analysis with BioQC]( For benchmarks and details about the algorithm, see: * [BioQC-benchmark: Testing Efficiency, Sensitivity and Specificity of BioQC on simulated and real-world data]( * [Comparing the Wilcoxon-Mann-Whitney to alternative statistical tests]( * [The BioQC algorithm: speeding up the WMW-test]( ## Installation ### Bioconductor BioQC is available [from Bioconductor]( You can install it using ```R if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager") BiocManager::install("BioQC") ``` ### Bioconda Alternatively, you can use the [conda]( package manager. 1. Make sure you [set-up the Bioconda channel correctly]( The order of the channels is important! 2. (Optional) Create and activate an environment for BioQC ```bash conda create -n bioqc conda activate bioqc ``` 3. Install the `bioconductor-bioqc` package in your current environment ```bash conda install bioconductor-bioqc ``` ### From Github The easiest way to install the development version from GitHub is using the `remotes` package: ```R install.packages("remotes") remotes::install_github("accio/BioQC") ``` ## Contact If you have questions regarding BioQC or want to report a bug, please use the [issue tracker]( Alternatively you can reach out to Jitao David Zhang via [e-mail](