# qckitfastq: A comprehensive quality control R package for Next Generation Sequencing FASTQ data

# Overview

This R package contains tools for comprehensive quality control of FASTQ format data. It aims to replicate existing FASTQ quality control tools and to improve on metrics for which the data is otherwise truncated before analysis. FASTQ data is processed efficiently through C++ functions exposed to R with `Rcpp`. `qckitfastq` provides the following metrics:

1. data dimension
2. per base sequence content
3. per base quality score statistics
4. per read GC content
5. per read mean quality score
6. overrepresented sequence
7. per base kmer count
8. overrepresented kmer

Each metric produces both a results table and a visualization of the results.

# Getting started

## Prerequisites

`qckitfastq` depends on both CRAN packages and Bioconductor packages. The commands below install all prerequisites from R:

```r
# CRAN dependencies (grDevices, graphics, stats and utils ship with R
# and do not need to be installed)
install.packages(c('magrittr','ggplot2','dplyr','testthat','data.table','reshape2','Rcpp','kableExtra','rlang','knitr','rmarkdown'))

# Bioconductor dependencies, installed through BiocManager
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install(c("RSeqAn","seqTools","zlibbioc"))
```

## Installing

### From Bioconductor

The release version of `qckitfastq` is on Bioconductor. To install it from there, follow the instructions on the Bioconductor package page.

### From Github repo

This repository contains the development version. You will need `devtools` to install it.

```r
# Install the development version and build its vignettes
devtools::install_github("compbiocore/qckitfastq",build_vignettes=TRUE)
library(qckitfastq)
```

## Usage

The simplest and intended way to use `qckitfastq` is to call `run_all`, a single command that writes a report of all included metrics to a user-provided directory, using default parameters and default filenames. These defaults cannot be changed when calling `run_all`. An example using `tempdir()` and an example `fq.gz` file is given below:

```r
library(qckitfastq)
# Example FASTQ file shipped with the package
infile <- system.file("extdata","10^5_reads_test.fq.gz",package="qckitfastq")
# Directory that will receive the reports
testfolder <- tempdir()
run_all(infile,testfolder)
```

Each metric can also be run separately, for closer examination, for parameter tuning, or to save reports under a different filename. In those cases, we recommend starting with the `qckitfastq` vignette. The vignette can also be viewed in RStudio with the following commands:

```r
library(qckitfastq)
browseVignettes("qckitfastq")
```

## Release history

See `NEWS` for changes.

## Authors

* August Guang, creator and maintainer.
* Wenyue Xing, creator.