% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/sample_filtering.R
\name{metric_sample_filter}
\alias{metric_sample_filter}
\title{Metric-based Sample Filtering: Function to filter single-cell RNA-Seq 
libraries.}
\usage{
metric_sample_filter(expr, nreads = colSums(expr), ralign = NULL,
  gene_filter = NULL, pos_controls = NULL, scale. = FALSE, glen = NULL,
  AUC_range = c(0, 15), zcut = 1, mixture = TRUE, dip_thresh = 0.05,
  hard_nreads = 25000, hard_ralign = 15, hard_breadth = 0.2,
  hard_auc = 10, suff_nreads = NULL, suff_ralign = NULL,
  suff_breadth = NULL, suff_auc = NULL, plot = FALSE, hist_breaks = 10,
  ...)
}
\arguments{
\item{expr}{matrix. The data matrix (genes in rows, cells in columns).}

\item{nreads}{A numeric vector representing the number of reads in each
library. Defaults to `colSums` of `expr`.}

\item{ralign}{A numeric vector representing the proportion of reads aligned
to the reference genome in each library. If NULL, filtered_ralign will be
returned as NA.}

\item{gene_filter}{A logical vector indexing genes that will be used to 
compute library transcriptome breadth. If NULL, filtered_breadth will be 
returned as NA.}

\item{pos_controls}{A logical, numeric, or character vector indicating 
positive control genes that will be used to compute false-negative rate (FNR)
characteristics. If NULL, filtered_fnr will be returned as NA.}

\item{scale.}{logical. Should expression be scaled by total expression for
FNR computation? Default FALSE.}

\item{glen}{Gene lengths for gene-length normalization (normalized data used 
in FNR computation).}

\item{AUC_range}{A numeric vector of two values giving the range over which
the FNR AUC will be computed (log(expr_units)). Default c(0, 15).}

\item{zcut}{A numeric value determining threshold Z-score for sd, mad, and 
mixture sub-criteria. Default 1. If NULL, only hard threshold sub-criteria 
will be applied.}

\item{mixture}{A logical value determining whether the mixture modeling 
sub-criterion will be applied per primary criterion (metric). If TRUE, a
dip test will be applied to each metric. If a metric is multimodal, it is
fit to a two-component normal mixture model. Samples deviating zcut sd's
from the optimal mean (in the inferior direction) fail this 
sub-criterion.}

\item{dip_thresh}{A numeric value determining dip test p-value threshold. 
Default 0.05.}

\item{hard_nreads}{numeric. Hard (lower bound on) nreads threshold. Default 
25000.}

\item{hard_ralign}{numeric. Hard (lower bound on) ralign threshold. Default 
15.}

\item{hard_breadth}{numeric. Hard (lower bound on) breadth threshold. Default
0.2.}

\item{hard_auc}{numeric. Hard (upper bound on) FNR AUC threshold. Default 10.}

\item{suff_nreads}{numeric. If not NULL, serves as an overriding upper bound 
on the nreads threshold.}

\item{suff_ralign}{numeric. If not NULL, serves as an overriding upper bound 
on the ralign threshold.}

\item{suff_breadth}{numeric. If not NULL, serves as an overriding upper bound
on the breadth threshold.}

\item{suff_auc}{numeric. If not NULL, serves as an overriding lower bound on 
the FNR AUC threshold.}

\item{plot}{logical. Should a plot be produced?}

\item{hist_breaks}{hist() breaks argument. Ignored if `plot=FALSE`.}

\item{...}{Arguments to be passed to methods.}
}
\value{
A list with the following elements: \itemize{ \item{filtered_nreads}{
 Logical. Sample has too few reads.} \item{filtered_ralign}{ Logical. Sample
 has too few reads aligned.} \item{filtered_breadth}{ Logical. Sample has 
 too few genes detected (low breadth).} \item{filtered_fnr}{ Logical. Sample
 has a high FNR AUC.} }
}
\description{
This function returns a sample-filtering report for each cell in the input 
expression matrix, describing which filtering criteria are satisfied.
}
\details{
For each primary criterion (metric), a sample is evaluated based on
 4 sub-criteria: 1) a hard (encoded) threshold; 2) adaptive thresholding via
 sd's from the mean; 3) adaptive thresholding via mad's from the median; 4)
 adaptive thresholding via sd's from the mean (after mixture modeling). A
 sample must pass all sub-criteria to pass the primary criterion.
}
\examples{
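# Simulate a 100-gene by 10-cell count matrix.
mat <- matrix(rpois(1000, lambda = 5), ncol = 10)
colnames(mat) <- paste0("X", seq_len(ncol(mat)))
# Per-cell QC metrics: total counts and number of detected genes.
qc <- as.matrix(cbind(colSums(mat), colSums(mat > 0)))
rownames(qc) <- colnames(mat)
colnames(qc) <- c("NCOUNTS", "NGENES")
# Filter on read counts only; with hard_nreads = 0 the hard lower bound
# excludes no cells.
mfilt <- metric_sample_filter(expr = mat, nreads = qc[, "NCOUNTS"],
   plot = TRUE, hard_nreads = 0)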
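
# A minimal sketch, not the package's internal code, of how the hard, sd, and
# mad sub-criteria described in Details might combine for a single metric.
# All *_demo names are hypothetical, and the function itself may transform
# the metric (or apply suff_* overrides) before thresholding.
nr_demo <- colSums(matrix(rpois(1000, lambda = 5), ncol = 10))
zcut_demo <- 1   # assumed Z-score cutoff, mirroring the zcut default
hard_demo <- 0   # assumed hard lower bound, mirroring hard_nreads above
fail_hard_demo <- nr_demo < hard_demo
fail_sd_demo <- nr_demo < mean(nr_demo) - zcut_demo * sd(nr_demo)
fail_mad_demo <- nr_demo < median(nr_demo) - zcut_demo * mad(nr_demo)
# A sample is flagged if it fails any one of the sub-criteria.
filtered_demo <- fail_hard_demo | fail_sd_demo | fail_mad_demo
table(filtered_demo)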

}