% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/data.R
\docType{data}
\name{control_genes}
\alias{control_genes}
\alias{cortical_markers}
\alias{housekeeping}
\alias{housekeeping_revised}
\alias{cellcycle_genes}
\title{Data: Positive and Negative Control Genes}
\description{
Sets of "positive" and "negative" control genes, useful as arguments to
\code{\link{scone}}.
}
\details{
These gene sets can be used as negative or positive controls, either for RUV
factor normalization or for evaluation and ranking of the normalization
workflows.

Each gene set is stored as a \code{data.frame}, with the
  first column containing the gene symbols and an (optional) second column
  containing additional information (such as cortical layer or cell cycle
  phase).
  

Note that the gene symbols follow the mouse convention (i.e., first letter
  capitalized) or the human convention (i.e., all upper-case), based on the
  original publication. One can use the \code{\link[base]{toupper}}, 
  \code{\link[base]{tolower}}, and \code{\link[tools]{toTitleCase}} 
  functions to alter symbol conventions.
  

Mouse gene symbols in \code{cortical_markers} are transcribed from
  Figure 3 of Molyneaux et al. (2007): "laminar-specific expression of 66
  genes within the neocortex."
  

Human gene symbols in \code{housekeeping} are derived from the list
  of "housekeeping" genes from the cDNA microarray analysis of Eisenberg
  and Levanon (2003): "[HK genes] belong to the class of genes that are
  EXPRESSED in all tissues." "... from 47 different human tissues and cell
  lines."
  

Human gene symbols in \code{housekeeping_revised} are from Eisenberg
  and Levanon (2013): "This list provided ... is based on analysis of
  next-generation sequencing (RNA-seq) data. At least one variant of these
  genes is expressed in all tissues uniformly... The RefSeq transcript
  according to which we deemed the gene 'housekeeping' is given."
  Housekeeping exons satisfy "(i) expression observed in all tissues; (ii)
  low variance over tissues: standard-deviation [log2(RPKM)]<1; and (iii) no
  exceptional expression in any single tissue; that is, no log-expression
  value differed from the averaged log2(RPKM) by two (fourfold) or more."
  "We define a housekeeping gene as a gene for which at least one RefSeq
  transcript has more than half of its exons meeting the previous criteria
  (thus being housekeeping exons)."
  

Human gene symbols in \code{cellcycle_genes} are from Macosko et al.
  (2015) and represent a set of genes marking the G1/S, S, G2/M, M, and M/G1
  phases of the cell cycle.
}
\examples{
data(housekeeping)
data(housekeeping_revised)
data(cellcycle_genes)
data(cortical_markers)
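
# Sketch (not part of scone): switching between symbol conventions.
# Case conversion only changes the spelling of a symbol; it does not
# guarantee that the converted symbol names a valid ortholog.

# Mouse-style to human-style: upper-case the first (symbol) column.
cortical_upper <- toupper(as.character(cortical_markers[, 1]))
head(cortical_upper)

# Human-style to mouse-style with a hypothetical helper, to_mouse_case,
# which upper-cases the first letter and lower-cases the rest.
to_mouse_case <- function(x) {
  x <- as.character(x)
  paste0(toupper(substring(x, 1, 1)), tolower(substring(x, 2)))
}
head(to_mouse_case(housekeeping[, 1]))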
}
\references{
Molyneaux, B.J., Arlotta, P., Menezes, J.R. and Macklis, J.D.
  Neuronal subtype specification in the cerebral cortex. Nature Reviews 
  Neuroscience, 2007, 8(6):427-437.

Eisenberg E, Levanon EY. Human housekeeping genes are compact. 
  Trends in Genetics, 2003, 19(7):362-5.

Eisenberg E, Levanon EY. Human housekeeping genes, revisited. 
  Trends in Genetics, 2013, 29(10):569-74.

Macosko, E.Z., et al. Highly parallel genome-wide expression
  profiling of individual cells using nanoliter droplets. Cell, 2015,
  161(5):1202-1214.
}