% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/AllGenerics.R, R/deSet-methods.R
\title{Non-Parametric Jackstraw for Principal Component Analysis (PCA)}
apply_jackstraw(object, r1 = NULL, r = NULL, s = NULL, B = NULL,
  covariate = NULL, verbose = TRUE, seed = NULL)

\S4method{apply_jackstraw}{deSet}(object, r1 = NULL, r = NULL, s = NULL,
  B = NULL, covariate = NULL, verbose = TRUE, seed = NULL)
\item{object}{\code{S4 object}: \code{\linkS4class{deSet}}}

\item{r1}{a numeric vector of principal components of interest. Choose a subset of r significant PCs to be used.}

\item{r}{a number (a positive integer) of significant principal components.}

\item{s}{a number (a positive integer) of synthetic null variables. Out of m variables, s variables are independently permuted.}

\item{B}{a number (a positive integer) of resampling iterations. There will be a total of s*B null statistics.}

\item{covariate}{a data matrix of covariates with corresponding n observations.}

\item{verbose}{a logical indicator as to whether to print the progress.}

\item{seed}{a seed for the random number generator.}
\code{apply_jackstraw} returns a \code{list} containing the following:
\item{\code{p.value}: the m p-values of association tests between variables
and their principal components}
\item{\code{obs.stat}: the observed F-test statistics}
\item{\code{null.stat}: the s*B null F-test statistics}
Estimates statistical significance of association between variables and
their principal components (PCs).
This function computes m p-values of linear association between m variables
and their PCs. Its resampling strategy accounts for the over-fitting
characteristics due to direct computation of PCs from the observed data
and protects against an anti-conservative bias.

Provide a \code{\linkS4class{deSet}} object,
with m variables as rows and n observations as columns. Given that there are
r significant PCs, this function tests for linear association between the m
variables and their r PCs.

You can specify the subset of significant PCs
that you are interested in with r1. If r1 is given, then this function computes
statistical significance of association between the m variables and the PCs in
r1, while adjusting for the other significant PCs (i.e., those not of interest).
For example, if you want to identify variables associated with the 1st and 2nd
PCs when your data contains three significant PCs, set r=3 and r1=c(1,2).

Please take a careful look at your data and use appropriate graphical and
statistical criteria to determine the number of significant PCs, r. The number
of significant PCs depends on the data structure and the context. If r is not
specified, it is estimated by a permutation test
(Buja and Eyuboglu, 1992) using the function \code{\link{permutationPA}}.

If s is not supplied, s is set to about 10\% of the m variables. If B is not
supplied, B is set to m*10/s.
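# ns() in the full model below comes from the splines package
library(splines)
# load the kidney data set used in this example
data(kidney)
age <- kidney$age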
sex <- kidney$sex
kidexpr <- kidney$kidexpr
cov <- data.frame(sex = sex, age = age)
# create models
null_model <- ~sex
full_model <- ~sex + ns(age, df = 4)
# create deSet object from data
de_obj <- build_models(data = kidexpr, cov = cov, null.model = null_model,
                      full.model = full_model)
## apply the jackstraw
out <- apply_jackstraw(de_obj, r1 = 1, r = 1)
## Use optional arguments
## For example, set s and B to balance the speed of the algorithm against
## the accuracy of the p-values
## out <- apply_jackstraw(de_obj, r1 = 1, r = 1, s = 10, B = 1000, seed = 5678)
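
## Illustrative sketch only, NOT the package implementation: a base-R version
## of the resampling idea described above for a single PC (r = 1, r1 = 1).
## The toy data and the helpers first_pc() and f_stat() are hypothetical
## names introduced here; the rounding used for the default s and B is an
## assumption based on "about one tenth of m" and "m*10/s".
set.seed(1)
m <- 200   # number of variables (rows)
n <- 20    # number of observations (columns)
latent <- rnorm(n)
toy <- matrix(rnorm(m * n), nrow = m, ncol = n)
# the first 20 variables load on one latent factor, the rest are pure noise
toy[1:20, ] <- toy[1:20, ] + 2 * outer(rep(1, 20), latent)
toy <- toy - rowMeans(toy)   # center each variable

# first right singular vector, i.e. the first PC across observations
first_pc <- function(y) svd(y, nu = 0, nv = 1)$v[, 1]

# F-statistic of each row of y regressed on one PC (intercept included)
f_stat <- function(y, pc) {
  pc <- pc - mean(pc)
  yc <- y - rowMeans(y)
  slope <- rowSums(sweep(yc, 2, pc, "*")) / sum(pc^2)
  rss1 <- rowSums((yc - outer(slope, pc))^2)   # residuals of the full fit
  rss0 <- rowSums(yc^2)                        # residuals of the null fit
  (rss0 - rss1) / (rss1 / (ncol(y) - 2))
}

obs_stat <- f_stat(toy, first_pc(toy))

s <- round(m / 10)       # assumed default: about one tenth of m
B <- round(m * 10 / s)   # assumed default: m*10/s
null_stat <- numeric(0)
for (b in seq_len(B)) {
  idx <- sample(m, s)    # variables that become synthetic nulls
  jack <- toy
  # permute each chosen variable independently across observations
  jack[idx, ] <- t(apply(toy[idx, , drop = FALSE], 1, sample))
  # recompute the PC on the perturbed data, keep statistics of the nulls only
  null_stat <- c(null_stat,
                 f_stat(jack[idx, , drop = FALSE], first_pc(jack)))
}

# empirical p-value: share of null statistics at least as large as observed
p_value <- vapply(obs_stat, function(f) mean(null_stat >= f), numeric(1))
## Recomputing the PC after each permutation is what lets the null
## statistics carry the same over-fitting as the observed ones.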

Neo Christopher Chung \email{nc@princeton.edu}
Chung and Storey (2013) Statistical Significance of
Variables Driving Systematic Variation in
High-Dimensional Data. arXiv:1308.6013 [stat.ME]

More information available at \url{http://ncc.name/}