\title{Advanced Rank Sum Analysis}
\description{The function performs the Rank Sum method to
identify differentially expressed genes. It is possible to do either a
one-class or two-class analysis. It is also possible to combine data from
different studies (e.g. datasets generated by different laboratories).
This function has been kept only to guarantee
backward compatibility; in fact, the same results can be obtained by
\usage{RSadvance(data, cl, origin, num.perm = 100, logged = TRUE, na.rm = TRUE, 
gene.names = NULL, plot = FALSE, rand = NULL, huge = FALSE, fast = TRUE,
tail.time = 0.05)}
\item{data}{the data set that should be analyzed. Every row of this dataset must
correspond to a gene}
\item{cl}{a vector containing the class labels of the samples.
In the two-class unpaired case, the label of a sample
is either 0 (e.g., control group) or 1 (e.g., case group).
For one-class data, the label for each sample should be 1}
\item{origin}{a vector containing the origin labels of the samples. The label is
the same for samples within one lab and different for samples from different
labs}
\item{num.perm}{in this version of the package, this parameter is no longer
used, but it is kept for backward compatibility}
\item{logged}{if "TRUE", the data have already been log-transformed. Otherwise
it should be set to "FALSE"}
\item{na.rm}{if "FALSE", the NA values will not be used in
computing the ranks. If "TRUE" (default), the missing values will be replaced by
the gene-wise median of the non-missing values.
Genes with more than 50\% missing values are still
excluded from the analysis}
\item{gene.names}{if "NULL", no gene name will be attached to the outputs,
otherwise it contains the vector of gene names}
\item{plot}{if "TRUE", plot the estimated pfp vs the rank of each gene}
\item{rand}{if specified, the random number generator will
be put in a reproducible state}
\item{huge}{if "TRUE" not all the outputs are evaluated
in order to save space}
\item{fast}{if "FALSE", the exact p-values for the Rank Sum are evaluated for
any size of the dataset.
Otherwise (default), if the dataset is too large, only the p-values
that can be computed in "tail.time" minutes (starting from the tail) are
evaluated with the exact method. The others are estimated with the Gaussian
approximation. If calculateProduct="TRUE", this parameter is ignored}
\item{tail.time}{the time (default 0.05 min) dedicated to evaluate the exact
p-values for the Rank Sum.
If calculateProduct="TRUE" this parameter is ignored}
The result of identifying differentially expressed genes between two classes.
The identification consists of two parts: the identification of genes that are
up-regulated and of genes that are down-regulated in class 2 compared to class 1.

\item{pfp}{Estimated percentage of false positive predictions (pfp) up to the
position of each gene, for each of the two identifications}

\item{pval}{Estimated p-values for each gene being up- and down-regulated}

\item{RSs}{Rank sum (average rank) of each gene}

\item{RSrank}{Rank of the rank sum of each gene in ascending order}

\item{Orirank}{Ranks in each possible pairing. In this version of the function
this is not used to compute the rank sum;
it is kept only for backward compatibility}

\item{AveFC}{Fold change of the average expression under class 1 over that
under class 2; if there are multiple origins, it is averaged across all origins.
Log-fold change if the data are log-scaled, original fold change
if the data are unlogged}

\item{allrank1}{Fold change of class 1/class 2 under each origin.
Log-fold change if the data are log-scaled}

\item{allrank2}{Fold change of class 2/class 1 under each origin.
Log-fold change if the data are log-scaled}

\item{nrep}{Total number of replicates considering all the different origins}

\item{groups}{Vector of labels (as cl).}
Breitling, R., Armengaud, P., Amtmann, A., and Herzyk, P. (2004) Rank Products: A
simple, yet powerful, new method to detect differentially regulated genes in
replicated microarray experiments, FEBS Letters, 573:83-92
Francesco Del Carratore,
\cr Andris Jankevics, \email{andris.jankevics@gmail.com}
The percentage of false predictions (pfp) is, in theory,
equivalent to the false discovery rate (FDR),
and it can be larger than 1.

The function looks for up- and down-regulated genes in two separate steps, thus
two pfps are computed and used to identify the genes that belong to each group.

The function is able to deal with single- or multiple-origin studies.
It is similar to the function RP.advance, except that a rank sum is computed
instead of a rank product. This method is more sensitive to individual rank
values, while the rank product is more robust to outliers (refer to the
RankProd vignette for details).
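
As a minimal illustration of this difference (a sketch in base R only, not the
code used by the package; the vector \code{ranks} and the object names are
hypothetical), the two summaries can be compared for a gene whose ranks are
small in all comparisons but one. The rank sum is taken here as the average
rank and the rank product as the geometric mean of the ranks:

\preformatted{
# hypothetical ranks of one gene in four comparisons (assumed values)
ranks <- c(1, 2, 3, 100)
# rank sum, expressed as the average rank
rank.sum <- mean(ranks)
# rank product, expressed as the geometric mean of the ranks
rank.product <- exp(mean(log(ranks)))
c(rank.sum = rank.sum, rank.product = rank.product)
}

In this toy case the single large rank shifts the average rank much more than
it shifts the geometric mean.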

\code{\link{topGene}} \code{\link{RP}} \code{\link{RPadvance}}
\code{\link{plotRP}} \code{\link{RP.advance}} \code{\link{RankProducts}}

#Suppose we want to check the consistency of the data
#sets generated in two different
#labs. For example, we would look for genes that were
#measured to be up-regulated in
#class 2 at lab 1, but down-regulated in class 2 at lab 2.
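#load the example data (arab, arab.cl, arab.origin) shipped with RankProd
data(arabidopsis)

#swap the class labels of the samples from lab 2 only
arab.cl2 <- arab.cl
arab.cl2[arab.cl == 0 & arab.origin == 2] <- 1
arab.cl2[arab.cl == 1 & arab.origin == 2] <- 0

arab.cl2
##[1] 0 0 0 1 1 1 1 1 0 0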

#look for genes differentially expressed
#between hypothetical class 1 and 2
arab.sub <- arab[1:500,] ##use a subset for fast computation
Rsum.adv.out <- RSadvance(arab.sub,arab.cl2,arab.origin,