- classifier and selectionMethod parameters of crossValidate set to "auto" by default. Different algorithms are chosen depending on whether the input data has a categorical or survival outcome.
- rfsrcTrainInterface corrected to extract the time and event columns in the correct order.
- DESCRIPTION file updated to have Authors@R instead of Author and Maintainer fields for compatibility with pkgdown.
- Description of prepareData input data filtering added to introduction vignette.
- Placeholders added for performance evaluation, multi-view methods and registration of new functions articles.
- Restored parallel processing in runTests, which was inadvertently commented out during recent testing.

Dario Strbenac authored on 09/11/2022 04:30:08
Showing 69 changed files

... ...
@@ -3,10 +3,19 @@ Type: Package
3 3
 Title: A framework for cross-validated classification problems, with
4 4
        applications to differential variability and differential
5 5
        distribution testing
6
-Version: 3.3.1
7
-Date: 2022-10-25
8
-Author: Dario Strbenac, Ellis Patrick, Sourish Iyengar, Harry Robertson, Andy Tran, John Ormerod, Graham Mann, Jean Yang
9
-Maintainer: Dario Strbenac <dario.strbenac@sydney.edu.au>
6
+Version: 3.3.2
7
+Date: 2022-11-09
8
+Authors@R: 
9
+    c(
10
+    person(given = "Dario", family = "Strbenac", email = "dario.strbenac@sydney.edu.au", role = c("aut", "cre")),
11
+    person(given = "Ellis", family = "Patrick", role = "aut"),
12
+    person(given = "Sourish", family = "Iyengar", role = "aut"),
13
+    person(given = "Harry", family = "Robertson", role = "aut"),
14
+    person(given = "Andy", family = "Tran", role = "aut"),
15
+    person(given = "John", family = "Ormerod", role = "aut"),
16
+    person(given = "Graham", family = "Mann", role = "aut"),
17
+    person(given = "Jean", family = "Yang", email = "jean.yang@sydney.edu.au", role = "aut")
18
+    )
10 19
 VignetteBuilder: knitr
11 20
 Encoding: UTF-8
12 21
 biocViews: Classification, Survival
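
As a minimal sketch of what the new Authors@R field expresses, the entry is ordinary R code built from utils::person objects, and the "cre" role takes over the job of the old Maintainer field. Only two of the listed authors are reproduced here, and the variable names authors and isMaintainer are assumptions made for illustration; they are not part of the package.

```r
# Illustrative only: 'authors' and 'isMaintainer' are hypothetical names.
authors <- c(
  person(given = "Dario", family = "Strbenac", email = "dario.strbenac@sydney.edu.au", role = c("aut", "cre")),
  person(given = "Ellis", family = "Patrick", role = "aut")
)

# The creator ("cre") role identifies the maintainer of the package.
isMaintainer <- vapply(seq_along(authors), function(index) "cre" %in% authors[index]$role, logical(1))

# Print the maintainer's name and email address.
print(format(authors[isMaintainer], include = c("given", "family", "email")))
```

Tools that read DESCRIPTION, such as pkgdown, can derive both the author list and the maintainer from this single field.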
... ...
@@ -27,8 +27,6 @@ export(distribution)
27 27
 export(edgesToHubNetworks)
28 28
 export(featureSetSummary)
29 29
 export(finalModel)
30
-export(generateCrossValParams)
31
-export(generateModellingParams)
32 30
 export(interactorDifferences)
33 31
 export(models)
34 32
 export(performance)
... ...
@@ -11,22 +11,26 @@
11 11
 #' same length as the number of samples in \code{measurements} or a character vector of length 1 containing the
12 12
 #' column name in \code{measurements} if it is a \code{\link{DataFrame}}. Or a \code{\link{Surv}} object or a character vector of
13 13
 #' length 2 or 3 specifying the time and event columns in \code{measurements} for survival outcome. If \code{measurements} is a
14
-#' \code{\link{MultiAssayExperiment}}, the column name(s) in \code{colData(measurements)} representing the outcome.
14
+#' \code{\link{MultiAssayExperiment}}, the column name(s) in \code{colData(measurements)} representing the outcome.  If column names
15
+#' of survival information, time must be in first column and event status in the second.
15 16
 #' @param outcomeTrain For the \code{train} function, either a factor vector of classes, a \code{\link{Surv}} object, or
16 17
 #' a character string, or vector of such strings, containing column name(s) of column(s)
17
-#' containing either classes or time and event information about survival.
18
+#' containing either classes or time and event information about survival. If column names
19
+#' of survival information, time must be in first column and event status in the second.
18 20
 #' @param ... Parameters passed into \code{\link{prepareData}} which control subsetting and filtering of input data.
19 21
 #' @param nFeatures The number of features to be used for classification. If this is a single number, the same number of features will be used for all comparisons
20 22
 #' or assays. If a numeric vector these will be optimised over using \code{selectionOptimisation}. If a named vector with the same names of multiple assays, 
21 23
 #' a different number of features will be used for each assay. If a named list of vectors, the respective number of features will be optimised over. 
22 24
 #' Set to NULL or "all" if all features should be used.
23
-#' @param selectionMethod A character vector of feature selection methods to compare. If a named character vector with names corresponding to different assays, 
24
-#' and performing multiview classification, the respective classification methods will be used on each assay.
25
+#' @param selectionMethod Default: "auto". A character vector of feature selection methods to compare. If a named character vector with names corresponding to different assays, 
26
+#' and performing multiview classification, the respective selection methods will be used on each assay. If \code{"auto"}, then for a classification task t-test (two categories) / F-test (three or more categories) ranking
27
+#' and top \code{nFeatures} optimisation is done. For a survival task, the ranking method is per-feature Cox proportional hazards p-value.
25 28
 #' @param selectionOptimisation A character of "Resubstitution", "Nested CV" or "none" specifying the approach used to optimise \code{nFeatures}.
26
-#' @param performanceType Default: \code{"auto"}. If \code{"auto"}, then balanced accuracy for classification or C-index for survival. Any one of the
29
+#' @param performanceType Default: \code{"auto"}. If \code{"auto"}, then balanced accuracy for classification or C-index for survival. Otherwise, any one of the
27 30
 #' options described in \code{\link{calcPerformance}} may be specified.
28
-#' @param classifier A character vector of classification methods to compare. If a named character vector with names corresponding to different assays, 
29
-#' and performing multiview classification, the respective classification methods will be used on each assay.
31
+#' @param classifier Default: \code{"auto"}. A character vector of classification methods to compare. If a named character vector with names corresponding to different assays, 
32
+#' and performing multiview classification, the respective classification methods will be used on each assay. If \code{"auto"}, then a random forest is used for a classification
33
+#' task or Cox proportional hazards model for a survival task.
30 34
 #' @param multiViewMethod A character vector specifying the multiview method or data integration approach to use.
31 35
 #' @param assayCombinations A character vector or list of character vectors proposing the assays or, in the case of a list, combination of assays to use
32 36
 #' with each element being a vector of assays to combine. Special value \code{"all"} means all possible subsets of assays.
... ...
@@ -108,12 +112,14 @@ setMethod("crossValidate", "DataFrame",
108 112
               if(!performanceType %in% c("auto", .ClassifyRenvir[["performanceTypes"]]))
109 113
                 stop(paste("performanceType must be one of", paste(c("auto", .ClassifyRenvir[["performanceTypes"]]), collapse = ", "), "but is", performanceType))
110 114
               
115
+              isCategorical <- is.character(outcome) && (length(outcome) == 1 || length(outcome) == nrow(measurements)) || is.factor(outcome)
111 116
               if(performanceType == "auto")
112
-              {
113
-                if(is.character(outcome) && (length(outcome) == 1 || length(outcome) == nrow(measurements)) || is.factor(outcome))
114
-                  performanceType <- "Balanced Accuracy"
115
-                else performanceType <- "C-index"
116
-              }
117
+                if(isCategorical) performanceType <- "Balanced Accuracy" else performanceType <- "C-index"
118
+              if(length(selectionMethod) == 1 && selectionMethod == "auto")
119
+                if(isCategorical) selectionMethod <- "t-test" else selectionMethod <- "CoxPH"
120
+              if(length(classifier) == 1 && classifier == "auto")
121
+                if(isCategorical) classifier <- "randomForest" else classifier <- "CoxPH"
122
+              
117 123
               
118 124
               # Which data-types or data-views are present?
119 125
               assayIDs <- unique(S4Vectors::mcols(measurements)$assay)
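
The "auto" resolution in the hunk above can be sketched as a standalone function. This is a simplified illustration under stated assumptions, not the package's code: resolveAuto and its nSamples argument are hypothetical names, and nSamples stands in for nrow(measurements).

```r
# Hypothetical helper mirroring the logic of the hunk above; not part of ClassifyR.
resolveAuto <- function(outcome, nSamples, performanceType = "auto",
                        selectionMethod = "auto", classifier = "auto")
{
  # A factor, a single column name, or one class label per sample is treated as categorical.
  # Anything else (e.g. a Surv object or two or three column names) is treated as survival.
  isCategorical <- is.character(outcome) && (length(outcome) == 1 || length(outcome) == nSamples) ||
                   is.factor(outcome)

  if(performanceType == "auto")
    performanceType <- if(isCategorical) "Balanced Accuracy" else "C-index"
  if(length(selectionMethod) == 1 && selectionMethod == "auto")
    selectionMethod <- if(isCategorical) "t-test" else "CoxPH"
  if(length(classifier) == 1 && classifier == "auto")
    classifier <- if(isCategorical) "randomForest" else "CoxPH"

  list(performanceType = performanceType, selectionMethod = selectionMethod, classifier = classifier)
}

# A factor of classes resolves to the classification defaults.
classResult <- resolveAuto(factor(c("No", "Yes", "No")), nSamples = 3)
# Two column names with many samples resolve to the survival defaults.
survivalResult <- resolveAuto(c("time", "event"), nSamples = 190)
```

One consequence of this test is that a character vector of two column names would be read as class labels if the data set happened to have exactly two samples.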
... ...
@@ -522,7 +528,6 @@ Using an ordinary GLM instead.")
522 528
 #' @inheritParams crossValidate
523 529
 #'
524 530
 #' @return CrossValParams object
525
-#' @export
526 531
 #'
527 532
 #' @examples
528 533
 #' CVparams <- generateCrossValParams(nRepeats = 20, nFolds = 5, nCores = 8, selectionOptimisation = "none")
... ...
@@ -558,7 +563,6 @@ generateCrossValParams <- function(nRepeats, nFolds, nCores, selectionOptimisati
558 563
 #' @param assayIDs A vector of data set identifiers as long as the number of data sets.
559 564
 #'
560 565
 #' @return ModellingParams object
561
-#' @export
562 566
 #'
563 567
 #' @examples
564 568
 #' data(asthma)
... ...
@@ -5,9 +5,11 @@ rfsrcTrainInterface <- function(measurementsTrain, survivalTrain, mTryProportion
5 5
     stop("The package 'randomForestSRC' could not be found. Please install it.")
6 6
   if(verbose == 3)
7 7
     message("Fitting rfsrc classifier to training data and making predictions on test data.")
8
-    
9
-  bindedMeasurements <- cbind(measurementsTrain, event = survivalTrain[, 1], time = survivalTrain[, 2])
8
+
9
+  # Surv objects store survival information as a two-column table, time and event, in that order.    
10
+  bindedMeasurements <- cbind(measurementsTrain, time = survivalTrain[, 1], event = survivalTrain[, 2])
10 11
   mtry <- round(mTryProportion * ncol(measurementsTrain)) # Number of features to try.
12
+  browser()
11 13
   randomForestSRC::rfsrc(Surv(time, event) ~ ., data = as.data.frame(bindedMeasurements), mtry = mtry,
12 14
                           var.used = "all.trees", importance = TRUE, ...)
13 15
 }
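
The column order that the corrected line relies on can be checked directly. The following is a small sketch assuming the survival package is installed; the three-sample data and the geneA feature are made up for illustration.

```r
library(survival)

# Made-up survival outcome for three samples (right-censored).
survivalTrain <- Surv(time = c(5, 8, 12), event = c(1, 0, 1))

# For right-censored data, the first column holds time and the second holds event status.
print(colnames(survivalTrain))

# Made-up single-feature training table.
measurementsTrain <- data.frame(geneA = c(1.2, 3.4, 2.2))

# Same binding as the corrected line in rfsrcTrainInterface.
bindedMeasurements <- cbind(measurementsTrain, time = survivalTrain[, 1], event = survivalTrain[, 2])
print(bindedMeasurements)
```

With the previous ordering, the event indicator would have been labelled as time and vice versa, so the model formula Surv(time, event) would have been fitted to swapped columns.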
... ...
@@ -13,7 +13,8 @@
13 13
 #' are features.
14 14
 #' @param outcome Either a factor vector of classes, a \code{\link{Surv}} object, or
15 15
 #' a character string, or vector of such strings, containing column name(s) of column(s)
16
-#' containing either classes or time and event information about survival.
16
+#' containing either classes or time and event information about survival. If column names
17
+#' of survival information, time must be in first column and event status in the second.
17 18
 #' @param outcomeColumns If \code{measurements} is a \code{MultiAssayExperiment}, the
18 19
 #' names of the column (class) or columns (survival) in the table extracted by \code{colData(data)}
19 20
 #' that contain(s) each individual's outcome to use for prediction.
... ...
@@ -4,7 +4,7 @@ coxphRanking <- function(measurementsTrain, survivalTrain, verbose = 3) # Clinic
4 4
   
5 5
   pValues <- rep(NA, ncol(measurementsTrain))
6 6
   names(pValues) <- colnames(measurementsTrain)
7
-  
7
+
8 8
   isCat <- sapply(measurementsTrain, class) %in% c("character", "factor")
9 9
   if(any(isCat))
10 10
   {
... ...
@@ -16,7 +16,8 @@
16 16
 #' \code{matrix} or \code{\link{DataFrame}}, the rows are samples, and the columns are features.
17 17
 #' @param outcomeTrain Either a factor vector of classes, a \code{\link{Surv}} object, or
18 18
 #' a character string, or vector of such strings, containing column name(s) of column(s)
19
-#' containing either classes or time and event information about survival.
19
+#' containing either classes or time and event information about survival. If column names
20
+#' of survival information, time must be in first column and event status in the second.
20 21
 #' @param measurementsTest Same data type as \code{measurementsTrain}, but only the test
21 22
 #' samples.
22 23
 #' @param outcomeTest Same data type as \code{outcomeTrain}, but for only the test
... ...
@@ -16,7 +16,8 @@
16 16
 #' containing either classes or time and event information about survival. If
17 17
 #' \code{measurements} is a \code{MultiAssayExperiment}, the names of the column (class) or
18 18
 #' columns (survival) in the table extracted by \code{colData(data)} that contain(s) the samples'
19
-#' outcome to use for prediction.
19
+#' outcome to use for prediction. If column names of survival information, time must be in first
20
+#' column and event status in the second.
20 21
 #' @param crossValParams An object of class \code{\link{CrossValParams}},
21 22
 #' specifying the kind of cross-validation to be done.
22 23
 #' @param modellingParams An object of class \code{\link{ModellingParams}},
... ...
@@ -1,4 +1,15 @@
1 1
 url: https://sydneybiox.github.io/ClassifyR/
2
+urls:
3
+  reference: https://sydneybiox.github.io/ClassifyR/reference
4
+  article: https://sydneybiox.github.io/ClassifyR/articles
2 5
 template:
3 6
   bootstrap: 5
4
-
7
+articles:
8
+- title: Menu
9
+  contents:
10
+    - introduction
11
+    - performanceEvaluation
12
+    - multiViewMethods
13
+    - incorporateNew
14
+    - ClassifyR
15
+    - DevelopersGuide
... ...
@@ -24,7 +24,7 @@
24 24
     
25 25
     <a class="navbar-brand me-2" href="https://sydneybiox.github.io/ClassifyR/index.html">ClassifyR</a>
26 26
 
27
-    <small class="nav-text text-muted me-auto" data-bs-toggle="tooltip" data-bs-placement="bottom" title="">3.3.1</small>
27
+    <small class="nav-text text-muted me-auto" data-bs-toggle="tooltip" data-bs-placement="bottom" title="">3.3.2</small>
28 28
 
29 29
     
30 30
     <button class="navbar-toggler" type="button" data-bs-toggle="collapse" data-bs-target="#navbar" aria-controls="navbar" aria-expanded="false" aria-label="Toggle navigation">
... ...
@@ -39,15 +39,8 @@
39 39
 <li class="nav-item">
40 40
   <a class="nav-link" href="https://sydneybiox.github.io/ClassifyR/reference/index.html">Reference</a>
41 41
 </li>
42
-<li class="nav-item dropdown">
43
-  <a href="https://sydneybiox.github.io/ClassifyR/#" class="nav-link dropdown-toggle" data-bs-toggle="dropdown" role="button" aria-expanded="false" aria-haspopup="true" id="dropdown-articles">Articles</a>
44
-  <div class="dropdown-menu" aria-labelledby="dropdown-articles">
45
-    <a class="dropdown-item" href="https://sydneybiox.github.io/ClassifyR/articles/DevelopersGuide.html">**ClassifyR** Developer's Guide</a>
46
-    <a class="dropdown-item" href="https://sydneybiox.github.io/ClassifyR/articles/incorporateNew.html">Creating a Wrapper for New Functionality and Registering It</a>
47
-    <a class="dropdown-item" href="https://sydneybiox.github.io/ClassifyR/articles/introduction.html">Introduction to the Concepts of ClassifyR</a>
48
-    <a class="dropdown-item" href="https://sydneybiox.github.io/ClassifyR/articles/multiViewMethods.html">Multi-view Methods for Modelling of Multiple Data Views</a>
49
-    <a class="dropdown-item" href="https://sydneybiox.github.io/ClassifyR/articles/performanceEvaluation.html">Performance Evaluation of Fitted Models</a>
50
-  </div>
42
+<li class="nav-item">
43
+  <a class="nav-link" href="https://sydneybiox.github.io/ClassifyR/articles/index.html">Articles</a>
51 44
 </li>
52 45
       </ul>
53 46
 <form class="form-inline my-2 my-lg-0" role="search">
... ...
@@ -74,7 +67,7 @@ Content not found. Please use links in the navbar.
74 67
 
75 68
     <footer><div class="pkgdown-footer-left">
76 69
   <p></p>
77
-<p>Developed by Dario Strbenac.</p>
70
+<p>Developed by Dario Strbenac, Ellis Patrick, Sourish Iyengar, Harry Robertson, Andy Tran, John Ormerod, Graham Mann, Jean Yang.</p>
78 71
 </div>
79 72
 
80 73
 <div class="pkgdown-footer-right">
... ...
@@ -1,8 +1,24 @@
1 1
 <!DOCTYPE html>
2
-<!-- Generated by pkgdown: do not edit by hand --><html lang="en"><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8"><meta charset="utf-8"><meta http-equiv="X-UA-Compatible" content="IE=edge"><meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no"><meta name="description" content="ClassifyR"><title>An Introduction to **ClassifyR** • ClassifyR</title><script src="../deps/jquery-3.6.0/jquery-3.6.0.min.js"></script><meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no"><link href="../deps/bootstrap-5.1.3/bootstrap.min.css" rel="stylesheet"><script src="../deps/bootstrap-5.1.3/bootstrap.bundle.min.js"></script><!-- Font Awesome icons --><link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.12.1/css/all.min.css" integrity="sha256-mmgLkCYLUQbXn0B1SRqzHar6dCnv9oZFPEC1g1cwlkk=" crossorigin="anonymous"><link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.12.1/css/v4-shims.min.css" integrity="sha256-wZjR52fzng1pJHwx4aV2AO3yyTOXrcDW7jBpJtTwVxw=" crossorigin="anonymous"><!-- bootstrap-toc --><script src="https://cdn.rawgit.com/afeld/bootstrap-toc/v1.0.1/dist/bootstrap-toc.min.js"></script><!-- headroom.js --><script src="https://cdnjs.cloudflare.com/ajax/libs/headroom/0.11.0/headroom.min.js" integrity="sha256-AsUX4SJE1+yuDu5+mAVzJbuYNPHj/WroHuZ8Ir/CkE0=" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/headroom/0.11.0/jQuery.headroom.min.js" integrity="sha256-ZX/yNShbjqsohH1k95liqY9Gd8uOiE1S4vZc+9KQ1K4=" crossorigin="anonymous"></script><!-- clipboard.js --><script src="https://cdnjs.cloudflare.com/ajax/libs/clipboard.js/2.0.6/clipboard.min.js" integrity="sha256-inc5kl9MA1hkeYUt+EC3BhlIgyp/2jDIyBLS6k3UxPI=" crossorigin="anonymous"></script><!-- search --><script src="https://cdnjs.cloudflare.com/ajax/libs/fuse.js/6.4.6/fuse.js" 
integrity="sha512-zv6Ywkjyktsohkbp9bb45V6tEMoWhzFzXis+LrMehmJZZSys19Yxf1dopHx7WzIKxr5tK2dVcYmaCk2uqdjF4A==" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/autocomplete.js/0.38.0/autocomplete.jquery.min.js" integrity="sha512-GU9ayf+66Xx2TmpxqJpliWbT5PiGYxpaG8rfnBEk1LL8l1KGkRShhngwdXK1UgqhAzWpZHSiYPc09/NwDQIGyg==" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/mark.js/8.11.1/mark.min.js" integrity="sha512-5CYOlHXGh6QpOFA/TeTylKLWfB3ftPsde7AnmhuitiTX4K5SqCLBeKro6sPS8ilsz1Q4NRx3v8Ko2IBiszzdww==" crossorigin="anonymous"></script><!-- pkgdown --><script src="../pkgdown.js"></script><meta property="og:title" content="An Introduction to **ClassifyR**"><meta property="og:description" content="ClassifyR"><!-- mathjax --><script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js" integrity="sha256-nvJJv9wWKEm88qvoQl9ekL2J+k/RWIsaSScxxlsrv8k=" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/config/TeX-AMS-MML_HTMLorMML.js" integrity="sha256-84DKXVJXs0/F8OTMzX4UR909+jtl4G7SPypPavF+GfA=" crossorigin="anonymous"></script><!--[if lt IE 9]>
2
+<!-- Generated by pkgdown: do not edit by hand --><html lang="en">
3
+<head>
4
+<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
5
+<meta charset="utf-8">
6
+<meta http-equiv="X-UA-Compatible" content="IE=edge">
7
+<meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
8
+<meta name="description" content="ClassifyR">
9
+<title>An Introduction to ClassifyR</title>
10
+<script src="../deps/jquery-3.6.0/jquery-3.6.0.min.js"></script><meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
11
+<link href="../deps/bootstrap-5.1.3/bootstrap.min.css" rel="stylesheet">
12
+<script src="../deps/bootstrap-5.1.3/bootstrap.bundle.min.js"></script><!-- Font Awesome icons --><link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.12.1/css/all.min.css" integrity="sha256-mmgLkCYLUQbXn0B1SRqzHar6dCnv9oZFPEC1g1cwlkk=" crossorigin="anonymous">
13
+<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/5.12.1/css/v4-shims.min.css" integrity="sha256-wZjR52fzng1pJHwx4aV2AO3yyTOXrcDW7jBpJtTwVxw=" crossorigin="anonymous">
14
+<!-- bootstrap-toc --><script src="https://cdn.rawgit.com/afeld/bootstrap-toc/v1.0.1/dist/bootstrap-toc.min.js"></script><!-- headroom.js --><script src="https://cdnjs.cloudflare.com/ajax/libs/headroom/0.11.0/headroom.min.js" integrity="sha256-AsUX4SJE1+yuDu5+mAVzJbuYNPHj/WroHuZ8Ir/CkE0=" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/headroom/0.11.0/jQuery.headroom.min.js" integrity="sha256-ZX/yNShbjqsohH1k95liqY9Gd8uOiE1S4vZc+9KQ1K4=" crossorigin="anonymous"></script><!-- clipboard.js --><script src="https://cdnjs.cloudflare.com/ajax/libs/clipboard.js/2.0.6/clipboard.min.js" integrity="sha256-inc5kl9MA1hkeYUt+EC3BhlIgyp/2jDIyBLS6k3UxPI=" crossorigin="anonymous"></script><!-- search --><script src="https://cdnjs.cloudflare.com/ajax/libs/fuse.js/6.4.6/fuse.js" integrity="sha512-zv6Ywkjyktsohkbp9bb45V6tEMoWhzFzXis+LrMehmJZZSys19Yxf1dopHx7WzIKxr5tK2dVcYmaCk2uqdjF4A==" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/autocomplete.js/0.38.0/autocomplete.jquery.min.js" integrity="sha512-GU9ayf+66Xx2TmpxqJpliWbT5PiGYxpaG8rfnBEk1LL8l1KGkRShhngwdXK1UgqhAzWpZHSiYPc09/NwDQIGyg==" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/mark.js/8.11.1/mark.min.js" integrity="sha512-5CYOlHXGh6QpOFA/TeTylKLWfB3ftPsde7AnmhuitiTX4K5SqCLBeKro6sPS8ilsz1Q4NRx3v8Ko2IBiszzdww==" crossorigin="anonymous"></script><!-- pkgdown --><script src="../pkgdown.js"></script><meta property="og:title" content="An Introduction to ClassifyR">
15
+<meta property="og:description" content="ClassifyR">
16
+<!-- mathjax --><script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js" integrity="sha256-nvJJv9wWKEm88qvoQl9ekL2J+k/RWIsaSScxxlsrv8k=" crossorigin="anonymous"></script><script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/config/TeX-AMS-MML_HTMLorMML.js" integrity="sha256-84DKXVJXs0/F8OTMzX4UR909+jtl4G7SPypPavF+GfA=" crossorigin="anonymous"></script><!--[if lt IE 9]>
3 17
 <script src="https://oss.maxcdn.com/html5shiv/3.7.3/html5shiv.min.js"></script>
4 18
 <script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
5
-<![endif]--></head><body>
19
+<![endif]-->
20
+</head>
21
+<body>
6 22
     <a href="#main" class="visually-hidden-focusable">Skip to contents</a>
7 23
     
8 24
 
... ...
@@ -10,7 +26,7 @@
10 26
     
11 27
     <a class="navbar-brand me-2" href="../index.html">ClassifyR</a>
12 28
 
13
-    <small class="nav-text text-muted me-auto" data-bs-toggle="tooltip" data-bs-placement="bottom" title="">3.3.1</small>
29
+    <small class="nav-text text-muted me-auto" data-bs-toggle="tooltip" data-bs-placement="bottom" title="">3.3.2</small>
14 30
 
15 31
     
16 32
     <button class="navbar-toggler" type="button" data-bs-toggle="collapse" data-bs-target="#navbar" aria-controls="navbar" aria-expanded="false" aria-label="Toggle navigation">
... ...
@@ -18,26 +34,23 @@
18 34
     </button>
19 35
 
20 36
     <div id="navbar" class="collapse navbar-collapse ms-3">
21
-      <ul class="navbar-nav me-auto"><li class="active nav-item">
37
+      <ul class="navbar-nav me-auto">
38
+<li class="active nav-item">
22 39
   <a class="nav-link" href="../articles/ClassifyR.html">Get started</a>
23 40
 </li>
24 41
 <li class="nav-item">
25 42
   <a class="nav-link" href="../reference/index.html">Reference</a>
26 43
 </li>
27
-<li class="nav-item dropdown">
28
-  <a href="#" class="nav-link dropdown-toggle" data-bs-toggle="dropdown" role="button" aria-expanded="false" aria-haspopup="true" id="dropdown-articles">Articles</a>
29
-  <div class="dropdown-menu" aria-labelledby="dropdown-articles">
30
-    <a class="dropdown-item" href="../articles/DevelopersGuide.html">**ClassifyR** Developer's Guide</a>
31
-    <a class="dropdown-item" href="../articles/incorporateNew.html">Creating a Wrapper for New Functionality and Registering It</a>
32
-    <a class="dropdown-item" href="../articles/introduction.html">Introduction to the Concepts of ClassifyR</a>
33
-    <a class="dropdown-item" href="../articles/multiViewMethods.html">Multi-view Methods for Modelling of Multiple Data Views</a>
34
-    <a class="dropdown-item" href="../articles/performanceEvaluation.html">Performance Evaluation of Fitted Models</a>
35
-  </div>
44
+<li class="nav-item">
45
+  <a class="nav-link" href="../articles/index.html">Articles</a>
36 46
 </li>
37
-      </ul><form class="form-inline my-2 my-lg-0" role="search">
38
-        <input type="search" class="form-control me-sm-2" aria-label="Toggle navigation" name="search-input" data-search-index="../search.json" id="search-input" placeholder="Search for" autocomplete="off"></form>
47
+      </ul>
48
+<form class="form-inline my-2 my-lg-0" role="search">
49
+        <input type="search" class="form-control me-sm-2" aria-label="Toggle navigation" name="search-input" data-search-index="../search.json" id="search-input" placeholder="Search for" autocomplete="off">
50
+</form>
39 51
 
40
-      <ul class="navbar-nav"></ul></div>
52
+      <ul class="navbar-nav"></ul>
53
+</div>
41 54
 
42 55
     
43 56
   </div>
... ...
@@ -45,13 +58,10 @@
45 58
 
46 59
 
47 60
 
48
-
49
-<div class="row">
61
+<script src="ClassifyR_files/accessible-code-block-0.0.1/empty-anchor.js"></script><div class="row">
50 62
   <main id="main" class="col-md-9"><div class="page-header">
51
-      <img src="" class="logo" alt=""><h1>An Introduction to **ClassifyR**</h1>
52
-                        <h4 data-toc-skip class="author">Dario Strbenac,
53
-Ellis Patrick, Graham Mann, Jean Yang, John Ormerod <br> The University
54
-of Sydney, Australia.</h4>
63
+      <img src="" class="logo" alt=""><h1>An Introduction to ClassifyR</h1>
64
+                        <h4 data-toc-skip class="author">Dario Strbenac, Ellis Patrick, Graham Mann, Jean Yang, John Ormerod <br> The University of Sydney, Australia.</h4>
55 65
             
56 66
       
57 67
       
... ...
@@ -60,254 +70,166 @@ of Sydney, Australia.</h4>
60 70
 
61 71
     
62 72
     
63
-<div id="installation" class="section level2">
64
-<h2>Installation</h2>
65
-<p>Typically, each feature selection method or classifier originates
66
-from a different R package, which <strong>ClassifyR</strong> provides a
67
-wrapper around. By default, only high-performance t-test/F-test and
68
-random forest are installed. If you intend to compare between numerous
69
-different modelling methods, you should install all suggested packages
70
-at once by using the command
71
-<code>BiocManager::install("ClassifyR", dependencies = TRUE)</code>.
72
-This will take a few minutes, particularly on Linux, because each
73
-package will be compiled from source code.</p>
73
+<div class="section level2">
74
+<h2 id="installation">Installation<a class="anchor" aria-label="anchor" href="#installation"></a>
75
+</h2>
76
+<p>Typically, each feature selection method or classifier originates from a different R package, which <strong>ClassifyR</strong> provides a wrapper around. By default, only high-performance t-test/F-test and random forest are installed. If you intend to compare between numerous different modelling methods, you should install all suggested packages at once by using the command <code>BiocManager::install("ClassifyR", dependencies = TRUE)</code>. This will take a few minutes, particularly on Linux, because each package will be compiled from source code.</p>
74 77
 </div>
75
-<div id="overview" class="section level2">
76
-<h2>Overview</h2>
77
-<p><strong>ClassifyR</strong> provides a structured pipeline for
78
-cross-validated classification. Classification is viewed in terms of
79
-four stages, data transformation, feature selection, classifier
80
-training, and prediction. The driver functions <em>crossValidate</em>
81
-and <em>runTests</em> implements varieties of cross-validation. They
82
-are:</p>
78
+<div class="section level2">
79
+<h2 id="overview">Overview<a class="anchor" aria-label="anchor" href="#overview"></a>
80
+</h2>
81
+<p><strong>ClassifyR</strong> provides a structured pipeline for cross-validated classification. Classification is viewed in terms of four stages, data transformation, feature selection, classifier training, and prediction. The driver functions <em>crossValidate</em> and <em>runTests</em> implements varieties of cross-validation. They are:</p>
83 82
 <ul>
84
-<li>Permutation of the order of samples followed by k-fold
85
-cross-validation (runTests only)</li>
83
+<li>Permutation of the order of samples followed by k-fold cross-validation (runTests only)</li>
86 84
 <li>Repeated x% test set cross-validation</li>
87 85
 <li>leave-k-out cross-validation</li>
88 86
 </ul>
89
-<p>Driver functions can use parallel processing capabilities in R to
90
-speed up cross-validations when many CPUs are available. The output of
91
-the driver functions is a <em>ClassifyResult</em> object which can be
92
-directly used by the performance evaluation functions. The process of
93
-classification is summarised by a flowchart.</p>
94
-<img src="" style="margin-left: auto;margin-right: auto"/>
95
-<p>Importantly, ClassifyR implements a number of methods for
96
-classification using different kinds of changes in measurements between
97
-classes. Most classifiers work with features where the means are
98
-different. In addition to changes in means (DM),
99
-<strong>ClassifyR</strong> also allows for classification using
100
-differential variability (DV; changes in scale) and differential
101
-distribution (DD; changes in location and/or scale).</p>
102
-<div id="case-study-diagnosing-asthma" class="section level3">
103
-<h3>Case Study: Diagnosing Asthma</h3>
104
-<p>To demonstrate some key features of ClassifyR, a data set consisting
105
-of the 2000 most variably expressed genes and 190 people will be used to
106
-quickly obtain results. The journal article corresponding to the data
107
-set was published in <em>Scientific Reports</em> in 2018 and is titled
108
-<a href="http://www.nature.com/articles/s41598-018-27189-4">A Nasal
109
-Brush-based Classifier of Asthma Identified by Machine Learning Analysis
110
-of Nasal RNA Sequence Data</a>.</p>
87
+<p>Driver functions can use parallel processing capabilities in R to speed up cross-validations when many CPUs are available. The output of the driver functions is a <em>ClassifyResult</em> object which can be directly used by the performance evaluation functions. The process of classification is summarised by a flowchart.</p>
88
+<img src="" style="margin-left: auto;margin-right: auto"><p>Importantly, ClassifyR implements a number of methods for classification using different kinds of changes in measurements between classes. Most classifiers work with features where the means are different. In addition to changes in means (DM), <strong>ClassifyR</strong> also allows for classification using differential variability (DV; changes in scale) and differential distribution (DD; changes in location and/or scale).</p>
89
+<div class="section level3">
90
+<h3 id="case-study-diagnosing-asthma">Case Study: Diagnosing Asthma<a class="anchor" aria-label="anchor" href="#case-study-diagnosing-asthma"></a>
91
+</h3>
92
+<p>To demonstrate some key features of ClassifyR, a data set consisting of the 2000 most variably expressed genes and 190 people will be used to quickly obtain results. The journal article corresponding to the data set was published in <em>Scientific Reports</em> in 2018 and is titled <a href="http://www.nature.com/articles/s41598-018-27189-4" class="external-link">A Nasal Brush-based Classifier of Asthma Identified by Machine Learning Analysis of Nasal RNA Sequence Data</a>.</p>
111 93
 <p>Load the package.</p>
112
-<div class="sourceCode" id="cb1"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="fu">library</span>(ClassifyR)</span></code></pre></div>
113
-<pre><code>## Warning: multiple methods tables found for &#39;aperm&#39;</code></pre>
114
-<pre><code>## Warning: replacing previous import &#39;BiocGenerics::aperm&#39; by &#39;DelayedArray::aperm&#39; when loading &#39;SummarizedExperiment&#39;</code></pre>
94
+<div class="sourceCode" id="cb1"><pre class="downlit sourceCode r">
95
+<code class="sourceCode R"><span><span class="kw"><a href="https://rdrr.io/r/base/library.html" class="external-link">library</a></span><span class="op">(</span><span class="va"><a href="https://sydneybiox.github.io/ClassifyR/">ClassifyR</a></span><span class="op">)</span></span></code></pre></div>
96
+<pre><code><span><span class="co">## Warning: multiple methods tables found for 'aperm'</span></span></code></pre>
97
+<pre><code><span><span class="co">## Warning: replacing previous import 'BiocGenerics::aperm' by 'DelayedArray::aperm' when loading 'SummarizedExperiment'</span></span></code></pre>
115 98
 <p>A glimpse at the RNA measurements and sample classes.</p>
116
-<div class="sourceCode" id="cb4"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a><span class="fu">data</span>(asthma) <span class="co"># Contains measurements and classes variables.</span></span>
117
-<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a>measurements[<span class="dv">1</span><span class="sc">:</span><span class="dv">5</span>, <span class="dv">1</span><span class="sc">:</span><span class="dv">5</span>]</span></code></pre></div>
118
-<pre><code>##            HBB BPIFA1  XIST FCGR3B HBA2
119
-## Sample 1  9.72  14.06 12.28  11.42 7.83
120
-## Sample 2 11.98  13.89  6.35  13.25 9.42
121
-## Sample 3 12.15  17.44 10.21   7.87 9.68
122
-## Sample 4 10.60  11.87  6.27  14.75 8.96
123
-## Sample 5  8.18  15.01 11.21   6.77 6.43</code></pre>
124
-<div class="sourceCode" id="cb6"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a><span class="fu">head</span>(classes)</span></code></pre></div>
125
-<pre><code>## [1] No  No  No  No  Yes No 
126
-## Levels: No Yes</code></pre>
127
-<p>The numeric matrix variable <em>measurements</em> stores the
128
-normalised values of the RNA gene abundances for each sample and the
129
-factor vector <em>classes</em> identifies which class the samples belong
130
-to. The measurements were normalised using <strong>DESeq2</strong>’s
131
-<em>varianceStabilizingTransformation</em> function, which produces
132
-<span class="math inline">\(log_2\)</span>-like data.</p>
133
-<p>For more complex data sets with multiple kinds of experiments
134
-(e.g. DNA methylation, copy number, gene expression on the same set of
135
-samples) a <a
136
-href="https://bioconductor.org/packages/release/bioc/html/MultiAssayExperiment.html"><em>MultiAssayExperiment</em></a>
137
-is recommended for data storage and supported by
138
-<strong>ClassifyR</strong>’s methods.</p>
99
+<div class="sourceCode" id="cb4"><pre class="downlit sourceCode r">
100
+<code class="sourceCode R"><span><span class="fu"><a href="https://rdrr.io/r/utils/data.html" class="external-link">data</a></span><span class="op">(</span><span class="va">asthma</span><span class="op">)</span> <span class="co"># Contains measurements and classes variables.</span></span>
101
+<span><span class="va">measurements</span><span class="op">[</span><span class="fl">1</span><span class="op">:</span><span class="fl">5</span>, <span class="fl">1</span><span class="op">:</span><span class="fl">5</span><span class="op">]</span></span></code></pre></div>
102
+<pre><code><span><span class="co">##            HBB BPIFA1  XIST FCGR3B HBA2</span></span>
103
+<span><span class="co">## Sample 1  9.72  14.06 12.28  11.42 7.83</span></span>
104
+<span><span class="co">## Sample 2 11.98  13.89  6.35  13.25 9.42</span></span>
105
+<span><span class="co">## Sample 3 12.15  17.44 10.21   7.87 9.68</span></span>
106
+<span><span class="co">## Sample 4 10.60  11.87  6.27  14.75 8.96</span></span>
107
+<span><span class="co">## Sample 5  8.18  15.01 11.21   6.77 6.43</span></span></code></pre>
108
+<div class="sourceCode" id="cb6"><pre class="downlit sourceCode r">
109
+<code class="sourceCode R"><span><span class="fu"><a href="https://rdrr.io/r/utils/head.html" class="external-link">head</a></span><span class="op">(</span><span class="va">classes</span><span class="op">)</span></span></code></pre></div>
110
+<pre><code><span><span class="co">## [1] No  No  No  No  Yes No </span></span>
111
+<span><span class="co">## Levels: No Yes</span></span></code></pre>
112
+<p>The numeric matrix variable <em>measurements</em> stores the normalised values of the RNA gene abundances for each sample and the factor vector <em>classes</em> identifies which class the samples belong to. The measurements were normalised using <strong>DESeq2</strong>’s <em>varianceStabilizingTransformation</em> function, which produces <span class="math inline">\(log_2\)</span>-like data.</p>
113
+<p>For more complex data sets with multiple kinds of experiments (e.g. DNA methylation, copy number, gene expression on the same set of samples) a <a href="https://bioconductor.org/packages/release/bioc/html/MultiAssayExperiment.html" class="external-link"><em>MultiAssayExperiment</em></a> is recommended for data storage and supported by <strong>ClassifyR</strong>’s methods.</p>
139 114
 </div>
140 115
 </div>
141
-<div id="quick-start-crossvalidate-function" class="section level2">
142
-<h2>Quick Start: <em>crossValidate</em> Function</h2>
143
-<p>The <em>crossValidate</em> function offers a quick and simple way to
144
-start analysing a dataset in ClassifyR. It is a wrapper for
145
-<em>runTests</em>, the core model building and testing function of
146
-ClassifyR. <em>crossValidate</em> must be supplied with
147
-<em>measurements</em>, a simple tabular data container or a list-like
148
-structure of such related tabular data on common samples. The classes of
149
-it may be <em>matrix</em>, <em>data.frame</em>, <em>DataFrame</em>,
150
-<em>MultiAssayExperiment</em> or <em>list</em> of <em>data.frames</em>.
151
-For a dataset with <span class="math inline">\(n\)</span> observations
152
-and <span class="math inline">\(p\)</span> variables, the
153
-<em>crossValidate</em> function will accept inputs of the following
154
-shapes:</p>
155
-<table>
156
-<colgroup>
157
-<col width="25%" />
158
-<col width="37%" />
159
-<col width="37%" />
160
-</colgroup>
161
-<thead>
162
-<tr class="header">
116
+<div class="section level2">
117
+<h2 id="quick-start-crossvalidate-function">Quick Start: <em>crossValidate</em> Function<a class="anchor" aria-label="anchor" href="#quick-start-crossvalidate-function"></a>
118
+</h2>
119
+<p>The <em>crossValidate</em> function offers a quick and simple way to start analysing a dataset in ClassifyR. It is a wrapper for <em>runTests</em>, the core model building and testing function of ClassifyR. <em>crossValidate</em> must be supplied with <em>measurements</em>, a simple tabular data container or a list-like structure of such related tabular data on common samples. Its class may be <em>matrix</em>, <em>data.frame</em>, <em>DataFrame</em>, <em>MultiAssayExperiment</em> or a <em>list</em> of <em>data.frames</em>. For a dataset with <span class="math inline">\(n\)</span> observations and <span class="math inline">\(p\)</span> variables, the <em>crossValidate</em> function will accept inputs of the following shapes:</p>
120
+<table class="table">
121
+<thead><tr class="header">
163 122
 <th>Data Type</th>
164 123
 <th align="center"><span class="math inline">\(n \times p\)</span></th>
165 124
 <th align="center"><span class="math inline">\(p \times n\)</span></th>
166
-</tr>
167
-</thead>
125
+</tr></thead>
168 126
 <tbody>
169 127
 <tr class="odd">
170
-<td><span
171
-style="font-family: &#39;Courier New&#39;, monospace;">matrix</span></td>
128
+<td><span style="font-family: 'Courier New', monospace;">matrix</span></td>
172 129
 <td align="center">✔</td>
173 130
 <td align="center"></td>
174 131
 </tr>
175 132
 <tr class="even">
176
-<td><span
177
-style="font-family: &#39;Courier New&#39;, monospace;">data.frame</span></td>
133
+<td><span style="font-family: 'Courier New', monospace;">data.frame</span></td>
178 134
 <td align="center">✔</td>
179 135
 <td align="center"></td>
180 136
 </tr>
181 137
 <tr class="odd">
182
-<td><span
183
-style="font-family: &#39;Courier New&#39;, monospace;">DataFrame</span></td>
138
+<td><span style="font-family: 'Courier New', monospace;">DataFrame</span></td>
184 139
 <td align="center">✔</td>
185 140
 <td align="center"></td>
186 141
 </tr>
187 142
 <tr class="even">
188
-<td><span
189
-style="font-family: &#39;Courier New&#39;, monospace;">MultiAssayExperiment</span></td>
143
+<td><span style="font-family: 'Courier New', monospace;">MultiAssayExperiment</span></td>
190 144
 <td align="center"></td>
191 145
 <td align="center">✔</td>
192 146
 </tr>
193 147
 <tr class="odd">
194
-<td><span
195
-style="font-family: &#39;Courier New&#39;, monospace;">list</span> of
196
-<span
197
-style="font-family: &#39;Courier New&#39;, monospace;">data.frame</span>s</td>
148
+<td>
149
+<span style="font-family: 'Courier New', monospace;">list</span> of <span style="font-family: 'Courier New', monospace;">data.frame</span>s</td>
198 150
 <td align="center">✔</td>
199 151
 <td align="center"></td>
200 152
 </tr>
201 153
 </tbody>
202 154
 </table>
203
-<p><em>crossValidate</em> must also be supplied with <em>outcome</em>,
204
-which represents the prediction to be made in a variety of possible
205
-ways.</p>
155
+<p><em>crossValidate</em> must also be supplied with <em>outcome</em>, which represents the prediction to be made in a variety of possible ways.</p>
206 156
 <ul>
207
-<li>A <em>factor</em> that contains the class label for each
208
-observation. <em>classes</em> must be of length <span
209
-class="math inline">\(n\)</span>.</li>
210
-<li>A <em>character</em> of length 1 that matches a column name in a
211
-data frame which holds the classes. The classes will automatically be
212
-removed before training is done.</li>
213
-<li>A <em>Surv</em> object of the same length as the number of samples
214
-in the data which contains information about the time and censoring of
215
-the samples.</li>
216
-<li>A <em>character</em> vector of length 2 or 3 that each match a
217
-column name in a data frame which holds information about the time and
218
-censoring of the samples. The time-to-event columns will automatically
219
-be removed before training is done.</li>
157
+<li>A <em>factor</em> that contains the class label for each observation. The factor must be of length <span class="math inline">\(n\)</span>.</li>
158
+<li>A <em>character</em> of length 1 that matches a column name in a data frame which holds the classes. The classes will automatically be removed before training is done.</li>
159
+<li>A <em>Surv</em> object of the same length as the number of samples in the data which contains information about the time and censoring of the samples.</li>
160
+<li>A <em>character</em> vector of length 2 or 3 whose elements each match a column name in a data frame which holds information about the time and censoring of the samples. The time-to-event columns will automatically be removed before training is done.</li>
220 161
 </ul>
221
-<p>The type of classifier used can be changed with the
222
-<em>classifier</em> argument. The default is a random forest, which
223
-seamlessly handles categorical and numerical data. A full list of
224
-classifiers can be seen by running <em>?crossValidate</em>. A feature
225
-selection step can be performed before classification using
226
-<em>nFeatures</em> and <em>selectionMethod</em>, which is a t-test by
227
-default. Similarly, the number of folds and number of repeats for cross
228
-validation can be changed with the <em>nFolds</em> and <em>nRepeats</em>
229
-arguments. If wanted, <em>nCores</em> can be specified to run the cross
230
-validation in parallel. To perform 5-fold cross-validation of a Support
231
-Vector Machine with 2 repeats:</p>
232
-<div class="sourceCode" id="cb8"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb8-1"><a href="#cb8-1" aria-hidden="true" tabindex="-1"></a>result <span class="ot">&lt;-</span> <span class="fu">crossValidate</span>(measurements, classes, <span class="at">classifier =</span> <span class="st">&quot;SVM&quot;</span>,</span>
233
-<span id="cb8-2"><a href="#cb8-2" aria-hidden="true" tabindex="-1"></a>                        <span class="at">nFeatures =</span> <span class="dv">20</span>, <span class="at">nFolds =</span> <span class="dv">5</span>, <span class="at">nRepeats =</span> <span class="dv">2</span>, <span class="at">nCores =</span> <span class="dv">1</span>)</span></code></pre></div>
234
-<pre><code>## Processing sample set 10.</code></pre>
235
-<div class="sourceCode" id="cb10"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb10-1"><a href="#cb10-1" aria-hidden="true" tabindex="-1"></a><span class="fu">performancePlot</span>(result)</span></code></pre></div>
236
-<pre><code>## Warning in .local(results, ...): Balanced Accuracy not found in all elements of results. Calculating it now.</code></pre>
237
-<p><img src="ClassifyR_files/figure-html/unnamed-chunk-5-1.png" width="700" /></p>
238
-<div id="data-integration-with-crossvalidate" class="section level3">
239
-<h3>Data Integration with crossValidate</h3>
240
-<p><em>crossValidate</em> also allows data from multiple sources to be
241
-integrated into a single model. The integration method can be specified
242
-with <em>multiViewMethod</em> argument. In this example, suppose the
243
-first 10 variables in the asthma data set are from a certain source and
244
-the remaining 1990 variables are from a second source. To integrate
245
-multiple data sets, each variable must be labeled with the data set it
246
-came from. This is done in a different manner depending on the data type
247
-of <em>measurements</em>.</p>
248
-<p>If using Bioconductor’s <em>DataFrame</em>, this can be specified
249
-using <em>mcols</em>. In the column metadata, each feature must have an
250
-<em>assay</em> and a <em>feature</em> name.</p>
251
-<div class="sourceCode" id="cb12"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb12-1"><a href="#cb12-1" aria-hidden="true" tabindex="-1"></a>measurementsDF <span class="ot">&lt;-</span> <span class="fu">DataFrame</span>(measurements)</span>
252
-<span id="cb12-2"><a href="#cb12-2" aria-hidden="true" tabindex="-1"></a><span class="fu">mcols</span>(measurementsDF) <span class="ot">&lt;-</span> <span class="fu">data.frame</span>(</span>
253
-<span id="cb12-3"><a href="#cb12-3" aria-hidden="true" tabindex="-1"></a>  <span class="at">assay =</span> <span class="fu">rep</span>(<span class="fu">c</span>(<span class="st">&quot;assay_1&quot;</span>, <span class="st">&quot;assay_2&quot;</span>), <span class="at">times =</span> <span class="fu">c</span>(<span class="dv">10</span>, <span class="dv">1990</span>)),</span>
254
-<span id="cb12-4"><a href="#cb12-4" aria-hidden="true" tabindex="-1"></a>  <span class="at">feature =</span> <span class="fu">colnames</span>(measurementsDF)</span>
255
-<span id="cb12-5"><a href="#cb12-5" aria-hidden="true" tabindex="-1"></a>)</span>
256
-<span id="cb12-6"><a href="#cb12-6" aria-hidden="true" tabindex="-1"></a></span>
257
-<span id="cb12-7"><a href="#cb12-7" aria-hidden="true" tabindex="-1"></a>result <span class="ot">&lt;-</span> <span class="fu">crossValidate</span>(measurementsDF, classes, <span class="at">classifier =</span> <span class="st">&quot;SVM&quot;</span>, <span class="at">nFolds =</span> <span class="dv">5</span>,</span>
258
-<span id="cb12-8"><a href="#cb12-8" aria-hidden="true" tabindex="-1"></a>                        <span class="at">nRepeats =</span> <span class="dv">3</span>, <span class="at">multiViewMethod =</span> <span class="st">&quot;merge&quot;</span>)</span></code></pre></div>
259
-<pre><code>## Processing sample set 10.
260
-## Processing sample set 10.
261
-## Processing sample set 10.</code></pre>
262
-<div class="sourceCode" id="cb14"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb14-1"><a href="#cb14-1" aria-hidden="true" tabindex="-1"></a><span class="fu">performancePlot</span>(result, <span class="at">characteristicsList =</span> <span class="fu">list</span>(<span class="at">x =</span> <span class="st">&quot;Assay Name&quot;</span>))</span></code></pre></div>
263
-<pre><code>## Warning in .local(results, ...): Balanced Accuracy not found in all elements of results. Calculating it now.</code></pre>
264
-<p><img src="ClassifyR_files/figure-html/unnamed-chunk-6-1.png" width="700" /></p>
265
-<p>If using a list of <em>data.frame</em>s, the name of each element in
266
-the list will be used as the assay name.</p>
267
-<div class="sourceCode" id="cb16"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb16-1"><a href="#cb16-1" aria-hidden="true" tabindex="-1"></a><span class="co"># Assigns first 10 variables to dataset_1, and the rest to dataset_2</span></span>
268
-<span id="cb16-2"><a href="#cb16-2" aria-hidden="true" tabindex="-1"></a>measurementsList <span class="ot">&lt;-</span> <span class="fu">list</span>(</span>
269
-<span id="cb16-3"><a href="#cb16-3" aria-hidden="true" tabindex="-1"></a>  (measurements <span class="sc">|&gt;</span> <span class="fu">as.data.frame</span>())[<span class="dv">1</span><span class="sc">:</span><span class="dv">10</span>],</span>
270
-<span id="cb16-4"><a href="#cb16-4" aria-hidden="true" tabindex="-1"></a>  (measurements <span class="sc">|&gt;</span> <span class="fu">as.data.frame</span>())[<span class="dv">11</span><span class="sc">:</span><span class="dv">2000</span>]</span>
271
-<span id="cb16-5"><a href="#cb16-5" aria-hidden="true" tabindex="-1"></a>)</span>
272
-<span id="cb16-6"><a href="#cb16-6" aria-hidden="true" tabindex="-1"></a><span class="fu">names</span>(measurementsList) <span class="ot">&lt;-</span> <span class="fu">c</span>(<span class="st">&quot;assay_1&quot;</span>, <span class="st">&quot;assay_2&quot;</span>)</span>
273
-<span id="cb16-7"><a href="#cb16-7" aria-hidden="true" tabindex="-1"></a></span>
274
-<span id="cb16-8"><a href="#cb16-8" aria-hidden="true" tabindex="-1"></a>result <span class="ot">&lt;-</span> <span class="fu">crossValidate</span>(measurementsList, classes, <span class="at">classifier =</span> <span class="st">&quot;SVM&quot;</span>, <span class="at">nFolds =</span> <span class="dv">5</span>,</span>
275
-<span id="cb16-9"><a href="#cb16-9" aria-hidden="true" tabindex="-1"></a>                        <span class="at">nRepeats =</span> <span class="dv">3</span>, <span class="at">multiViewMethod =</span> <span class="st">&quot;merge&quot;</span>)</span></code></pre></div>
276
-<pre><code>## Processing sample set 10.
277
-## Processing sample set 10.
278
-## Processing sample set 10.</code></pre>
279
-<div class="sourceCode" id="cb18"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb18-1"><a href="#cb18-1" aria-hidden="true" tabindex="-1"></a><span class="fu">performancePlot</span>(result, <span class="at">characteristicsList =</span> <span class="fu">list</span>(<span class="at">x =</span> <span class="st">&quot;Assay Name&quot;</span>))</span></code></pre></div>
280
-<pre><code>## Warning in .local(results, ...): Balanced Accuracy not found in all elements of results. Calculating it now.</code></pre>
281
-<p><img src="ClassifyR_files/figure-html/unnamed-chunk-7-1.png" width="700" /></p>
162
+<p>The type of classifier used can be changed with the <em>classifier</em> argument. The default is a random forest, which seamlessly handles categorical and numerical data. A full list of classifiers can be seen by running <em>?crossValidate</em>. A feature selection step can be performed before classification using <em>nFeatures</em> and <em>selectionMethod</em>, which is a t-test by default. Similarly, the number of folds and number of repeats for cross-validation can be changed with the <em>nFolds</em> and <em>nRepeats</em> arguments. If desired, <em>nCores</em> can be specified to run the cross-validation in parallel. To perform 5-fold cross-validation of a Support Vector Machine with 2 repeats:</p>
163
+<div class="sourceCode" id="cb8"><pre class="downlit sourceCode r">
164
+<code class="sourceCode R"><span><span class="va">result</span> <span class="op">&lt;-</span> <span class="fu"><a href="../reference/crossValidate.html">crossValidate</a></span><span class="op">(</span><span class="va">measurements</span>, <span class="va">classes</span>, classifier <span class="op">=</span> <span class="st">"SVM"</span>,</span>
165
+<span>                        nFeatures <span class="op">=</span> <span class="fl">20</span>, nFolds <span class="op">=</span> <span class="fl">5</span>, nRepeats <span class="op">=</span> <span class="fl">2</span>, nCores <span class="op">=</span> <span class="fl">1</span><span class="op">)</span></span></code></pre></div>
166
+<pre><code><span><span class="co">## Processing sample set 10.</span></span></code></pre>
167
+<div class="sourceCode" id="cb10"><pre class="downlit sourceCode r">
168
+<code class="sourceCode R"><span><span class="fu"><a href="../reference/performancePlot.html">performancePlot</a></span><span class="op">(</span><span class="va">result</span><span class="op">)</span></span></code></pre></div>
169
+<pre><code><span><span class="co">## Warning in .local(results, ...): Balanced Accuracy not found in all elements of results. Calculating it now.</span></span></code></pre>
170
+<p><img src="ClassifyR_files/figure-html/unnamed-chunk-5-1.png" width="700"></p>
171
+<div class="section level3">
172
+<h3 id="data-integration-with-crossvalidate">Data Integration with crossValidate<a class="anchor" aria-label="anchor" href="#data-integration-with-crossvalidate"></a>
173
+</h3>
174
+<p><em>crossValidate</em> also allows data from multiple sources to be integrated into a single model. The integration method can be specified with the <em>multiViewMethod</em> argument. In this example, suppose the first 10 variables in the asthma data set are from a certain source and the remaining 1990 variables are from a second source. To integrate multiple data sets, each variable must be labeled with the data set it came from. This is done in a different manner depending on the data type of <em>measurements</em>.</p>
175
+<p>If using Bioconductor’s <em>DataFrame</em>, this can be specified using <em>mcols</em>. In the column metadata, each feature must have an <em>assay</em> and a <em>feature</em> name.</p>
176
+<div class="sourceCode" id="cb12"><pre class="downlit sourceCode r">
177
+<code class="sourceCode R"><span><span class="va">measurementsDF</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/pkg/S4Vectors/man/DataFrame-class.html" class="external-link">DataFrame</a></span><span class="op">(</span><span class="va">measurements</span><span class="op">)</span></span>
178
+<span><span class="fu"><a href="https://rdrr.io/pkg/S4Vectors/man/Vector-class.html" class="external-link">mcols</a></span><span class="op">(</span><span class="va">measurementsDF</span><span class="op">)</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/r/base/data.frame.html" class="external-link">data.frame</a></span><span class="op">(</span></span>
179
+<span>  assay <span class="op">=</span> <span class="fu"><a href="https://rdrr.io/r/base/rep.html" class="external-link">rep</a></span><span class="op">(</span><span class="fu"><a href="https://rdrr.io/r/base/c.html" class="external-link">c</a></span><span class="op">(</span><span class="st">"assay_1"</span>, <span class="st">"assay_2"</span><span class="op">)</span>, times <span class="op">=</span> <span class="fu"><a href="https://rdrr.io/r/base/c.html" class="external-link">c</a></span><span class="op">(</span><span class="fl">10</span>, <span class="fl">1990</span><span class="op">)</span><span class="op">)</span>,</span>
180
+<span>  feature <span class="op">=</span> <span class="fu"><a href="https://rdrr.io/r/base/colnames.html" class="external-link">colnames</a></span><span class="op">(</span><span class="va">measurementsDF</span><span class="op">)</span></span>
181
+<span><span class="op">)</span></span>
182
+<span></span>
183
+<span><span class="va">result</span> <span class="op">&lt;-</span> <span class="fu"><a href="../reference/crossValidate.html">crossValidate</a></span><span class="op">(</span><span class="va">measurementsDF</span>, <span class="va">classes</span>, classifier <span class="op">=</span> <span class="st">"SVM"</span>, nFolds <span class="op">=</span> <span class="fl">5</span>,</span>
184
+<span>                        nRepeats <span class="op">=</span> <span class="fl">3</span>, multiViewMethod <span class="op">=</span> <span class="st">"merge"</span><span class="op">)</span></span></code></pre></div>
185
+<pre><code><span><span class="co">## Processing sample set 10.</span></span>
186
+<span><span class="co">## Processing sample set 10.</span></span>
187
+<span><span class="co">## Processing sample set 10.</span></span></code></pre>
188
+<div class="sourceCode" id="cb14"><pre class="downlit sourceCode r">
189
+<code class="sourceCode R"><span><span class="fu"><a href="../reference/performancePlot.html">performancePlot</a></span><span class="op">(</span><span class="va">result</span>, characteristicsList <span class="op">=</span> <span class="fu"><a href="https://rdrr.io/r/base/list.html" class="external-link">list</a></span><span class="op">(</span>x <span class="op">=</span> <span class="st">"Assay Name"</span><span class="op">)</span><span class="op">)</span></span></code></pre></div>
190
+<pre><code><span><span class="co">## Warning in .local(results, ...): Balanced Accuracy not found in all elements of results. Calculating it now.</span></span></code></pre>
191
+<p><img src="ClassifyR_files/figure-html/unnamed-chunk-6-1.png" width="700"></p>
192
+<p>If using a list of <em>data.frame</em>s, the name of each element in the list will be used as the assay name.</p>
193
+<div class="sourceCode" id="cb16"><pre class="downlit sourceCode r">
194
+<code class="sourceCode R"><span><span class="co"># Assigns first 10 variables to assay_1, and the rest to assay_2</span></span>
195
+<span><span class="va">measurementsList</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/r/base/list.html" class="external-link">list</a></span><span class="op">(</span></span>
196
+<span>  <span class="op">(</span><span class="va">measurements</span> <span class="op">|&gt;</span> <span class="fu"><a href="https://rdrr.io/r/base/as.data.frame.html" class="external-link">as.data.frame</a></span><span class="op">(</span><span class="op">)</span><span class="op">)</span><span class="op">[</span><span class="fl">1</span><span class="op">:</span><span class="fl">10</span><span class="op">]</span>,</span>
197
+<span>  <span class="op">(</span><span class="va">measurements</span> <span class="op">|&gt;</span> <span class="fu"><a href="https://rdrr.io/r/base/as.data.frame.html" class="external-link">as.data.frame</a></span><span class="op">(</span><span class="op">)</span><span class="op">)</span><span class="op">[</span><span class="fl">11</span><span class="op">:</span><span class="fl">2000</span><span class="op">]</span></span>
198
+<span><span class="op">)</span></span>
199
+<span><span class="fu"><a href="https://rdrr.io/r/base/names.html" class="external-link">names</a></span><span class="op">(</span><span class="va">measurementsList</span><span class="op">)</span> <span class="op">&lt;-</span> <span class="fu"><a href="https://rdrr.io/r/base/c.html" class="external-link">c</a></span><span class="op">(</span><span class="st">"assay_1"</span>, <span class="st">"assay_2"</span><span class="op">)</span></span>
200
+<span></span>
201
+<span><span class="va">result</span> <span class="op">&lt;-</span> <span class="fu"><a href="../reference/crossValidate.html">crossValidate</a></span><span class="op">(</span><span class="va">measurementsList</span>, <span class="va">classes</span>, classifier <span class="op">=</span> <span class="st">"SVM"</span>, nFolds <span class="op">=</span> <span class="fl">5</span>,</span>
202
+<span>                        nRepeats <span class="op">=</span> <span class="fl">3</span>, multiViewMethod <span class="op">=</span> <span class="st">"merge"</span><span class="op">)</span></span></code></pre></div>
203
+<pre><code><span><span class="co">## Processing sample set 10.</span></span>
204
+<span><span class="co">## Processing sample set 10.</span></span>
205
+<span><span class="co">## Processing sample set 10.</span></span></code></pre>
206
+<div class="sourceCode" id="cb18"><pre class="downlit sourceCode r">
207
+<code class="sourceCode R"><span><span class="fu"><a href="../reference/performancePlot.html">performancePlot</a></span><span class="op">(</span><span class="va">result</span>, characteristicsList <span class="op">=</span> <span class="fu"><a href="https://rdrr.io/r/base/list.html" class="external-link">list</a></span><span class="op">(</span>x <span class="op">=</span> <span class="st">"Assay Name"</span><span class="op">)</span><span class="op">)</span></span></code></pre></div>
208
+<pre><code><span><span class="co">## Warning in .local(results, ...): Balanced Accuracy not found in all elements of results. Calculating it now.</span></span></code></pre>
209
+<p><img src="ClassifyR_files/figure-html/unnamed-chunk-7-1.png" width="700"></p>
282 210
 </div>
283 211
 </div>
284
-<div id="a-more-detailed-look-at-classifyr" class="section level2">
285
-<h2>A More Detailed Look at ClassifyR</h2>
286
-<p>In the following sections, some of the most useful functions provided
287
-in <strong>ClassifyR</strong> will be demonstrated. However, a user
288
-could wrap any feature selection, training, or prediction function to
289
-the classification framework, as long as it meets some simple rules
290
-about the input and return parameters. See the appendix section of this
291
-guide titled “Rules for New Functions” for a description of these.</p>
292
-<div id="comparison-to-existing-classification-frameworks"
293
-class="section level3">
294
-<h3>Comparison to Existing Classification Frameworks</h3>
295
-<p>There are a few other frameworks for classification in R. The table
296
-below provides a comparison of which features they offer.</p>
297
-<table>
212
+<div class="section level2">
213
+<h2 id="a-more-detailed-look-at-classifyr">A More Detailed Look at ClassifyR<a class="anchor" aria-label="anchor" href="#a-more-detailed-look-at-classifyr"></a>
214
+</h2>
215
+<p>In the following sections, some of the most useful functions provided in <strong>ClassifyR</strong> will be demonstrated. However, a user could wrap any feature selection, training, or prediction function for use with the classification framework, as long as it meets some simple rules about the input and return parameters. See the appendix section of this guide titled “Rules for New Functions” for a description of these.</p>
216
+<div class="section level3">
217
+<h3 id="comparison-to-existing-classification-frameworks">Comparison to Existing Classification Frameworks<a class="anchor" aria-label="anchor" href="#comparison-to-existing-classification-frameworks"></a>
218
+</h3>
219
+<p>There are a few other frameworks for classification in R. The table below provides a comparison of which features they offer.</p>
220
+<table class="table">
298 221
 <colgroup>
299
-<col width="8%" />
300
-<col width="10%" />
301
-<col width="8%" />
302
-<col width="10%" />
303
-<col width="10%" />
304
-<col width="11%" />
305
-<col width="14%" />
306
-<col width="12%" />
307
-<col width="12%" />
222
+<col width="8%">
223
+<col width="10%">
224
+<col width="8%">
225
+<col width="10%">
226
+<col width="10%">
227
+<col width="11%">
228
+<col width="14%">
229
+<col width="12%">
230
+<col width="12%">
308 231
 </colgroup>
309
-<thead>
310
-<tr class="header">
232
+<thead><tr class="header">
311 233
 <th>Package</th>
312 234
 <th>Run User-defined Classifiers</th>
313 235
 <th>Parallel Execution on any OS</th>
... ...
@@ -317,8 +239,7 @@ below provides a comparison of which features they offer.</p>
317 239
 <th>Class Distribution Plot</th>
318 240
 <th>Sample-wise Error Heatmap</th>
319 241
 <th>Direct Support for MultiAssayExperiment Input</th>
320
-</tr>
321
-</thead>
242
+</tr></thead>
322 243
 <tbody>
323 244
 <tr class="odd">
324 245
 <td><strong>ClassifyR</strong></td>
... ...
@@ -378,109 +299,89 @@ below provides a comparison of which features they offer.</p>
378 299
 </tbody>
379 300
 </table>
380 301
 </div>
381
-<div id="provided-functionality" class="section level3">
382
-<h3>Provided Functionality</h3>
383
-<p>Although being a cross-validation framework, a number of popular
384
-feature selection and classification functions are provided by the
385
-package which meet the requirements of functions to be used by it (see
386
-the last section).</p>
387
-<div id="provided-methods-for-feature-selection-and-classification"
388
-class="section level4">
389
-<h4>Provided Methods for Feature Selection and Classification</h4>
390
-<p>In the following tables, a function that is used when no function is
391
-explicitly specified by the user is shown as <span
392
-style="padding:4px; border:2px dashed #e64626;">functionName</span>.</p>
393
-<p>The functions below produce a ranking, of which different size
394
-subsets are tried and the classifier performance evaluated, to select a
395
-best subset of features, based on a criterion such as balanced accuracy
396
-rate, for example.</p>
397
-<table style="width:100%;">
302
+<div class="section level3">
303
+<h3 id="provided-functionality">Provided Functionality<a class="anchor" aria-label="anchor" href="#provided-functionality"></a>
304
+</h3>
305
+<p>Although the package is a cross-validation framework, it also provides a number of popular feature selection and classification functions which meet the requirements of functions to be used by it (see the last section).</p>
306
+<div class="section level4">
307
+<h4 id="provided-methods-for-feature-selection-and-classification">Provided Methods for Feature Selection and Classification<a class="anchor" aria-label="anchor" href="#provided-methods-for-feature-selection-and-classification"></a>
308
+</h4>
309
+<p>In the following tables, a function that is used when no function is explicitly specified by the user is shown as <span style="padding:4px; border:2px dashed #e64626;">functionName</span>.</p>
310
+<p>The functions below produce a ranking of features. Subsets of different sizes are tried and the classifier performance is evaluated for each, so that a best subset of features can be selected based on a criterion such as balanced accuracy rate.</p>
311
+<table style="width:100%;" class="table">
398 312
 <colgroup>
399
-<col width="9%" />
400
-<col width="62%" />
401
-<col width="9%" />
402
-<col width="9%" />
403
-<col width="9%" />
313
+<col width="9%">
314
+<col width="62%">
315
+<col width="9%">
316
+<col width="9%">
317
+<col width="9%">
404 318
 </colgroup>
405
-<thead>
406
-<tr class="header">
319
+<thead><tr class="header">
407 320
 <th>Function</th>
408 321
 <th>Description</th>
409 322
 <th>DM</th>
410 323
 <th>DV</th>
411 324
 <th>DD</th>
412
-</tr>
413
-</thead>
325
+</tr></thead>
414 326
 <tbody>
415 327
 <tr class="odd">
416
-<td><span
417
-style="padding:4px; border:2px dashed #e64626; font-family: &#39;Courier New&#39;, monospace;">differentMeansRanking</span></td>
328
+<td><span style="padding:4px; border:2px dashed #e64626; font-family: 'Courier New', monospace;">differentMeansRanking</span></td>
418 329
 <td>t-test ranking if two classes, F-test ranking if three or more</td>
419 330
 <td>✔</td>
420 331
 <td></td>
421 332
 <td></td>
422 333
 </tr>
423 334
 <tr class="even">
424
-<td><span
425
-style="font-family: &#39;Courier New&#39;, monospace;">limmaRanking</span></td>
335
+<td><span style="font-family: 'Courier New', monospace;">limmaRanking</span></td>
426 336
 <td>Moderated t-test ranking using variance shrinkage</td>
427 337
 <td>✔</td>
428 338
 <td></td>
429 339
 <td></td>
430 340
 </tr>
431 341
 <tr class="odd">
432
-<td><span
433
-style="font-family: &#39;Courier New&#39;, monospace;">edgeRranking</span></td>
342
+<td><span style="font-family: 'Courier New', monospace;">edgeRranking</span></td>
434 343
 <td>Likelihood ratio test for count data ranking</td>
435 344
 <td>✔</td>
436 345
 <td></td>
437 346
 <td></td>
438 347
 </tr>
439 348
 <tr class="even">
440
-<td><span
441
-style="font-family: &#39;Courier New&#39;, monospace;">bartlettRanking</span></td>
349
+<td><span style="font-family: 'Courier New', monospace;">bartlettRanking</span></td>
442 350
 <td>Bartlett’s test non-robust ranking</td>
443 351
 <td></td>
444 352
 <td>✔</td>
445 353
 <td></td>
446 354
 </tr>
447 355
 <tr class="odd">
448
-<td><span
449
-style="font-family: &#39;Courier New&#39;, monospace;">leveneRanking</span></td>
356
+<td><span style="font-family: 'Courier New', monospace;">leveneRanking</span></td>
450 357
 <td>Levene’s test robust ranking</td>
451 358
 <td></td>
452 359
 <td>✔</td>
453 360
 <td></td>
454 361
 </tr>
455 362
 <tr class="even">
456
-<td><span
457
-style="font-family: &#39;Courier New&#39;, monospace;">DMDranking</span></td>
458
-<td><span style="white-space: nowrap">Difference in location
459
-(mean/median) and/or scale (SD, MAD, <span
460
-class="math inline">\(Q_n\)</span>)</span></td>
363
+<td><span style="font-family: 'Courier New', monospace;">DMDranking</span></td>
364
+<td><span style="white-space: nowrap">Difference in location (mean/median) and/or scale (SD, MAD, <span class="math inline">\(Q_n\)</span>)</span></td>
461 365
 <td>✔</td>
462 366
 <td>✔</td>
463 367
 <td>✔</td>
464 368
 </tr>
465 369
 <tr class="odd">
466
-<td><span
467
-style="font-family: &#39;Courier New&#39;, monospace;">likelihoodRatioRanking</span></td>
370
+<td><span style="font-family: 'Courier New', monospace;">likelihoodRatioRanking</span></td>
468 371
 <td>Likelihood ratio (normal distribution) ranking</td>
469 372
 <td>✔</td>
470 373
 <td>✔</td>
471 374
 <td>✔</td>
472 375
 </tr>
473 376
 <tr class="even">
474
-<td><span
475
-style="font-family: &#39;Courier New&#39;, monospace;">KolmogorovSmirnovRanking</span></td>
377
+<td><span style="font-family: 'Courier New', monospace;">KolmogorovSmirnovRanking</span></td>
476 378
 <td>Kolmogorov-Smirnov distance between distributions ranking</td>
477 379
 <td>✔</td>
478 380
 <td>✔</td>
479 381
 <td>✔</td>
480 382
 </tr>
481 383
 <tr class="odd">
482
-<td><span
483
-style="font-family: &#39;Courier New&#39;, monospace;">KullbackLeiblerRanking</span></td>
384
+<td><span style="font-family: 'Courier New', monospace;">KullbackLeiblerRanking</span></td>
484 385
 <td>Kullback-Leibler distance between distributions ranking</td>
485 386
 <td>✔</td>
486 387
 <td>✔</td>
... ...
@@ -489,213 +390,164 @@ style="font-family: &#39;Courier New&#39;, monospace;">KullbackLeiblerRanking</s
489 390
 </tbody>
490 391
 </table>
491 392
 <p>Likewise, a variety of classifiers is also provided.</p>
492
-<table>
393
+<table class="table">
493 394
 <colgroup>
494
-<col width="9%" />
495
-<col width="61%" />
496
-<col width="9%" />
497
-<col width="9%" />
498
-<col width="9%" />
395
+<col width="9%">
396
+<col width="61%">
397
+<col width="9%">
398
+<col width="9%">
399
+<col width="9%">
499 400
 </colgroup>
500
-<thead>
501
-<tr class="header">
401
+<thead><tr class="header">
502 402
 <th>Function(s)</th>
503 403
 <th>Description</th>
504 404
 <th>DM</th>
505 405
 <th>DV</th>
506 406
 <th>DD</th>
507
-</tr>
508
-</thead>
407
+</tr></thead>
509 408
 <tbody>
510 409
 <tr class="odd">
511
-<td><span
512
-style="padding:1px; border:2px dashed #e64626; display:inline-block; margin-bottom: 3px; font-family: &#39;Courier New&#39;, monospace;">DLDAtrainInterface</span>,<br><span
513
-style="padding:1px; border:2px dashed #e64626; display:inline-block; font-family: &#39;Courier New&#39;, monospace;">DLDApredictInterface</span></td>
514
-<td>Wrappers for sparsediscrim’s functions <span
515
-style="font-family: &#39;Courier New&#39;, monospace;">dlda</span> and
516
-<span
517
-style="font-family: &#39;Courier New&#39;, monospace;">predict.dlda</span>
518
-functions</td>
410
+<td>
411
+<span style="padding:1px; border:2px dashed #e64626; display:inline-block; margin-bottom: 3px; font-family: 'Courier New', monospace;">DLDAtrainInterface</span>,<br><span style="padding:1px; border:2px dashed #e64626; display:inline-block; font-family: 'Courier New', monospace;">DLDApredictInterface</span>
412
+</td>
413
+<td>Wrappers for sparsediscrim’s functions <span style="font-family: 'Courier New', monospace;">dlda</span> and <span style="font-family: 'Courier New', monospace;">predict.dlda</span></td>
519 414
 <td>✔</td>
520 415
 <td></td>
521 416
 <td></td>
522 417
 </tr>
523 418
 <tr class="even">
524
-<td><span
525
-style="font-family: &#39;Courier New&#39;, monospace;">classifyInterface</span></td>
526
-<td>Wrapper for PoiClaClu’s Poisson LDA function <span
527
-style="font-family: &#39;Courier New&#39;, monospace;">classify</span></td>
419
+<td><span style="font-family: 'Courier New', monospace;">classifyInterface</span></td>
420
+<td>Wrapper for PoiClaClu’s Poisson LDA function <span style="font-family: 'Courier New', monospace;">classify</span>
421
+</td>
528 422
 <td>✔</td>
529 423
 <td></td>
530 424
 <td></td>
531 425
 </tr>
532 426
 <tr class="odd">
533
-<td><span
534
-style="font-family: &#39;Courier New&#39;, monospace;">elasticNetGLMtrainInterface</span>,
535
-<span
536
-style="font-family: &#39;Courier New&#39;, monospace;">elasticNetGLMpredictInterface</span></td>
537
-<td>Wrappers for glmnet’s elastic net GLM functions <span
538
-style="font-family: &#39;Courier New&#39;, monospace;">glmnet</span> and
539
-<span
540
-style="font-family: &#39;Courier New&#39;, monospace;">predict.glmnet</span></td>
427
+<td>
428
+<span style="font-family: 'Courier New', monospace;">elasticNetGLMtrainInterface</span>, <span style="font-family: 'Courier New', monospace;">elasticNetGLMpredictInterface</span>
429
+</td>
430
+<td>Wrappers for glmnet’s elastic net GLM functions <span style="font-family: 'Courier New', monospace;">glmnet</span> and <span style="font-family: 'Courier New', monospace;">predict.glmnet</span>
431
+</td>
541 432
 <td>✔</td>
542 433
 <td></td>
543 434
 <td></td>
544 435
 </tr>
545 436
 <tr class="even">
546
-<td><span
547
-style="font-family: &#39;Courier New&#39;, monospace;">NSCtrainInterface</span>,
548
-<span
549
-style="font-family: &#39;Courier New&#39;, monospace;">NSCpredictInterface</span></td>
550
-<td>Wrappers for pamr’s Nearest Shrunken Centroid functions <span
551
-style="font-family: &#39;Courier New&#39;, monospace;">pamr.train</span>
552
-and <span
553
-style="font-family: &#39;Courier New&#39;, monospace;">pamr.predict</span></td>
437
+<td>
438
+<span style="font-family: 'Courier New', monospace;">NSCtrainInterface</span>, <span style="font-family: 'Courier New', monospace;">NSCpredictInterface</span>
439
+</td>
440
+<td>Wrappers for pamr’s Nearest Shrunken Centroid functions <span style="font-family: 'Courier New', monospace;">pamr.train</span> and <span style="font-family: 'Courier New', monospace;">pamr.predict</span>
441
+</td>
554 442
 <td>✔</td>
555 443
 <td></td>
556 444
 <td></td>
557 445
 </tr>
558 446
 <tr class="odd">
559
-<td><span
560
-style="font-family: &#39;Courier New&#39;, monospace;">fisherDiscriminant</span></td>
447
+<td><span style="font-family: 'Courier New', monospace;">fisherDiscriminant</span></td>
561 448
 <td>Implementation of Fisher’s LDA for departures from normality</td>
562 449
 <td>✔</td>
563 450
 <td>✔*</td>
564 451
 <td></td>
565 452
 </tr>
566 453
 <tr class="even">
567
-<td><span
568
-style="font-family: &#39;Courier New&#39;, monospace;">mixModelsTrain</span>,
569
-<span
570
-style="font-family: &#39;Courier New&#39;, monospace;">mixModelsPredict</span></td>
454
+<td>
455
+<span style="font-family: 'Courier New', monospace;">mixModelsTrain</span>, <span style="font-family: 'Courier New', monospace;">mixModelsPredict</span>
456
+</td>
571 457
 <td>Feature-wise mixtures of normals and voting</td>
572 458
 <td>✔</td>
573 459
 <td>✔</td>
574 460
 <td>✔</td>
575 461
 </tr>
576 462
 <tr class="odd">
577
-<td><span
578
-style="font-family: &#39;Courier New&#39;, monospace;">naiveBayesKernel</span></td>
463
+<td><span style="font-family: 'Courier New', monospace;">naiveBayesKernel</span></td>
579 464
 <td>Feature-wise kernel density estimation and voting</td>
580 465
 <td>✔</td>
581 466
 <td>✔</td>
582 467
 <td>✔</td>
583 468
 </tr>
584 469
 <tr class="even">
585
-<td><span
586
-style="font-family: &#39;Courier New&#39;, monospace;">randomForestTrainInterface</span>,
587
-<span
588
-style="font-family: &#39;Courier New&#39;, monospace;">randomForestPredictInterface</span></td>
589
-<td>Wrapper for ranger’s functions <span
590
-style="font-family: &#39;Courier New&#39;, monospace;">ranger</span> and
591
-<span
592
-style="font-family: &#39;Courier New&#39;, monospace;">predict</span></td>
470
+<td>
471
+<span style="font-family: 'Courier New', monospace;">randomForestTrainInterface</span>, <span style="font-family: 'Courier New', monospace;">randomForestPredictInterface</span>
472
+</td>
473
+<td>Wrapper for ranger’s functions <span style="font-family: 'Courier New', monospace;">ranger</span> and <span style="font-family: 'Courier New', monospace;">predict</span>
474
+</td>
593 475
 <td>✔</td>
594 476
 <td>✔</td>
595 477
 <td>✔</td>
596 478
 </tr>
597 479
 <tr class="odd">
598
-<td><span
599
-style="font-family: &#39;Courier New&#39;, monospace;">extremeGradientBoostingTrainInterface</span>,
600
-<span
601
-style="font-family: &#39;Courier New&#39;, monospace;">extremeGradientBoostingPredictInterface</span></td>
602
-<td>Wrapper for xgboost’s functions <span
603
-style="font-family: &#39;Courier New&#39;, monospace;">xgboost</span>
604
-and <span
605
-style="font-family: &#39;Courier New&#39;, monospace;">predict</span></td>
480
+<td>
481
+<span style="font-family: 'Courier New', monospace;">extremeGradientBoostingTrainInterface</span>, <span style="font-family: 'Courier New', monospace;">extremeGradientBoostingPredictInterface</span>
482
+</td>
483
+<td>Wrapper for xgboost’s functions <span style="font-family: 'Courier New', monospace;">xgboost</span> and <span style="font-family: 'Courier New', monospace;">predict</span>
484
+</td>
606 485
 <td>✔</td>
607 486
 <td>✔</td>
608 487
 <td>✔</td>
609 488
 </tr>
610 489
 <tr class="even">
611
-<td><span
612
-style="font-family: &#39;Courier New&#39;, monospace;">kNNinterface</span></td>
613
-<td>Wrapper for class’s function <span
614
-style="font-family: &#39;Courier New&#39;, monospace;">knn</span></td>
490
+<td><span style="font-family: 'Courier New', monospace;">kNNinterface</span></td>
491
+<td>Wrapper for class’s function <span style="font-family: 'Courier New', monospace;">knn</span>
492
+</td>
615 493
 <td>✔</td>
616 494
 <td>✔</td>
617 495
 <td>✔</td>
618 496
 </tr>
619 497
 <tr class="odd">
620
-<td><span
621
-style="font-family: &#39;Courier New&#39;, monospace;">SVMtrainInterface</span>,
622
-<span
623
-style="font-family: &#39;Courier New&#39;, monospace;">SVMpredictInterface</span></td>
624
-<td>Wrapper for e1071’s functions <span
625
-style="font-family: &#39;Courier New&#39;, monospace;">svm</span> and
626
-<span
627
-style="font-family: &#39;Courier New&#39;, monospace;">predict.svm</span></td>
498
+<td>
499
+<span style="font-family: 'Courier New', monospace;">SVMtrainInterface</span>, <span style="font-family: 'Courier New', monospace;">SVMpredictInterface</span>
500
+</td>
501
+<td>Wrapper for e1071’s functions <span style="font-family: 'Courier New', monospace;">svm</span> and <span style="font-family: 'Courier New', monospace;">predict.svm</span>
502
+</td>
628 503
 <td>✔</td>
629 504
 <td>✔ †</td>
630 505
 <td>✔ †</td>
631 506
 </tr>
632 507
 </tbody>
633 508
 </table>
634
-<p>* If ordinary numeric measurements have been transformed to absolute
635
-deviations using <span
636
-style="font-family: &#39;Courier New&#39;, monospace;">subtractFromLocation</span>.<br>
637
-† If the value of <span
638
-style="font-family: &#39;Courier New&#39;, monospace;">kernel</span> is
639
-not <span
640
-style="font-family: &#39;Courier New&#39;, monospace;">“linear”</span>.</p>
641
-<p>If a desired selection or classification method is not already
642
-implemented, rules for writing functions to work with
643
-<strong>ClassifyR</strong> are outlined in the wrapper vignette. Please
644
-visit it for more information.</p>
509
+<p>* If ordinary numeric measurements have been transformed to absolute deviations using <span style="font-family: 'Courier New', monospace;">subtractFromLocation</span>.<br> † If the value of <span style="font-family: 'Courier New', monospace;">kernel</span> is not <span style="font-family: 'Courier New', monospace;">"linear"</span>.</p>
510
+<p>If a desired selection or classification method is not already implemented, rules for writing functions to work with <strong>ClassifyR</strong> are outlined in the wrapper vignette. Please visit it for more information.</p>
645 511
 </div>
646
-<div id="provided-meta-feature-methods" class="section level4">
647
-<h4>Provided Meta-feature Methods</h4>
648
-<p>A number of methods are provided for users to enable classification
649
-in a feature-set-centric or interactor-centric way. The meta-feature
650
-creation functions should be used before cross-validation is done.</p>
651
-<table>
512
+<div class="section level4">
513
+<h4 id="provided-meta-feature-methods">Provided Meta-feature Methods<a class="anchor" aria-label="anchor" href="#provided-meta-feature-methods"></a>
514
+</h4>
515
+<p>A number of methods are provided for users to enable classification in a feature-set-centric or interactor-centric way. The meta-feature creation functions should be used before cross-validation is done.</p>
516
+<table class="table">
652 517
 <colgroup>
653
-<col width="9%" />
654
-<col width="61%" />
655
-<col width="14%" />
656
-<col width="14%" />
518
+<col width="9%">
519
+<col width="61%">
520
+<col width="14%">
521
+<col width="14%">
657 522
 </colgroup>
658
-<thead>
659
-<tr class="header">
523
+<thead><tr class="header">
660 524
 <th>Function</th>
661 525
 <th>Description</th>
662 526
 <th align="center">Before CV</th>
663 527
 <th align="center">During CV</th>
664
-</tr>
665
-</thead>
528
+</tr></thead>
666 529
 <tbody>
667 530
 <tr class="odd">
668
-<td><span
669
-style="font-family: &#39;Courier New&#39;, monospace;">edgesToHubNetworks</span></td>
670
-<td>Takes a two-column <span
671
-style="font-family: &#39;Courier New&#39;, monospace;">matrix</span> or
672
-<span
673
-style="font-family: &#39;Courier New&#39;, monospace;">DataFrame</span>
674
-and finds all nodes with at least a minimum number of interactions</td>
531
+<td><span style="font-family: 'Courier New', monospace;">edgesToHubNetworks</span></td>
532
+<td>Takes a two-column <span style="font-family: 'Courier New', monospace;">matrix</span> or <span style="font-family: 'Courier New', monospace;">DataFrame</span> and finds all nodes with at least a minimum number of interactions</td>
675 533
 <td align="center">✔</td>
676 534
 <td align="center"></td>
677 535
 </tr>
678 536
 <tr class="even">
679
-<td><span
680
-style="font-family: &#39;Courier New&#39;, monospace;">featureSetSummary</span></td>
681
-<td><span style="white-space: nowrap">Considers sets of features and
682
-calculates their mean or median</span></td>
537
+<td><span style="font-family: 'Courier New', monospace;">featureSetSummary</span></td>
538
+<td><span style="white-space: nowrap">Considers sets of features and calculates their mean or median</span></td>
683 539
 <td align="center">✔</td>
684 540
 <td align="center"></td>
685 541
 </tr>
686 542
 <tr class="odd">
687
-<td><span
688
-style="font-family: &#39;Courier New&#39;, monospace;">pairsDifferencesSelection</span></td>
689
-<td>Finds a set of pairs of features whose measurement inequalities can
690
-be used for predicting with</td>
543
+<td><span style="font-family: 'Courier New', monospace;">pairsDifferencesSelection</span></td>
544
+<td>Finds a set of pairs of features whose measurement inequalities can be used for prediction</td>
691 545
 <td align="center"></td>
692 546
 <td align="center">✔</td>
693 547
 </tr>
694 548
 <tr class="even">
695
-<td><span
696
-style="font-family: &#39;Courier New&#39;, monospace;">kTSPclassifier</span></td>
697
-<td>Voting classifier that uses inequalities between pairs of features
698
-to vote for one of two classes</td>
549
+<td><span style="font-family: 'Courier New', monospace;">kTSPclassifier</span></td>
550
+<td>Voting classifier that uses inequalities between pairs of features to vote for one of two classes</td>
699 551
 <td align="center"></td>
700 552
 <td align="center">✔</td>
701 553
 </tr>
... ...
@@ -703,590 +555,459 @@ to vote for one of two classes</td>
703 555
 </table>
704 556
 </div>
705 557
 </div>
706
-<div id="fine-grained-cross-validation-and-modelling-using-runtests"
707
-class="section level3">
708
-<h3>Fine-grained Cross-validation and Modelling Using
709
-<em>runTests</em></h3>
710
-<p>For more control over the finer aspects of cross-validation of a
711
-single data set, <em>runTests</em> may be employed in place of
712
-<em>crossValidate</em>. For the variety of cross-validation, the
713
-parameters are specified by a <em>CrossValParams</em> object. The
714
-default setting is for 100 permutations and five folds and parameter
715
-tuning is done by resubstitution. It is also recommended to specify a
716
-<em>parallelParams</em> setting. On Linux and MacOS operating systems,
717
-it should be <em>MulticoreParam</em> and on Windows computers it should
718
-be <em>SnowParam</em>. Note that each of these have an option
719
-<em>RNGseed</em> and this <strong>needs to be set by the user</strong>
720
-because some classifiers or feature selection functions will have some
721
-element of randomisation. One example that works on all operating
722
-systems, but is best-suited to Windows is:</p>
723
-<div class="sourceCode" id="cb20"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb20-1"><a href="#cb20-1" aria-hidden="true" tabindex="-1"></a>CVparams <span class="ot">&lt;-</span> <span class="fu">CrossValParams</span>(<span class="at">parallelParams =</span> <span class="fu">SnowParam</span>(<span class="dv">16</span>, <span class="at">RNGseed =</span> <span class="dv">123</span>))</span>
724
-<span id="cb20-2"><a href="#cb20-2" aria-hidden="true" tabindex="-1"></a>CVparams</span></code></pre></div>
725
-<p>For the actual operations to do to the data to build a model of it,
726
-each of the stages should be specified by an object of class
727
-<em>ModellingParams</em>. This controls how class imbalance is handled
728
-(default is to downsample to the smallest class), any transformation
729
-that needs to be done inside of cross-validation (i.e. involving a
730
-computed value from the training set), any feature selection and the
731
-training and prediction functions to be used. The default is to do an
732
-ordinary t-test (two groups) or ANOVA (three or more groups) and
733
-classification using diagonal LDA.</p>
734
-<div class="sourceCode" id="cb21"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb21-1"><a href="#cb21-1" aria-hidden="true" tabindex="-1"></a><span class="fu">ModellingParams</span>()</span></code></pre></div>
735
-<pre><code>## An object of class &quot;ModellingParams&quot;
736
-## Slot &quot;balancing&quot;:
737
-## [1] &quot;downsample&quot;
738
-## 
739
-## Slot &quot;transformParams&quot;:
740
-## NULL
741
-## 
742
-## Slot &quot;selectParams&quot;:
743
-## An object of class &#39;SelectParams&#39;.
744
-## Selection Name: Difference in Means.
745
-## 
746
-## Slot &quot;trainParams&quot;:
747
-## An object of class &#39;TrainParams&#39;.
748
-## Classifier Name: Diagonal LDA.
749
-## 
750
-## Slot &quot;predictParams&quot;:
751
-## An object of class &#39;PredictParams&#39;.
752
-## 
753
-## Slot &quot;doImportance&quot;:
754
-## [1] FALSE</code></pre>
558
+<div class="section level3">
559
+<h3 id="fine-grained-cross-validation-and-modelling-using-runtests">Fine-grained Cross-validation and Modelling Using <em>runTests</em><a class="anchor" aria-label="anchor" href="#fine-grained-cross-validation-and-modelling-using-runtests"></a>
560
+</h3>
561
+<p>For more control over the finer aspects of cross-validation of a single data set, <em>runTests</em> may be employed in place of <em>crossValidate</em>. The variety of cross-validation and its parameters are specified by a <em>CrossValParams</em> object. The default setting is 100 permutations and five folds, with parameter tuning done by resubstitution. It is also recommended to specify a <em>parallelParams</em> setting. On Linux and macOS operating systems, it should be <em>MulticoreParam</em> and on Windows computers it should be <em>SnowParam</em>. Note that each of these has an option <em>RNGseed</em> and this <strong>needs to be set by the user</strong> because some classifiers or feature selection functions will have some element of randomisation. One example that works on all operating systems, but is best-suited to Windows, is:</p>
562
+<div class="sourceCode" id="cb20"><pre class="downlit sourceCode r">
563
+<code class="sourceCode R"><span><span class="va">CVparams</span> <span class="op">&lt;-</span> <span class="fu"><a href="../reference/CrossValParams-class.html">CrossValParams</a></span><span class="op">(</span>parallelParams <span class="op">=</span> <span class="fu"><a href="https://rdrr.io/pkg/BiocParallel/man/SnowParam-class.html" class="external-link">SnowParam</a></span><span class="op">(</span><span class="fl">16</span>, RNGseed <span class="op">=</span> <span class="fl">123</span><span class="op">)</span><span class="op">)</span></span>
564
+<span><span class="va">CVparams</span></span></code></pre></div>
565
+<p>The actual operations performed on the data to build a model are specified, stage by stage, by an object of class <em>ModellingParams</em>. This controls how class imbalance is handled (default is to downsample to the smallest class), any transformation that needs to be done inside of cross-validation (i.e. involving a computed value from the training set), any feature selection, and the training and prediction functions to be used. The default is to do an ordinary t-test (two groups) or ANOVA (three or more groups) for feature selection and classification using diagonal LDA.</p>
566
+<div class="sourceCode" id="cb21"><pre class="downlit sourceCode r">
567
+<code class="sourceCode R"><span><span class="fu"><a href="../reference/ModellingParams-class.html">ModellingParams</a></span><span class="op">(</span><span class="op">)</span></span></code></pre></div>
568
+<pre><code><span><span class="co">## An object of class "ModellingParams"</span></span>
569
+<span><span class="co">## Slot "balancing":</span></span>
570
+<span><span class="co">## [1] "downsample"</span></span>
571
+<span><span class="co">## </span></span>
572
+<span><span class="co">## Slot "transformParams":</span></span>
573
+<span><span class="co">## NULL</span></span>
574
+<span><span class="co">## </span></span>
575
+<span><span class="co">## Slot "selectParams":</span></span>
576
+<span><span class="co">## An object of class 'SelectParams'.</span></span>
577
+<span><span class="co">## Selection Name: Difference in Means.</span></span>
578
+<span><span class="co">## </span></span>
579
+<span><span class="co">## Slot "trainParams":</span></span>
580
+<span><span class="co">## An object of class 'TrainParams'.</span></span>
581
+<span><span class="co">## Classifier Name: Diagonal LDA.</span></span>
582
+<span><span class="co">## </span></span>
583
+<span><span class="co">## Slot "predictParams":</span></span>