Merge pull request #1 from wmm27/master

FoldIndexR addition

wmm27 authored on 15/03/2022 02:38:38 • GitHub committed on 15/03/2022 02:38:38
Showing 18 changed files

... ...
@@ -1,7 +1,7 @@
1 1
 Package: idpr
2 2
 Type: Package
3 3
 Title: Profiling and Analyzing Intrinsically Disordered Proteins in R
4
-Version: 1.0.007
4
+Version: 1.6.1
5 5
 Authors@R: c(person(c("William", "M."), "McFadden", 
6 6
                 email = "wmm27@pitt.edu", 
7 7
                 role = c("cre", "aut")),
... ...
@@ -23,9 +23,9 @@ License: LGPL-3
23 23
 Encoding: UTF-8
24 24
 LazyData: true
25 25
 biocViews: StructuralPrediction, Proteomics, CellBiology
26
-RoxygenNote: 7.1.1
26
+RoxygenNote: 7.1.2
27 27
 Depends: 
28
-    R (>= 4.0.0)
28
+    R (>= 4.1.3)
29 29
 Imports: 
30 30
     ggplot2 (>= 3.3.0),
31 31
     magrittr (>= 1.5),
... ...
@@ -3,6 +3,7 @@
3 3
 export(chargeCalculationGlobal)
4 4
 export(chargeCalculationLocal)
5 5
 export(chargeHydropathyPlot)
6
+export(foldIndexR)
6 7
 export(hendersonHasselbalch)
7 8
 export(idprofile)
8 9
 export(iupred)
9 10
new file mode 100644
... ...
@@ -0,0 +1,131 @@
1
+#' Prediction of Intrinsic Disorder with FoldIndex method in R
2
+#'
3
+#' This is used to calculate the prediction of intrinsic disorder based on
4
+#'   the scaled hydropathy and absolute net charge of an amino acid
5
+#'   sequence using a sliding window. Prilusky, Felder, et al. (2005)
6
+#'   described this relationship and implemented it graphically as FoldIndex,
7
+#'   and the method has since been incorporated
8
+#'   into multiple disorder prediction programs. When windows have a negative 
9
+#'   score (<0) sequences are predicted as disordered. 
10
+#'   When windows have a positive score (>0) sequences are predicted as 
11
+#'   ordered. Graphically, this cutoff is displayed by the dashed 
12
+#'   line at y = 0. Calculations are at pH 7.0 based on the described method and
13
+#'   the default is a sliding window of size 51. 
14
+#'   
15
+#'   The output is either a data frame or graph
16
+#'   showing the calculated scores for each window along the sequence.
17
+#'   The equation used was originally described in Uversky et al. (2000)\cr
18
+#'   \url{https://doi.org/10.1002/1097-0134(20001115)41:3<415::AID-PROT130>3.0.CO;2-7}
19
+#'   . \cr
20
+#'   
21
+#'   The FoldIndex method of using a sliding window and utilizing the Uversky 
22
+#'   equation is described in Prilusky, J., Felder, C. E., et al. (2005). \cr
23
+#'   FoldIndex: a simple tool to predict whether a given protein sequence \cr 
24
+#'   is intrinsically unfolded. Bioinformatics, 21(16), 3435-3438. \cr
25
+#'   
26
+#'   
27
+#' @inheritParams sequenceCheck
28
+#' @inheritParams chargeCalculationLocal
29
+#' @param window a positive, odd integer. 51 by default.
30
+#'   Sets the size of sliding window, must be an odd number.
31
+#'   The window determines the number of residues to be analyzed and averaged
32
+#'   for each position along the sequence.
33
+#' @param plotResults logical value, TRUE by default.
34
+#'   If \code{plotResults = TRUE} a plot will be the output.
35
+#'   If \code{plotResults = FALSE} the output is a data frame with scores for
36
+#'   each window analyzed.
37
+#' @param proteinName character string with length = 1.
38
+#'   optional setting to replace the name of the plot if plotResults = TRUE.
39
+#' @param ... any additional parameters, especially those for plotting.
40
+#' @return a ggplot or a data frame of window scores; see the plotResults argument
41
+#' @family scaled hydropathy functions
42
+#' @seealso \code{\link{KDNorm}} for residue hydropathy values.
43
+#'   See \code{\link{pKaData}} for residue pKa values and citations. See
44
+#'   \code{\link{hendersonHasselbalch}} for charge calculations.
45
+#' @references Kyte, J., & Doolittle, R. F. (1982). A simple method for
46
+#'   displaying the hydropathic character of a protein.
47
+#'   Journal of molecular biology, 157(1), 105-132.
48
+#' @export
49
+#' @section Plot Colors:
50
+#'   For users who wish to keep a common aesthetic, the following colors are
51
+#'   used when plotResults = TRUE. \cr
52
+#'   \itemize{
53
+#'   \item Dynamic line colors: \itemize{
54
+#'   \item Close to -1 = "#9672E6"
55
+#'   \item Close to 1 = "#D1A63F"
56
+#'   \item Close to midpoint = "grey65" or "#A6A6A6"}}
57
+#'    
58
+#'   @references
59
+#'   Kozlowski, L. P. (2016). IPC – Isoelectric Point Calculator. Biology
60
+#'   Direct, 11(1), 55. \url{https://doi.org/10.1186/s13062-016-0159-9} \cr
61
+#'   Kyte, J., & Doolittle, R. F. (1982). A simple method for
62
+#'   displaying the hydropathic character of a protein.
63
+#'   Journal of molecular biology, 157(1), 105-132. \cr
64
+#'   Prilusky, J., Felder, C. E., et al. (2005). \cr
65
+#'   FoldIndex: a simple tool to predict whether a given protein sequence \cr 
66
+#'   is intrinsically unfolded. Bioinformatics, 21(16), 3435-3438. \cr
67
+#'   Uversky, V. N., Gillespie, J. R., & Fink, A. L. (2000).
68
+#'   Why are “natively unfolded” proteins unstructured under physiologic
69
+#'   conditions?. Proteins: structure, function, and bioinformatics, 41(3),
70
+#'   415-427.
71
+#'   \url{https://doi.org/10.1002/1097-0134(20001115)41:3<415::AID-PROT130>3.0.CO;2-7}
72
+#' @examples
73
+#' #Amino acid sequences can be character strings
74
+#' aaString <- "ACDEFGHIKLMNPQRSTVWY"
75
+#' #Amino acid sequences can also be character vectors
76
+#' aaVector <- c("A", "C", "D", "E", "F",
77
+#'               "G", "H", "I", "K", "L",
78
+#'               "M", "N", "P", "Q", "R",
79
+#'               "S", "T", "V", "W", "Y")
80
+#' #Alternatively, .fasta files can also be used by providing
81
+#'   ##The path to the file as a character string.
82
+#'
83
+#'
84
+#' foldIndexR(aaVector)
85
+#'
86
+#' exampleDF <- 
87
+#'   foldIndexR(aaString,
88
+#'       plotResults = FALSE)
89
+#' head(exampleDF)
90
+#' 
91
+
92
+foldIndexR <- function(sequence,
93
+                       window = 51, 
94
+                       proteinName = NA,
95
+                       pKaSet = "IPC_protein",
96
+                       plotResults = TRUE) {
97
+    
98
+    chargeDF <-
99
+        chargeCalculationLocal(sequence = sequence, window = window,
100
+                               pH = 7.0, pKaSet = pKaSet, 
101
+                               plotResults = FALSE)
102
+    chargeDF$scaledWindowCharge <- chargeDF$windowCharge / window
103
+    hydropDF <-  scaledHydropathyLocal(sequence = sequence, 
104
+                                       window = window,
105
+                                       plotResults = FALSE)
106
+    mergeDF <- merge(hydropDF, chargeDF)
107
+    
108
+    mergeDF$foldIndex <- 
109
+        mergeDF$WindowHydropathy * 2.785 - 
110
+        abs(mergeDF$scaledWindowCharge) - 1.151
111
+    
112
+    if (plotResults) {
113
+        plotTitle <- "FoldIndex Prediction of Intrinsic Disorder"
114
+        if (!is.na(proteinName)) {
115
+            plotTitle <- 
116
+                paste0("FoldIndex Prediction of Intrinsic Disorder in ", 
117
+                       proteinName)
118
+        }
119
+        
120
+        gg <-  sequencePlot(position = mergeDF$Position,
121
+                            property = mergeDF$foldIndex,
122
+                            hline = 0, dynamicColor = mergeDF$foldIndex,
123
+                            customColors = c("#9672E6", "#D1A63F", "grey65"),
124
+                            customTitle = NA, propertyLimits = c(-1, 1))
125
+        gg <- gg + ggplot2::labs(title = plotTitle, y = "Score")
126
+        return(gg)
127
+    } else {
128
+        return(mergeDF)
129
+    }
130
+    
131
+}
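
Review note: the score computed in `foldIndexR` can be reproduced in base R for intuition. The following is a minimal sketch under stated assumptions, not the package's implementation: `foldIndexSketch`, `kdScaled`, and `simpleCharge` are hypothetical names, the hydropathy values are the Kyte-Doolittle scale rescaled to [0, 1], and charges are simplified to integers (Asp/Glu = -1, Lys/Arg = +1), whereas the function above derives charge at pH 7.0 from a pKa set.

```r
# Kyte-Doolittle hydropathy values, rescaled from [-4.5, 4.5] to [0, 1].
kd <- c(A = 1.8, R = -4.5, N = -3.5, D = -3.5, C = 2.5, Q = -3.5, E = -3.5,
        G = -0.4, H = -3.2, I = 4.5, L = 3.8, K = -3.9, M = 1.9, F = 2.8,
        P = -1.6, S = -0.8, T = -0.7, W = -0.9, Y = -1.3, V = 4.2)
kdScaled <- (kd + 4.5) / 9

# Assumption: simplified integer charges instead of pKa-based charges.
simpleCharge <- c(D = -1, E = -1, K = 1, R = 1)

# Hypothetical helper: one FoldIndex-style score per sliding window.
foldIndexSketch <- function(sequence, window = 5) {
    stopifnot(window %% 2 == 1, nchar(sequence) >= window)
    residues <- strsplit(sequence, "")[[1]]
    hydropathy <- unname(kdScaled[residues])
    charge <- unname(simpleCharge[residues])
    charge[is.na(charge)] <- 0  # residues without an entry are neutral
    starts <- seq_len(length(residues) - window + 1)
    # Mean scaled hydropathy and mean net charge within each window.
    meanH <- vapply(starts, function(i) mean(hydropathy[i:(i + window - 1)]),
                    numeric(1))
    meanR <- vapply(starts, function(i) mean(charge[i:(i + window - 1)]),
                    numeric(1))
    data.frame(Position = starts + (window - 1) / 2,  # window center
               foldIndex = 2.785 * meanH - abs(meanR) - 1.151)
}

foldIndexSketch("ACDEFGHIKLMNPQRSTVWY", window = 5)
```

In this sketch a window of five isoleucines scores above 0 (predicted ordered) and a window of five lysines scores below 0 (predicted disordered), which matches the sign convention in the documentation.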
... ...
@@ -7,6 +7,7 @@
7 7
 #'   \code{\link{chargeCalculationLocal}}\cr
8 8
 #'   \code{\link{scaledHydropathyLocal}}\cr
9 9
 #'   \code{\link{structuralTendencyPlot}}\cr
10
+#'   \code{\link{foldIndexR}}\cr
10 11
 #'   All of the above linked functions only require the sequence argument
11 12
 #'   to output plots of characteristics associated with IDPs. The function also
12 13
 #'   includes options for IUPred functions. The function does one of the
... ...
@@ -24,7 +25,11 @@
24 25
 #' @param uniprotAccession character string specifying the UniProt Accession of
25 26
 #'   the protein of interest. Used to fetch predictions from IUPreds REST API.
26 27
 #'   Default is NA. Keep as NA if you do not have a UniProt Accession.
27
-#'
28
+#' @param window a positive, odd integer. 51 by default.
29
+#'   Sets the size of sliding window, must be an odd number.
30
+#'   The window determines the number of residues to be analyzed and averaged
31
+#'   for each position along the sequence. 51 is default for 
32
+#'   \code{\link{foldIndexR}}.
28 33
 #' @param proteinName character string, optional.
29 34
 #'   Used to add protein name to the title in ggplot.
30 35
 #' @inheritParams chargeCalculationLocal
... ...
@@ -66,6 +71,7 @@
66 71
 #'   \code{\link{chargeCalculationLocal}}\cr
67 72
 #'   \code{\link{scaledHydropathyLocal}}\cr
68 73
 #'   \code{\link{structuralTendencyPlot}}\cr
74
+#'   \code{\link{foldIndexR}}\cr
69 75
 #'   \code{\link{iupred}}\cr
70 76
 #'   \code{\link{iupredAnchor}}\cr
71 77
 #'   \code{\link{iupredRedox}}
... ...
@@ -112,6 +118,19 @@
112 118
 #'               Protein Science, 22(6), 693-724.
113 119
 #'               doi:10.1002/pro.2261 }
114 120
 #'     }
121
+#'     \item \code{\link{foldIndexR}}
122
+#'       \itemize{
123
+#'         \item{Prilusky, J., Felder, C. E., et al. (2005). 
124
+#'               FoldIndex: a simple tool to predict whether 
125
+#'               a given protein sequence is intrinsically unfolded. 
126
+#'               Bioinformatics, 21(16), 3435-3438.}
127
+#'          \item{Uversky, V. N., Gillespie, J. R., & Fink, A. L. (2000).
128
+#'               Why are “natively unfolded” proteins unstructured under
129
+#'               physiologic conditions?. Proteins: structure, function,
130
+#'               and bioinformatics, 41(3), 415-427.
131
+#'               https://doi.org/10.1002/1097-0134(20001115)41:3<415::AID-PROT130>3.0.CO;2-7}
132
+#'         \item{Also see citations for hydropathy and charge plots above}
133
+#'     }
115 134
 #'     \item \code{\link{iupred}},
116 135
 #'           \code{\link{iupredAnchor}},
117 136
 #'           \code{\link{iupredRedox}}
... ...
@@ -155,7 +174,7 @@ idprofile <- function(
155 174
     uniprotAccession = NA,
156 175
     proteinName = NA,
157 176
     iupredType = "long",
158
-    window = 9,
177
+    window = 51,
159 178
     pH = 7.2,
160 179
     pKaSet = "IPC_protein",
161 180
     structuralTendencyType = "bar",
... ...
@@ -180,6 +199,12 @@ idprofile <- function(
180 199
         plotResults = TRUE,
181 200
         pKaSet = pKaSet,
182 201
         proteinName = proteinName)
202
+    hydropPlot <- scaledHydropathyLocal(
203
+        sequence = sequence,
204
+        window = window,
205
+        plotResults = TRUE,
206
+        pKaSet = pKaSet,
207
+        proteinName = proteinName)
183 208
     tendencyPlot <- structuralTendencyPlot(
184 209
         sequence = sequence,
185 210
         graphType = structuralTendencyType,
... ...
@@ -188,6 +213,10 @@ idprofile <- function(
188 213
         disorderNeutral = disorderNeutral,
189 214
         orderPromoting = orderPromoting,
190 215
         proteinName = proteinName)
216
+    foldIndexPlot <- foldIndexR(sequence = sequence,
217
+        window = window, 
218
+        proteinName = proteinName,
219
+        pKaSet = pKaSet) 
191 220
 
192 221
     #-------- Adding IUPred Plot based on which type
193 222
     if (!is.na(uniprotAccession)) {
... ...
@@ -216,6 +245,7 @@ idprofile <- function(
216 245
                 label = "No Uniprot Accession provided...IUPred plot skipped") +
217 246
             ggplot2::theme_void()
218 247
     }
219
-    plotList <- list(rhPlot, tendencyPlot, chargePlot, hydropPlot, iupredPlot)
248
+    plotList <- list(rhPlot, tendencyPlot, chargePlot, hydropPlot, 
249
+                     foldIndexPlot, iupredPlot)
220 250
     return(plotList)
221 251
 }
... ...
@@ -85,7 +85,7 @@ idprofile(sequence = P53_HUMAN, #Generates the Profile
85 85
 **idpr package.**
86 86
 [Link to the Vignette (here)](https://bioconductor.org/packages/release/bioc/vignettes/idpr/inst/doc/idpr-vignette.html)
87 87
 
88
-
88
+ 
89 89
 ## Appendix
90 90
 
91 91
 ### Package citation
... ...
@@ -89,6 +89,11 @@ idprofile(sequence = P53_HUMAN, #Generates the Profile
89 89
 
90 90
 <img src="man/figures/README-example-5.png" width="75%" />
91 91
 
92
+    #> 
93
+    #> [[6]]
94
+
95
+<img src="man/figures/README-example-6.png" width="75%" />
96
+
92 97
 **Please Refer to idpr-vignette.Rmd file for a detailed introduction to
93 98
 the** **idpr package.** [Link to the Vignette
94 99
 (here)](https://bioconductor.org/packages/release/bioc/vignettes/idpr/inst/doc/idpr-vignette.html)
... ...
@@ -104,7 +109,7 @@ citation("idpr")
104 109
 #> 
105 110
 #>   William M. McFadden and Judith L. Yanowitz (2020). idpr: Profiling
106 111
 #>   and Analyzing Intrinsically Disordered Proteins in R. R package
107
-#>   version 1.0.005.
112
+#>   version 1.6.1.
108 113
 #> 
109 114
 #> A BibTeX entry for LaTeX users is
110 115
 #> 
... ...
@@ -112,7 +117,7 @@ citation("idpr")
112 117
 #>     title = {idpr: Profiling and Analyzing Intrinsically Disordered Proteins in R},
113 118
 #>     author = {William M. McFadden and Judith L. Yanowitz},
114 119
 #>     year = {2020},
115
-#>     note = {R package version 1.0.005},
120
+#>     note = {R package version 1.6.1},
116 121
 #>   }
117 122
 ```
118 123
 
... ...
@@ -120,9 +125,9 @@ citation("idpr")
120 125
 
121 126
 ``` r
122 127
 Sys.time()
123
-#> [1] "2020-12-23 14:07:28 EST"
128
+#> [1] "2022-03-11 02:31:26 EST"
124 129
 Sys.Date()
125
-#> [1] "2020-12-23"
130
+#> [1] "2022-03-11"
126 131
 R.version
127 132
 #>                _                           
128 133
 #> platform       x86_64-apple-darwin17.0     
... ...
@@ -131,12 +136,12 @@ R.version
131 136
 #> system         x86_64, darwin17.0          
132 137
 #> status                                     
133 138
 #> major          4                           
134
-#> minor          0.3                         
135
-#> year           2020                        
136
-#> month          10                          
139
+#> minor          1.3                         
140
+#> year           2022                        
141
+#> month          03                          
137 142
 #> day            10                          
138
-#> svn rev        79318                       
143
+#> svn rev        81868                       
139 144
 #> language       R                           
140
-#> version.string R version 4.0.3 (2020-10-10)
141
-#> nickname       Bunny-Wunnies Freak Out
145
+#> version.string R version 4.1.3 (2022-03-10)
146
+#> nickname       One Push-Up
142 147
 ```
... ...
@@ -28,6 +28,7 @@ A dataset containing a measure of hydropathy for each amino acid residue
28 28
 }
29 29
 \seealso{
30 30
 Other scaled hydropathy functions: 
31
+\code{\link{foldIndexR}()},
31 32
 \code{\link{meanScaledHydropathy}()},
32 33
 \code{\link{scaledHydropathyGlobal}()},
33 34
 \code{\link{scaledHydropathyLocal}()}
34 35
Binary files a/man/figures/README-example-3.png and b/man/figures/README-example-3.png differ
35 36
Binary files a/man/figures/README-example-4.png and b/man/figures/README-example-4.png differ
36 37
Binary files a/man/figures/README-example-5.png and b/man/figures/README-example-5.png differ
37 38
new file mode 100644
38 39
Binary files /dev/null and b/man/figures/README-example-6.png differ
39 40
new file mode 100644
... ...
@@ -0,0 +1,135 @@
1
+% Generated by roxygen2: do not edit by hand
2
+% Please edit documentation in R/foldIndexR.R
3
+\name{foldIndexR}
4
+\alias{foldIndexR}
5
+\title{Prediction of Intrinsic Disorder with FoldIndex method in R}
6
+\usage{
7
+foldIndexR(
8
+  sequence,
9
+  window = 51,
10
+  proteinName = NA,
11
+  pKaSet = "IPC_protein",
12
+  plotResults = TRUE
13
+)
14
+}
15
+\arguments{
16
+\item{sequence}{amino acid sequence as a single character string,
17
+a vector of single characters, or an AAString object.
18
+It also supports a single character string that specifies
19
+the path to a .fasta or .fa file.}
20
+
21
+\item{window}{a positive, odd integer. 51 by default.
22
+Sets the size of sliding window, must be an odd number.
23
+The window determines the number of residues to be analyzed and averaged
24
+for each position along the sequence.}
25
+
26
+\item{proteinName}{character string with length = 1.
27
+optional setting to replace the name of the plot if plotResults = TRUE.}
28
+
29
+\item{pKaSet}{A character string or data frame. "IPC_protein" by default.
30
+Character string to load specific, preloaded pKa sets.
31
+ c("EMBOSS", "DTASelect", "Solomons", "Sillero", "Rodwell",
32
+  "Lehninger", "Toseland", "Thurlkill", "Nozaki", "Dawson",
33
+  "Bjellqvist", "ProMoST", "Vollhardt", "IPC_protein", "IPC_peptide")
34
+ Alternatively, the user may supply a custom pKa dataset.
35
+ The format must be a data frame where:
36
+ Column 1 must be a character vector of residues named "AA" AND
37
+ Column 2 must be a numeric vector of pKa values.}
38
+
39
+\item{plotResults}{logical value, TRUE by default.
40
+If \code{plotResults = TRUE} a plot will be the output.
41
+If \code{plotResults = FALSE} the output is a data frame with scores for
42
+each window analyzed.}
43
+
44
+\item{...}{any additional parameters, especially those for plotting.}
45
+}
46
+\value{
47
+a ggplot or a data frame of window scores; see the plotResults argument
48
+}
49
+\description{
50
+This is used to calculate the prediction of intrinsic disorder based on
51
+  the scaled hydropathy and absolute net charge of an amino acid
52
+  sequence using a sliding window. Prilusky, Felder, et al. (2005)
53
+  described this relationship and implemented it graphically as FoldIndex,
54
+  and the method has since been incorporated
55
+  into multiple disorder prediction programs. When windows have a negative 
56
+  score (<0) sequences are predicted as disordered. 
57
+  When windows have a positive score (>0) sequences are predicted as 
58
+  ordered. Graphically, this cutoff is displayed by the dashed 
59
+  line at y = 0. Calculations are at pH 7.0 based on the described method and
60
+  the default is a sliding window of size 51. 
61
+  
62
+  The output is either a data frame or graph
63
+  showing the calculated scores for each window along the sequence.
64
+  The equation used was originally described in Uversky et al. (2000)\cr
65
+  \url{https://doi.org/10.1002/1097-0134(20001115)41:3<415::AID-PROT130>3.0.CO;2-7}
66
+  . \cr
67
+  
68
+  The FoldIndex method of using a sliding window and utilizing the Uversky 
69
+  equation is described in Prilusky, J., Felder, C. E., et al. (2005). \cr
70
+  FoldIndex: a simple tool to predict whether a given protein sequence \cr 
71
+  is intrinsically unfolded. Bioinformatics, 21(16), 3435-3438. \cr
72
+}
73
+\section{Plot Colors}{
74
+
75
+  For users who wish to keep a common aesthetic, the following colors are
76
+  used when plotResults = TRUE. \cr
77
+  \itemize{
78
+  \item Dynamic line colors: \itemize{
79
+  \item Close to -1 = "#9672E6"
80
+  \item Close to 1 = "#D1A63F"
81
+  \item Close to midpoint = "grey65" or "#A6A6A6"}}
82
+   
83
+  @references
84
+  Kozlowski, L. P. (2016). IPC – Isoelectric Point Calculator. Biology
85
+  Direct, 11(1), 55. \url{https://doi.org/10.1186/s13062-016-0159-9} \cr
86
+  Kyte, J., & Doolittle, R. F. (1982). A simple method for
87
+  displaying the hydropathic character of a protein.
88
+  Journal of molecular biology, 157(1), 105-132. \cr
89
+  Prilusky, J., Felder, C. E., et al. (2005). \cr
90
+  FoldIndex: a simple tool to predict whether a given protein sequence \cr 
91
+  is intrinsically unfolded. Bioinformatics, 21(16), 3435-3438. \cr
92
+  Uversky, V. N., Gillespie, J. R., & Fink, A. L. (2000).
93
+  Why are “natively unfolded” proteins unstructured under physiologic
94
+  conditions?. Proteins: structure, function, and bioinformatics, 41(3),
95
+  415-427.
96
+  \url{https://doi.org/10.1002/1097-0134(20001115)41:3<415::AID-PROT130>3.0.CO;2-7}
97
+}
98
+
99
+\examples{
100
+#Amino acid sequences can be character strings
101
+aaString <- "ACDEFGHIKLMNPQRSTVWY"
102
+#Amino acid sequences can also be character vectors
103
+aaVector <- c("A", "C", "D", "E", "F",
104
+              "G", "H", "I", "K", "L",
105
+              "M", "N", "P", "Q", "R",
106
+              "S", "T", "V", "W", "Y")
107
+#Alternatively, .fasta files can also be used by providing
108
+  ##The path to the file as a character string.
109
+
110
+
111
+foldIndexR(aaVector)
112
+
113
+exampleDF <- 
114
+  foldIndexR(aaString,
115
+      plotResults = FALSE)
116
+head(exampleDF)
117
+
118
+}
119
+\references{
120
+Kyte, J., & Doolittle, R. F. (1982). A simple method for
121
+  displaying the hydropathic character of a protein.
122
+  Journal of molecular biology, 157(1), 105-132.
123
+}
124
+\seealso{
125
+\code{\link{KDNorm}} for residue hydropathy values.
126
+  See \code{\link{pKaData}} for residue pKa values and citations. See
127
+  \code{\link{hendersonHasselbalch}} for charge calculations.
128
+
129
+Other scaled hydropathy functions: 
130
+\code{\link{KDNorm}},
131
+\code{\link{meanScaledHydropathy}()},
132
+\code{\link{scaledHydropathyGlobal}()},
133
+\code{\link{scaledHydropathyLocal}()}
134
+}
135
+\concept{scaled hydropathy functions}
... ...
@@ -43,10 +43,11 @@ disorder based on environmental conditions. Regions of predicted
43 43
 environmental sensitivity are highlighted. See the respective functions
44 44
 for more details. This is skipped if uniprotAccession = NA.}
45 45
 
46
-\item{window}{a positive, odd integer. 7 by default.
46
+\item{window}{a positive, odd integer. 51 by default.
47 47
 Sets the size of sliding window, must be an odd number.
48 48
 The window determines the number of residues to be analyzed and averaged
49
-for each position along the sequence.}
49
+for each position along the sequence. 51 is default for 
50
+\code{\link{foldIndexR}}.}
50 51
 
51 52
 \item{pH}{numeric value, 7.0 by default.
52 53
 The environmental pH used to calculate residue charge.}
... ...
@@ -94,6 +95,7 @@ The IDPRofile is a summation of many features of the idpr package,
94 95
   \code{\link{chargeCalculationLocal}}\cr
95 96
   \code{\link{scaledHydropathyLocal}}\cr
96 97
   \code{\link{structuralTendencyPlot}}\cr
98
+  \code{\link{foldIndexR}}\cr
97 99
   All of the above linked functions only require the sequence argument
98 100
   to output plots of characteristics associated with IDPs. The function also
99 101
   includes options for IUPred functions. The function does one of the
... ...
@@ -149,6 +151,19 @@ The IDPRofile is a summation of many features of the idpr package,
149 151
               Protein Science, 22(6), 693-724.
150 152
               doi:10.1002/pro.2261 }
151 153
     }
154
+    \item \code{\link{foldIndexR}}
155
+      \itemize{
156
+        \item{Prilusky, J., Felder, C. E., et al. (2005). 
157
+              FoldIndex: a simple tool to predict whether 
158
+              a given protein sequence is intrinsically unfolded. 
159
+              Bioinformatics, 21(16), 3435-3438.}
160
+         \item{Uversky, V. N., Gillespie, J. R., & Fink, A. L. (2000).
161
+              Why are “natively unfolded” proteins unstructured under
162
+              physiologic conditions?. Proteins: structure, function,
163
+              and bioinformatics, 41(3), 415-427.
164
+              https://doi.org/10.1002/1097-0134(20001115)41:3<415::AID-PROT130>3.0.CO;2-7}
165
+        \item{Also see citations for hydropathy and charge plots above}
166
+    }
152 167
     \item \code{\link{iupred}},
153 168
           \code{\link{iupredAnchor}},
154 169
           \code{\link{iupredRedox}}
... ...
@@ -194,6 +209,7 @@ idprofile(
194 209
   \code{\link{chargeCalculationLocal}}\cr
195 210
   \code{\link{scaledHydropathyLocal}}\cr
196 211
   \code{\link{structuralTendencyPlot}}\cr
212
+  \code{\link{foldIndexR}}\cr
197 213
   \code{\link{iupred}}\cr
198 214
   \code{\link{iupredAnchor}}\cr
199 215
   \code{\link{iupredRedox}}
... ...
@@ -46,6 +46,7 @@ Kyte, J., & Doolittle, R. F. (1982). A simple method for
46 46
 
47 47
 Other scaled hydropathy functions: 
48 48
 \code{\link{KDNorm}},
49
+\code{\link{foldIndexR}()},
49 50
 \code{\link{scaledHydropathyGlobal}()},
50 51
 \code{\link{scaledHydropathyLocal}()}
51 52
 }
... ...
@@ -87,6 +87,7 @@ Kyte, J., & Doolittle, R. F. (1982). A simple method for
87 87
 
88 88
 Other scaled hydropathy functions: 
89 89
 \code{\link{KDNorm}},
90
+\code{\link{foldIndexR}()},
90 91
 \code{\link{meanScaledHydropathy}()},
91 92
 \code{\link{scaledHydropathyLocal}()}
92 93
 }
... ...
@@ -105,6 +105,7 @@ Kyte, J., & Doolittle, R. F. (1982). A simple method for
105 105
 
106 106
 Other scaled hydropathy functions: 
107 107
 \code{\link{KDNorm}},
108
+\code{\link{foldIndexR}()},
108 109
 \code{\link{meanScaledHydropathy}()},
109 110
 \code{\link{scaledHydropathyGlobal}()}
110 111
 }
... ...
@@ -63,7 +63,15 @@ $$<R> = - 2.785 <H> + 1.151 $$
63 63
 
64 64
 This plot allows a distinction between
65 65
 negative and positive proteins while preserving the information of the 
66
-charge-hydropathy plot. 
66
+charge-hydropathy plot.
67
+
68
+Further, this relationship can be used to identify folded regions of a protein. 
69
+FoldIndex rearranges the equation so that the boundary sits at 0 and applies it 
70
+over a sliding window; the resulting scores identify regions predicted as folded or unfolded. 
71
+$$ Score = 2.785 <H> - \lvert<R>\rvert -1.151 $$
72
+When windows have a negative score (<0) sequences are predicted as disordered. 
73
+When windows have a positive score (>0) sequences are predicted as ordered. 
74
+This was described in Prilusky, J., Felder, C. E., et al. (2005). 
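
Review note: the step from the charge-hydropathy boundary to the score may be easier to follow when written out. This is a sketch of the rearrangement, assuming the boundary is taken on the absolute mean net charge so that both signs of charge are covered; the numeric values of $<H>$ and $\lvert<R>\rvert$ below are invented for illustration.

$$ \lvert<R>\rvert = 2.785 <H> - 1.151 $$

Moving every term to one side places the boundary at 0:

$$ Score = 2.785 <H> - \lvert<R>\rvert - 1.151 $$

For a window with $<H> = 0.40$ and $\lvert<R>\rvert = 0.10$:

$$ Score = 2.785(0.40) - 0.10 - 1.151 = -0.137 $$

which is below 0, so the window is predicted as disordered. For a window with $<H> = 0.50$ and $\lvert<R>\rvert = 0.05$:

$$ Score = 2.785(0.50) - 0.05 - 1.151 = 0.1915 $$

which is above 0, so the window is predicted as ordered.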
67 75
 
68 76
 ## Installation  
69 77
 
... ...
@@ -196,6 +204,15 @@ chargeHydropathyPlot(
196 204
 ```
197 205
 
198 206
 
207
+## Using foldIndexR to Predict Folded and Unfolded Windows
208
+
209
+```{r}
210
+foldIndexR(sequence = HUMAN_P53,
211
+           plotResults = TRUE)
212
+```
213
+
214
+Prilusky, J., Felder, C. E., et al. (2005). 
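
Review note: with `plotResults = FALSE` the function returns a data frame, and contiguous stretches of negative scores can be pulled out of it. The sketch below assumes only the `Position` and `foldIndex` columns that the function body assigns; the `scores` data frame is an invented stand-in for real output, and `disorderedRuns` is a hypothetical helper, not part of idpr.

```r
# Invented stand-in for foldIndexR(sequence, plotResults = FALSE).
scores <- data.frame(Position = 26:35,
                     foldIndex = c(0.12, 0.05, -0.03, -0.10, -0.08,
                                   0.02, 0.07, -0.01, -0.04, 0.09))

# Hypothetical helper: start and end positions of each run of
# consecutive windows whose score is below 0 (predicted disordered).
disorderedRuns <- function(df) {
    runs <- rle(df$foldIndex < 0)
    ends <- cumsum(runs$lengths)
    starts <- ends - runs$lengths + 1
    keep <- runs$values  # TRUE for runs of negative scores
    data.frame(start = df$Position[starts[keep]],
               end = df$Position[ends[keep]])
}

disorderedRuns(scores)
```

Run-length encoding keeps the helper short and avoids an explicit loop; positions refer to window centers, so each run would extend by half a window on either side in residue terms.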
215
+
199 216
 ## Calculating Scaled Hydropathy
200 217
 
201 218
 ### Mean Scaled Hydropathy
... ...
@@ -521,6 +538,11 @@ biology, 157(1), 105-132.
521 538
 Po, H. N., & Senozan, N. (2001). The Henderson-Hasselbalch equation:
522 539
 its history and limitations. Journal of Chemical Education, 78(11), 1499. 
523 540
 
541
+Prilusky, J., Felder, C. E., et al. (2005). 
542
+FoldIndex: a simple tool to predict whether a given protein sequence 
543
+is intrinsically unfolded. Bioinformatics, 21(16), 3435-3438. 
544
+
545
+
524 546
 Proteinogenic amino acid. (n.d.). In Wikipedia. Retrieved July 12th, 2020. 
525 547
 https://en.wikipedia.org/wiki/Proteinogenic_amino_acid#Chemical_properties
526 548
 
... ...
@@ -153,12 +153,13 @@ idprofile(sequence = P53_HUMAN,
153 153
 ```
154 154
 
155 155
 
156
-idprofile returns 4-5 plots:
156
+idprofile returns 5-6 plots:
157 157
 
158 158
  * Charge-Hydropathy Plot^\*^
159 159
  * Plot of Amino Acid Composition and Structural Tendency^†^
160 160
  * Calculations of Local Charge Along a Protein Sequence^\*^
161 161
  * Local, Scaled Hydropathy Along a Protein Sequence^\*^
162
+ * A prediction of intrinsic disorder by FoldIndex^\*^
162 163
  * A prediction of intrinsic disorder by IUPred2 (only with a uniprotAccession)^‡^
163 164
  
164 165
 *Detailed descriptions of each plot can be found in specific vignettes.*
... ...
@@ -173,7 +174,7 @@ idprofile returns 4-5 plots:
173 174
 A brief explanation of each plot is given below:
174 175
 
175 176
 
176
-### Charge-Hydropathy Plot
177
+### Charge-Hydropathy Plot and FoldIndex
177 178
 
178 179
 Uversky, Gillespie, & Fink (2000) showed that both high net charge and 
179 180
 low mean hydropathy are properties of IDPs (15). One explanation is that a high 
... ...
@@ -185,7 +186,11 @@ graphic can be used to distinguish proteins that are extended or compact under
185 186
 native conditions. However, it is important to note that IDPs can have the 
186 187
 characteristics of a collapsed protein or an extended protein. Therefore a 
187 188
 protein within the “collapsed protein” field does not necessarily mean that it 
188
-lacks intrinsic disorder under native conditions (15, 31). 
189
+lacks intrinsic disorder under native conditions (15, 31). This equation was
190
+later applied to a method of predicting unfolded peptides using a sliding window 
191
+of charge and hydropathy in FoldIndex (44). When scores are negative, a region 
192
+is predicted as unfolded; when scores are positive, a region is predicted as 
193
+folded.
189 194
 
190 195
 
191 196
 **For further theory and details, please refer to idpr's **
... ...
@@ -387,6 +392,7 @@ et al. (2001) (25).
387 392
 41. Johnson M, Zaretskaya I, Raytselis Y, Merezhuk Y, McGinnis S, Madden TL. NCBI BLAST: a better web interface. Nucleic Acids Research. 2008;36(suppl_2):W5-W9.
388 393
 42. Madeira F, Park YM, Lee J, Buso N, Gur T, Madhusoodanan N, et al. The EMBL-EBI search and sequence analysis tools APIs in 2019. Nucleic acids research. 2019;47(W1):W636-W41.
389 394
 43. Pagès H, Aboyoun P, Gentleman R, DebRoy S. Biostrings: Efficient manipulation of biological strings. R package version. 2020;2(0).
395
+44. Prilusky J, Felder C, Zeev-Ben-Mordehai T, Rydberg E, Man O, Beckmann J, Silman I, & Sussman J. FoldIndex©: a simple tool to predict whether a given protein sequence is intrinsically unfolded. Bioinformatics 21, no. 16 (2005): 3435-3438.
390 396
 
391 397
 
392 398