
3.99.1: interventions, death with fdf, user variables

ramon diaz-uriarte (at Phelsuma) authored on 25/06/2022 14:24:13
Showing 1 changed files
... ...
@@ -167,7 +167,9 @@ additive models.}
167 167
   routines can have trouble (especially if you log) with values <=0. Or
168 168
   we might have trouble if we want to log the fitness. This is done
169 169
   after possibly taking logs. Noise is added to prevent creating several
170
-  identical minimal fitness values.}
170
+  identical minimal fitness values.  Note that \code{\link{allFitnessEffects}} will remove from the table
171
+  of genotypes any genotype with a fitness <= 1e-9, thus 
172
+    making it a non-viable genotype during simulations. }
171 173
 
172 174
 \item{K}{K for NK model; K is the number of loci with which each locus
173 175
   interacts, and the larger the K the larger the ruggedness of the
... ...
@@ -288,6 +290,9 @@ Optimum model component.}
288 290
   of the data can be large, especially if \code{g} (the number of genes)
289 291
   is large.
290 292
 
293
+  Note that \code{\link{allFitnessEffects}} will remove from the table
294
+  of genotypes any genotype with a fitness <= 1e-9, thus 
295
+    making it a non-viable genotype during simulations.   
291 296
   
292 297
 } 
293 298
 

2.99.93; fixed bugs and test errors for three-element vector

ramon diaz-uriarte (at Phelsuma) authored on 29/04/2021 23:22:43
Showing 1 changed files
... ...
@@ -115,7 +115,7 @@ additive models.}
115 115
   This option has no effect if you pass a three-element vector for
116 116
   \code{scale}. Using a three-element vector for \code{scale} is
117 117
   probably the most natural way of changing the scale and range of
118
-  fitness while setting the wildtype to value of your choice.
118
+  fitness while setting the wildtype to a value of your choice.
119 119
   
120 120
 }
121 121
 

2.99.92: rfitness with a three-element vector for scale and SSWM examples

ramon diaz-uriarte (at Phelsuma) authored on 27/04/2021 13:58:19
Showing 1 changed files
... ...
@@ -55,9 +55,36 @@ additive models.}
55 55
   genotypes with that number of mutations have equal probability of
56 56
   being the reference). }
57 57
 
58
-\item{scale}{Either NULL (nothing is done) or a two-element vector. If a
59
-  two-element vector, fitness is re-scaled between \code{scale[1]} (the
60
-  minimum) and \code{scale[2]} (the maximum).}
58
+\item{scale}{Either NULL (nothing is done) or a two- or three-element
59
+  vector.
60
+
61
+  If a two-element vector, fitness is re-scaled between
62
+  \code{scale[1]} (the minimum) and \code{scale[2]} (the maximum) and,
63
+  later, if you have selected it, \code{wt_is_1} will be enforced.
64
+
65
+  If you pass a three-element vector, fitness is re-scaled so that the
66
+  new maximum fitness is \code{scale[1]}, the new minimum is
67
+  \code{scale[2]} and the new wildtype is \code{scale[3]}. With a
68
+  three-element vector, none of the \code{wt_is_1} options apply,
69
+  to ensure you obtain the range you want. If you want the
70
+  wildtype to be one, pass it as the third element of the vector.
71
+
72
+  As a consequence of using a three-element vector, the amount of
73
+  stretching/compressing (i.e., scaling) of fitness values larger than
74
+  that of the wildtype will likely be different from the scaling of
75
+  fitness values smaller than that of the wildtype.  In other words,
76
+  this argument allows you to change the spread of the positive and
77
+  negative fitness values (and you can make this difference extreme and
78
+  make most fitness values less than wildtype be 0 by using a huge
79
+  negative number --huge in absolute value-- for \code{scale[2]} if you
80
+  then truncate at 0 --see \code{truncate_at_0}).
81
+
82
+  Using a three-element vector is probably the most natural way of
83
+  changing the scale and range of fitness.
84
+
85
+  See also \code{log} if you want the log-transformed values to respect
86
+  the scale.
87
+}
61 88
 
62 89
 \item{wt_is_1}{If "divide" the fitness of all genotypes is
63 90
   divided by the fitness of the wildtype (after possibly adding a value
... ...
@@ -83,14 +110,28 @@ additive models.}
83 110
   option can easily lead to landscapes with no accessible genotypes
84 111
   (even if you also use \code{scale}).
85 112
 
86
-  If "no", the fitness of the wildtype is not modified.  }
113
+  If "no", the fitness of the wildtype is not modified.
114
+
115
+  This option has no effect if you pass a three-element vector for
116
+  \code{scale}. Using a three-element vector for \code{scale} is
117
+  probably the most natural way of changing the scale and range of
118
+  fitness while setting the wildtype to value of your choice.
119
+  
120
+}
87 121
 
88 122
 
89 123
 \item{log}{If TRUE, log-transform fitness. Actually, there are two
90 124
   cases: if \code{wt_is_1 = "no"} we simply log the fitness values;
91 125
   otherwise, we log the fitness values and add a 1, thus shifting all
92 126
   fitness values, because by decree the fitness (birth rate) of the
93
-  wildtype must be 1.}
127
+  wildtype must be 1.
128
+
129
+  If you pass a three-element vector for scale, you will want to pass
130
+  \code{exp(desired_max)}, \code{exp(desired_min)}, and
131
+  \code{exp(desired_wildtype)} to the \code{scale} argument. (We first
132
+  scale values in the original scale and then log them). In this case,
133
+  we ignore whatever you passed as \code{wt_is_1}, setting \code{wt_is_1
134
+  = "no"} to avoid modifying your requested value for the wildtype.}
94 135
 
95 136
 \item{min_accessible_genotypes}{If not NULL, the minimum number of
96 137
   accessible genotypes in the fitness landscape. A genotype is
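
The three-element form of `scale` described in this commit can be illustrated with a small sketch. It is written in Python for self-containment and is not the package's R code. It assumes a piecewise-linear mapping hinged at the wildtype, which is one natural reading of "the scaling of fitness values larger than that of the wildtype will likely be different from the scaling of fitness values smaller than that of the wildtype". The function name `rescale_three` and the toy landscape are hypothetical.

```python
import math


def rescale_three(fitness, wildtype, new_max, new_min, new_wt):
    """Piecewise-linear rescaling hinged at the wildtype (assumed form).

    Argument order mirrors the documented scale = c(max, min, wildtype).
    """
    old_max = max(fitness.values())
    old_min = min(fitness.values())
    old_wt = fitness[wildtype]
    rescaled = {}
    for genotype, f in fitness.items():
        if f >= old_wt:
            span = old_max - old_wt
            # If the wildtype is already the maximum there is nothing to stretch.
            shift = 0.0 if span == 0 else (f - old_wt) * (new_max - new_wt) / span
            rescaled[genotype] = new_wt + shift
        else:
            # f < old_wt implies old_min < old_wt, so this span is positive.
            span = old_wt - old_min
            rescaled[genotype] = new_wt - (old_wt - f) * (new_wt - new_min) / span
    return rescaled


landscape = {"WT": 2.0, "A": 4.0, "B": 1.0, "A, B": 3.0}
scaled = rescale_three(landscape, "WT", new_max=3.0, new_min=0.5, new_wt=1.0)
print(scaled)

# With log = TRUE the documentation says to pass exp() of the desired values,
# because values are scaled first and logged afterwards.
desired_max, desired_min, desired_wt = 1.0, -2.0, 0.0
logged = {
    genotype: math.log(f)
    for genotype, f in rescale_three(
        landscape, "WT",
        math.exp(desired_max), math.exp(desired_min), math.exp(desired_wt),
    ).items()
}
print(logged)
```

In this toy case the values above the wildtype are stretched by a factor of 1 while those below are compressed by a factor of 0.5, which is the asymmetry the documentation warns about.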

v. 2.99.3

ramon diaz-uriarte (at Phelsuma) authored on 13/12/2020 14:35:47
Showing 1 changed files
... ...
@@ -314,11 +314,6 @@ MAGELLAN web site: \url{http://wwwabi.snv.jussieu.fr/public/Magellan/}
314 314
 ## plotting and simulating an oncogenetic trajectory
315 315
 
316 316
 
317
-r1 <- rfitness(4)
318
-plot(r1)
319
-oncoSimulIndiv(allFitnessEffects(genotFitness = r1))
320
-
321
-
322 317
 ## NK model
323 318
 rnk <- rfitness(5, K = 3, model = "NK")
324 319
 plot(rnk)
... ...
@@ -328,6 +323,8 @@ oncoSimulIndiv(allFitnessEffects(genotFitness = rnk))
328 323
 radd <- rfitness(4, model = "Additive", mu = 0.2, sd = 0.5)
329 324
 plot(radd)
330 325
 
326
+
327
+\dontrun{
331 328
 ## Eggbox model
332 329
 regg = rfitness(g=4,model="Eggbox", e = 2, E=2.4)
333 330
 plot(regg)
... ...
@@ -342,7 +339,8 @@ plot(ris)
342 339
 rfull = rfitness(g=4, model="Full", i = 0.002, I=2, 
343 340
                  K = 2, r = TRUE,
344 341
                  p = 0.2, P = 0.3, o = 0.3, O = 1)
345
-plot(rfull)
342
+    plot(rfull)
343
+    }
346 344
 }
347 345
 \keyword{ datagen }
348 346
 

2.17.7: no longer depends on nem; more models from MAGELLAN

ramon diaz-uriarte (at Phelsuma) authored on 30/01/2020 13:39:14
Showing 1 changed files
... ...
@@ -1,11 +1,12 @@
1 1
 \name{rfitness}
2 2
 \alias{rfitness}
3
-
3
+\encoding{UTF-8}
4 4
 
5 5
 \title{Generate random fitness.}
6 6
 
7 7
 \description{ Generate random fitness landscapes under a House of Cards,
8
-  Rough Mount Fuji, additive model, and Kauffman's NK model.  }
8
+  Rough Mount Fuji (RMF), additive (multiplicative) model, Kauffman's NK
9
+  model, Ising model, Eggbox model and Full model}
9 10
 
10 11
 
11 12
 \usage{
... ...
@@ -14,7 +15,9 @@ rfitness(g, c = 0.5, sd = 1, mu = 1, reference = "random", scale = NULL,
14 15
          wt_is_1 = c("subtract", "divide", "force", "no"),
15 16
          log = FALSE, min_accessible_genotypes = NULL,
16 17
          accessible_th = 0, truncate_at_0 = TRUE,
17
-         K = 1, r = TRUE, model = c("RMF", "NK"))
18
+         K = 1, r = TRUE, i = 0, I = -1, circular = FALSE, e = 0, E = -1,
19
+         H = -1, s = 0.1, S = -1, d = 0, o = 0, O = -1, p = 0, P = -1, 
20
+         model = c("RMF", "Additive", "NK", "Ising", "Eggbox", "Full"))
18 21
 }
19 22
 
20 23
 
... ...
@@ -25,28 +28,32 @@ rfitness(g, c = 0.5, sd = 1, mu = 1, reference = "random", scale = NULL,
25 28
   \item{g}{Number of genes.}
26 29
 
27 30
   \item{c}{The decrease in fitness of a genotype per each unit increase
28
-    in Hamming distance from the reference genotype (see \code{reference}).}
31
+    in Hamming distance from the reference genotype for the RMF model
32
+    (see \code{reference}).}
29 33
 
30 34
   \item{sd}{The standard deviation of the random component (a normal
31
-  distribution of mean \code{mu} and standard deviation \code{sd}).}
35
+  distribution of mean \code{mu} and standard deviation \code{sd}) for
36
+  the RMF and additive models.}
32 37
 
33 38
 \item{mu}{The mean of the random component (a normal distribution of
34
-mean \code{mu} and standard deviation \code{sd}).}
35
-
36
-
37
-\item{reference}{The reference genotype: for the deterministic, additive
38
-  part, this is the genotype with maximal fitness, and all other
39
-  genotypes decrease their fitness by \code{c} for every unit of Hamming
40
-  distance from this reference. If "random" a genotype will be randomly
41
-  chosen as the reference. If "max" the genotype with all positions
42
-  mutated will be chosen as the reference. If you pass a vector (e.g.,
43
-  \code{reference = c(1, 0, 1, 0)}) that will be the reference genotype.
44
-  If "random2" a genotype will be randomly chosen as the reference. In
45
-  contrast to "random", however, not all genotypes have the same
46
-  probability of being chosen; here, what is equal is the probability
47
-  that the reference genotype has 1, 2, ..., g, mutations (and, once a
48
-  number of mutations is chosen, all genotypes with that number of
49
-  mutations have equal probability of being the reference). }
39
+mean \code{mu} and standard deviation \code{sd}) for the RMF and
40
+additive models.}
41
+
42
+
43
+\item{reference}{The reference genotype: in the RMF model, for the
44
+  deterministic, additive part, this is the genotype with maximal
45
+  fitness, and all other genotypes decrease their fitness by \code{c}
46
+  for every unit of Hamming distance from this reference. If "random" a
47
+  genotype will be randomly chosen as the reference. If "max" the
48
+  genotype with all positions mutated will be chosen as the
49
+  reference. If you pass a vector (e.g., \code{reference = c(1, 0, 1,
50
+  0)}) that will be the reference genotype.  If "random2" a genotype
51
+  will be randomly chosen as the reference. In contrast to "random",
52
+  however, not all genotypes have the same probability of being chosen;
53
+  here, what is equal is the probability that the reference genotype has
54
+  1, 2, ..., g, mutations (and, once a number of mutations is chosen, all
55
+  genotypes with that number of mutations have equal probability of
56
+  being the reference). }
50 57
 
51 58
 \item{scale}{Either NULL (nothing is done) or a two-element vector. If a
52 59
   two-element vector, fitness is re-scaled between \code{scale[1]} (the
... ...
@@ -127,9 +134,45 @@ mean \code{mu} and standard deviation \code{sd}).}
127 134
 
128 135
 \item{r}{For the NK model, whether interacting loci are chosen at random
129 136
   (\code{r = TRUE}) or are neighbors (\code{r = FALSE}).}
137
+\item{i}{For the Ising model, i is the mean cost for incompatibility with which
138
+  the genotype's fitness is penalized when in two adjacent genes, only one of 
139
+  them is mutated.}
140
+
141
+\item{I}{For the Ising model, I is the standard deviation of the cost of
142
+  incompatibility (i).}
143
+  
144
+\item{circular}{For the Ising model, whether there is a circular arrangement, 
145
+  where the last and the first genes are adjacent to each other.}
146
+
147
+\item{e}{For the Eggbox model, mean effect in fitness for the neighbor
148
+  locus +/- e.}
149
+  
150
+\item{E}{For the Eggbox model, noise added to the mean effect in fitness (e).}
151
+
152
+\item{H}{For Full models, standard deviation for the House of Cards model.}
153
+
154
+\item{s}{For Full models, mean of the fitness for the Multiplicative model.}
155
+
156
+\item{S}{For Full models, standard deviation for the Multiplicative model.}
157
+
158
+\item{d}{For Full models, a diminishing (negative) or increasing 
159
+  (positive) return as the peak is approached for the multiplicative model.}
160
+  
161
+\item{o}{For Full models, mean value for the optimum model.}
162
+
163
+\item{O}{For Full models, standard deviation for the optimum model.}
130 164
 
131
-\item{model}{One of "RMF" (default), for Rough Mount Fuji, or "NK", for
132
-  Kauffman's NK model.}
165
+\item{p}{For Full models, the mean production value for each non 0
166
+  allele in the Optimum model component.}
167
+
168
+\item{P}{For Full models, the associated stdev (of non 0 alleles) in the
169
+Optimum model component.}
170
+
171
+
172
+
173
+\item{model}{One of "RMF" (default) for Rough Mount Fuji, "Additive" for
174
+ Additive model, "NK", for Kauffman's NK model, "Ising" for Ising model,
175
+ "Eggbox" for Eggbox model or "Full" for Full models.}
133 176
 } 
134 177
 
135 178
 
... ...
@@ -146,14 +189,56 @@ mean \code{mu} and standard deviation \code{sd}).}
146 189
   random variable (in this case, a normal deviate of mean \code{mu}
147 190
   and standard deviation \code{sd}).
148 191
 
149
-  Setting \eqn{c = 0} we obtain a House of Cards model. Setting \eqn{sd
150
-    = 0} fitness is given by the distance from the reference and if the
151
-    reference is the genotype with all positions mutated, then we have a
152
-    fully additive model (fitness increases linearly with the number of
153
-    positions mutated).
192
+  When using \code{model = "RMF"}, setting \eqn{c = 0} we obtain a House
193
+    of Cards model. Setting \eqn{sd = 0} fitness is given by the
194
+    distance from the reference and if the reference is the genotype
195
+    with all positions mutated, then we have a fully additive model
196
+    (fitness increases linearly with the number of positions mutated),
197
+    where all mutations have the same effect.
198
+
199
+  More flexible additive models can be obtained using \code{model =
200
+  "Additive"}. This model is like the Rough Mount Fuji model in Szendro
201
+  et al., 2013 or Franke et al., 2011, but in this case, each locus can
202
+  have different contributions to the fitness evaluation. This model is
203
+  also referred to as the "multiplicative" model in the literature as it
204
+  is additive in the log-scale (e.g., see Brouillet et al., 2015 or
205
+  Ferretti et al., 2016). The contribution of each mutated allele to the
206
+  log-fitness is a random deviate from a Normal distribution with
207
+  specified mean \code{mu} and standard deviation \code{sd}, and the
208
+  log-fitness of a genotype is the sum of the contributions of each
209
+  mutated allele. There is no "reference" genotype in the Additive
210
+  model.  There is no epistasis in the additive model because the effect
211
+  of a mutation in a locus does not depend on the genetic background, or
212
+  whether the rest of the loci are mutated or not.
213
+  
214
+
215
+  When using \code{model = "NK"} fitness is drawn from a uniform (0, 1)
216
+  distribution.
217
+  
218
+  
219
+  When using \code{model = "Ising"} for each pair of interacting loci, 
220
+  there is an associated cost if the two alleles are not identical
221
+  (identical alleles are 'compatible').
222
+  
223
+  
224
+  When using \code{model = "Eggbox"} each locus is either high or low fitness,
225
+  with a systematic change between each neighbor.
226
+  
227
+  
228
+  When using \code{model = "Full"}, the fitness is computed with different
229
+  parts of the previous models depending on the choosen parameters described 
230
+  above. 
231
+  
232
+  
233
+  For \code{model = "NK" | "Ising" | "Eggbox" | "Full"} the fitness
234
+  landscape is generated by directly calling the \code{fl_generate}
235
+  function of MAGELLAN
236
+  (\url{http://wwwabi.snv.jussieu.fr/public/Magellan/}). See details in
237
+  Ferretti et al. 2016, or Brouillet et al., 2015.
238
+  
154 239
 
155 240
   For OncoSimulR, we often want the wildtype to have a mean of
156
-  1. Reasonable settings are \code{mu = 1} and \code{wt_is_1 =
241
+  1. Reasonable settings when using RMF are \code{mu = 1} and \code{wt_is_1 =
157 242
   'subtract'} so that we simulate from a distribution centered in 1, and
158 243
   we make sure afterwards (via a simple shift) that the wildtype is
159 244
   actually 1. The \code{sd} controls the standard deviation, with the
... ...
@@ -162,14 +247,6 @@ mean \code{mu} and standard deviation \code{sd}).}
162 247
   of the data can be large, specially if \code{g} (the number of genes)
163 248
   is large.
164 249
 
165
-
166
-  When using \code{model = "NK"}, the model used is Kauffman's NK model
167
-  (see details in Ferretti et al., or Brouillet et al., below), as
168
-  implemented in MAGELLAN
169
-  (\url{http://wwwabi.snv.jussieu.fr/public/Magellan/}). This fitness
170
-  landscape is generated by directly calling the \code{fl_generate}
171
-  function of MAGELLAN. Fitness is drawn from a uniform (0, 1)
172
-  distribution.
173 250
   
174 251
 } 
175 252
 
... ...
@@ -214,10 +291,12 @@ MAGELLAN web site: \url{http://wwwabi.snv.jussieu.fr/public/Magellan/}
214 291
 }
215 292
 
216 293
 \author{ Ramon Diaz-Uriarte for the RMF and general wrapping
217
-code. S. Brouillet, G. Achaz, S. Matuszewski, H. Annoni, and L. Ferretti
218
-for the MAGELLAN code.
219
-
220
-}
294
+  code. S. Brouillet, G. Achaz, S. Matuszewski, H. Annoni, and
295
+  L. Ferretti for the MAGELLAN code. Further contributions to the
296
+  additive model and to wrapping MAGELLAN code and documentation from
297
+  Guillermo Gorines Cordero, Ivan Lorca Alonso, Francisco Muñoz Lopez,
298
+  David Roncero Moroño, Alvaro Quevedo, Pablo Perez, Cristina Devesa,
299
+  Alejandro Herrador.}
221 300
 
222 301
 \seealso{
223 302
   
... ...
@@ -234,6 +313,7 @@ for the MAGELLAN code.
234 313
 ## Random fitness for four genes-genotypes,
235 314
 ## plotting and simulating an oncogenetic trajectory
236 315
 
316
+
237 317
 r1 <- rfitness(4)
238 318
 plot(r1)
239 319
 oncoSimulIndiv(allFitnessEffects(genotFitness = r1))
... ...
@@ -243,7 +323,26 @@ oncoSimulIndiv(allFitnessEffects(genotFitness = r1))
243 323
 rnk <- rfitness(5, K = 3, model = "NK")
244 324
 plot(rnk)
245 325
 oncoSimulIndiv(allFitnessEffects(genotFitness = rnk))
246
-}
247 326
 
327
+## Additive model
328
+radd <- rfitness(4, model = "Additive", mu = 0.2, sd = 0.5)
329
+plot(radd)
330
+
331
+## Eggbox model
332
+regg = rfitness(g=4,model="Eggbox", e = 2, E=2.4)
333
+plot(regg)
334
+
335
+
336
+## Ising model
337
+ris = rfitness(g=4,model="Ising", i = 0.002, I=2)
338
+plot(ris)
339
+
340
+
341
+## Full model
342
+rfull = rfitness(g=4, model="Full", i = 0.002, I=2, 
343
+                 K = 2, r = TRUE,
344
+                 p = 0.2, P = 0.3, o = 0.3, O = 1)
345
+plot(rfull)
346
+}
248 347
 \keyword{ datagen }
249 348
 
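
The description of the "Additive" model in this commit can be made concrete with a short sketch. It is written in Python for self-containment and is not the package's implementation. It follows the text literally: each locus gets one contribution drawn from a Normal distribution with mean `mu` and standard deviation `sd`, and the log-fitness of a genotype is the sum of the contributions of its mutated loci. The function name `additive_landscape` is hypothetical, and whether or how the package transforms these log-fitness values afterwards is not covered here.

```python
import itertools
import random


def additive_landscape(g, mu, sd, rng):
    """Return per-locus effects and the log-fitness of all 2^g genotypes."""
    effects = [rng.gauss(mu, sd) for _ in range(g)]
    log_fitness = {
        genotype: sum(e for e, mutated in zip(effects, genotype) if mutated)
        for genotype in itertools.product((0, 1), repeat=g)
    }
    return effects, log_fitness


rng = random.Random(7)
effects, log_fitness = additive_landscape(4, mu=0.2, sd=0.5, rng=rng)

# No epistasis: mutating locus 0 changes log-fitness by the same amount
# in every genetic background.
for background in itertools.product((0, 1), repeat=3):
    delta = log_fitness[(1,) + background] - log_fitness[(0,) + background]
    print(background, round(delta, 6))
```

Every line of the printout shows the same difference, namely the effect drawn for locus 0, which is what "the effect of a mutation in a locus does not depend on the genetic background" means in practice.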

2.17.2: fno-common

ramon diaz-uriarte (at Phelsuma) authored on 17/12/2019 15:14:43
Showing 1 changed files
... ...
@@ -226,6 +226,7 @@ for the MAGELLAN code.
226 226
   \code{\link{evalAllGenotypes}}
227 227
   \code{\link{allFitnessEffects}}
228 228
   \code{\link{plotFitnessLandscape}}
229
+  \code{\link{Magellan_stats}}  
229 230
 
230 231
 }
231 232
 \examples{

2.17.1: rfitness, clarified log and truncate, and Magellan_stats, return vector and do not use log by default

ramon diaz-uriarte (at Phelsuma) authored on 28/11/2019 19:58:05
Showing 1 changed files
... ...
@@ -76,10 +76,14 @@ mean \code{mu} and standard deviation \code{sd}).}
76 76
   option can easily lead to landscapes with no accessible genotypes
77 77
   (even if you also use \code{scale}).
78 78
 
79
-  If "none", the fitness of the wildtype is not touched.  }
79
+  If "no", the fitness of the wildtype is not modified.  }
80 80
 
81 81
 
82
-\item{log}{If TRUE, log-transform fitness.}
82
+\item{log}{If TRUE, log-transform fitness. Actually, there are two
83
+  cases: if \code{wt_is_1 = "no"} we simply log the fitness values;
84
+  otherwise, we log the fitness values and add a 1, thus shifting all
85
+  fitness values, because by decree the fitness (birth rate) of the
86
+  wildtype must be 1.}
83 87
 
84 88
 \item{min_accessible_genotypes}{If not NULL, the minimum number of
85 89
   accessible genotypes in the fitness landscape. A genotype is
... ...
@@ -110,10 +114,12 @@ mean \code{mu} and standard deviation \code{sd}).}
110 114
   negative value for \code{accessible_th}.  }
111 115
 
112 116
 \item{truncate_at_0}{If TRUE (the default) any fitness <= 0 is
113
-  substituted by a small positive constant (1e-9). Why? Because
114
-  MAGELLAN and some plotting routines can have trouble (especially if you
115
-  log) with values <=0. Or we might have trouble if we want to log the
116
-  fitness.}
117
+  substituted by a small positive constant (a random uniform number
118
+  between 1e-10 and 1e-9). Why? Because MAGELLAN and some plotting
119
+  routines can have trouble (especially if you log) with values <=0. Or
120
+  we might have trouble if we want to log the fitness. This is done
121
+  after possibly taking logs. Noise is added to prevent creating several
122
+  identical minimal fitness values.}
117 123
 
118 124
 \item{K}{K for NK model; K is the number of loci with which each locus
119 125
   interacts, and the larger the K the larger the ruggedness of the
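
The truncation rule clarified in this commit interacts with the viability threshold mentioned in the 3.99.1 documentation (genotypes with fitness <= 1e-9 are removed by `allFitnessEffects`). The following Python sketch is not the package's R code, and the helper names `truncate_at_zero` and `drop_non_viable` are hypothetical. It shows why any genotype that gets truncated ends up non-viable: the replacement value is drawn between 1e-10 and 1e-9, so it never exceeds the threshold.

```python
import random

TRUNCATION_LOW = 1e-10   # lower bound of the replacement value (from the docs)
TRUNCATION_HIGH = 1e-9   # upper bound; also the documented viability threshold


def truncate_at_zero(fitness, rng):
    """Replace any fitness <= 0 by a small random positive value."""
    return {
        genotype: (rng.uniform(TRUNCATION_LOW, TRUNCATION_HIGH) if f <= 0 else f)
        for genotype, f in fitness.items()
    }


def drop_non_viable(fitness, threshold=TRUNCATION_HIGH):
    """Keep only genotypes whose fitness is strictly above the threshold."""
    return {g: f for g, f in fitness.items() if f > threshold}


rng = random.Random(1)
landscape = {"WT": 1.0, "A": 1.3, "B": -0.2, "A, B": 0.0}
truncated = truncate_at_zero(landscape, rng)
viable = drop_non_viable(truncated)
print(sorted(viable))  # ['A', 'WT']
```

The random draw also illustrates the stated reason for adding noise: the two truncated genotypes receive different values instead of sharing one identical minimum.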

2.15.1: Added MAGELLANs sources and functionality from MAGELLAN

ramon diaz-uriarte (at Phelsuma) authored on 02/07/2019 14:55:40
Showing 1 changed files
... ...
@@ -5,7 +5,7 @@
5 5
 \title{Generate random fitness.}
6 6
 
7 7
 \description{ Generate random fitness landscapes under a House of Cards,
8
-  Rough Mount Fuji, or additive model.  }
8
+  Rough Mount Fuji, additive model, and Kauffman's NK model.  }
9 9
 
10 10
 
11 11
 \usage{
... ...
@@ -13,7 +13,8 @@
13 13
 rfitness(g, c = 0.5, sd = 1, mu = 1, reference = "random", scale = NULL,
14 14
          wt_is_1 = c("subtract", "divide", "force", "no"),
15 15
          log = FALSE, min_accessible_genotypes = NULL,
16
-         accessible_th = 0, truncate_at_0 = TRUE)
16
+         accessible_th = 0, truncate_at_0 = TRUE,
17
+         K = 1, r = TRUE, model = c("RMF", "NK"))
17 18
 }
18 19
 
19 20
 
... ...
@@ -51,7 +52,7 @@ mean \code{mu} and standard deviation \code{sd}).}
51 52
   two-element vector, fitness is re-scaled between \code{scale[1]} (the
52 53
   minimum) and \code{scale[2]} (the maximum).}
53 54
 
54
-\item{wt_is_1}{If "divide" (the default) the fitness of all genotypes is
55
+\item{wt_is_1}{If "divide" the fitness of all genotypes is
55 56
   divided by the fitness of the wildtype (after possibly adding a value
56 57
   to ensure no negative fitness) so that the wildtype (the genotype with
57 58
   no mutations) has fitness 1. This is a case of scaling, and it is
... ...
@@ -60,7 +61,7 @@ mean \code{mu} and standard deviation \code{sd}).}
60 61
   likely that the final fitness will not respect the limits in
61 62
   \code{scale}.
62 63
 
63
-  If "subtract" we shift all the fitness values (subtracting fitness of
64
+  If "subtract" (the default) we shift all the fitness values (subtracting fitness of
64 65
   the wildtype and adding 1) so that the wildtype ends up with a fitness
65 66
   of 1. This is also applied after \code{scale}, so if you specify both
66 67
   "wt_is_1 = 'subtract'" and use an argument for \code{scale} it is most
... ...
@@ -114,13 +115,23 @@ mean \code{mu} and standard deviation \code{sd}).}
114 115
   log) with values <=0. Or we might have trouble if we want to log the
115 116
   fitness.}
116 117
 
118
+\item{K}{K for NK model; K is the number of loci with which each locus
119
+  interacts, and the larger the K the larger the ruggedness of the
120
+  landscape.}
121
+
122
+\item{r}{For the NK model, whether interacting loci are chosen at random
123
+  (\code{r = TRUE}) or are neighbors (\code{r = FALSE}).}
124
+
125
+\item{model}{One of "RMF" (default), for Rough Mount Fuji, or "NK", for
126
+  Kauffman's NK model.}
117 127
 } 
118 128
 
119 129
 
120 130
 \details{
121 131
 
122
-  The model used here follows the Rough Mount Fuji model in Szendro et
123
-  al., 2013 or Franke et al., 2011. Fitness is given as
132
+  When using \code{model = "RMF"}, the model used here follows
133
+  the Rough Mount Fuji model in Szendro et al., 2013 or Franke et al.,
134
+  2011. Fitness is given as
124 135
 
125 136
   \deqn{f(i) = -c d(i, reference) + x_i}
126 137
 
... ...
@@ -144,6 +155,15 @@ mean \code{mu} and standard deviation \code{sd}).}
144 155
   is different from zero. In this case, with \code{c} large, the range
145 156
   of the data can be large, especially if \code{g} (the number of genes)
146 157
   is large.
158
+
159
+
160
+  When using \code{model = "NK"}, the model used is Kauffman's NK model
161
+  (see details in Ferretti et al., or Brouillet et al., below), as
162
+  implemented in MAGELLAN
163
+  (\url{http://wwwabi.snv.jussieu.fr/public/Magellan/}). This fitness
164
+  landscape is generated by directly calling the \code{fl_generate}
165
+  function of MAGELLAN. Fitness is drawn from a uniform (0, 1)
166
+  distribution.
147 167
   
148 168
 } 
149 169
 
... ...
@@ -159,7 +179,12 @@ mean \code{mu} and standard deviation \code{sd}).}
159 179
   \code{accessible_th} that show the number of accessible
160 180
   genotypes under the specified  threshold.
161 181
 }
162
-  
182
+
183
+
184
+\note{MAGELLAN uses its own random number generating functions; using
185
+  \code{set.seed} does not allow you to obtain the same fitness landscape
186
+  repeatedly.}
187
+
163 188
 \references{
164 189
 
165 190
   Szendro I.~G. et al. (2013). Quantitative analyses of empirical
... ...
@@ -169,9 +194,23 @@ fitness landscapes. \emph{Journal of Statistical Mechanics: Theory and
169 194
 Franke, J. et al. (2011). Evolutionary accessibility of mutational
170 195
 pathways. \emph{PLoS Computational Biology\/}, \bold{7}(8), 1--9.
171 196
 
197
+Brouillet, S. et al. (2015). MAGELLAN: a tool to explore small fitness
198
+landscapes. \emph{bioRxiv},
199
+\bold{31583}. \url{http://doi.org/10.1101/031583}
200
+
201
+Ferretti, L., Schmiegelt, B., Weinreich, D., Yamauchi, A., Kobayashi,
202
+Y., Tajima, F., & Achaz, G. (2016). Measuring epistasis in fitness
203
+landscapes: The correlation of fitness effects of mutations. \emph{Journal of
204
+Theoretical Biology\/}, \bold{396}, 132--143. \url{https://doi.org/10.1016/j.jtbi.2016.01.037}
205
+
206
+MAGELLAN web site: \url{http://wwwabi.snv.jussieu.fr/public/Magellan/}
207
+
172 208
 }
173 209
 
174
-\author{ Ramon Diaz-Uriarte
210
+\author{ Ramon Diaz-Uriarte for the RMF and general wrapping
211
+code. S. Brouillet, G. Achaz, S. Matuszewski, H. Annoni, and L. Ferretti
212
+for the MAGELLAN code.
213
+
175 214
 }
176 215
 
177 216
 \seealso{
... ...
@@ -192,6 +231,11 @@ r1 <- rfitness(4)
192 231
 plot(r1)
193 232
 oncoSimulIndiv(allFitnessEffects(genotFitness = r1))
194 233
 
234
+
235
+## NK model
236
+rnk <- rfitness(5, K = 3, model = "NK")
237
+plot(rnk)
238
+oncoSimulIndiv(allFitnessEffects(genotFitness = rnk))
195 239
 }
196 240
 
197 241
 \keyword{ datagen }
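
The Rough Mount Fuji formula documented above, f(i) = -c d(i, reference) + x_i, can be sketched directly. The code below is in Python for self-containment and is not the package's R implementation. The function names `hamming` and `rmf_landscape` are hypothetical. It enumerates all genotypes for `g` loci, subtracts `c` times the Hamming distance to the reference, and adds a normal deviate with mean `mu` and standard deviation `sd`.

```python
import itertools
import random


def hamming(a, b):
    """Number of positions at which two genotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def rmf_landscape(g, c, mu, sd, reference, rng):
    """Fitness f(i) = -c * d(i, reference) + x_i, with x_i ~ Normal(mu, sd)."""
    return {
        genotype: -c * hamming(genotype, reference) + rng.gauss(mu, sd)
        for genotype in itertools.product((0, 1), repeat=g)
    }


rng = random.Random(42)
all_mutated = (1, 1, 1)

# sd = 0 with the all-mutated genotype as reference: fitness increases
# linearly with the number of mutated positions (the fully additive case).
linear = rmf_landscape(3, c=0.5, mu=1.0, sd=0.0, reference=all_mutated, rng=rng)
for genotype in sorted(linear, key=sum):
    print(genotype, linear[genotype])

# c = 0: fitness no longer depends on the distance to the reference
# (the House of Cards case).
house_of_cards = rmf_landscape(3, c=0.0, mu=1.0, sd=1.0, reference=all_mutated, rng=rng)
```

With these settings the wildtype `(0, 0, 0)` gets fitness -0.5 and each additional mutation adds 0.5, up to 1.0 for the reference genotype.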
ramon diaz-uriarte (at Phelsuma) authored on 17/04/2018 23:58:39
Showing 1 changed files
... ...
@@ -73,7 +73,7 @@ mean \code{mu} and standard deviation \code{sd}).}
73 73
   it is up to you to make sure that the range of the scale argument
74 74
   includes 1 or you might not get what you want). Note that using this
75 75
   option can easily lead to landscapes with no accessible genotypes
76
-  (unless you also use \code{scale}).
76
+  (even if you also use \code{scale}).
77 77
 
78 78
   If "none", the fitness of the wildtype is not touched.  }
79 79
 
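
The `wt_is_1` options touched by this commit differ in how they bring the wildtype to fitness 1. The Python sketch below is a simplified illustration, not the package's R code, and the function name `apply_wt_is_1` is hypothetical. One documented detail is deliberately left out: for "divide" the documentation mentions "possibly adding a value to ensure no negative fitness" first, and since that step is not specified here the sketch simply requires a positive wildtype fitness.

```python
def apply_wt_is_1(fitness, wildtype, mode):
    """Apply one of the documented wt_is_1 modes to a genotype -> fitness map."""
    f_wt = fitness[wildtype]
    if mode == "divide":
        # Simplification: the documented shift for negative values is omitted.
        if f_wt <= 0:
            raise ValueError("this sketch needs a positive wildtype fitness")
        return {g: f / f_wt for g, f in fitness.items()}
    if mode == "subtract":
        # Shift everything so that the wildtype lands exactly on 1.
        return {g: f - f_wt + 1 for g, f in fitness.items()}
    if mode == "force":
        # Only the wildtype changes; every other genotype keeps its value.
        return {g: (1.0 if g == wildtype else f) for g, f in fitness.items()}
    if mode == "no":
        return dict(fitness)
    raise ValueError(f"unknown mode: {mode}")


landscape = {"WT": 2.0, "A": 3.0, "B": 1.0}
for mode in ("divide", "subtract", "force", "no"):
    print(mode, apply_wt_is_1(landscape, "WT", mode))
```

In the "force" result the wildtype (1.0) is no longer fitter than `B` (1.0) and is further below `A` than before, which hints at why that option can change which genotypes are accessible.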

v.2.5.12

- Several improvements to rfitness.
- simOGraph using transitive reduction properly.
- Miscell documentation improvements.
- Updated citation to Bioinformatics paper.



git-svn-id: file:///home/git/hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/OncoSimulR@126818 bc3139a8-67e5-0310-9ffc-ced21a209358

Ramon Diaz-Uriarte authored on 18/02/2017 20:20:42
Showing 1 changed files
... ...
@@ -10,9 +10,10 @@
10 10
 
11 11
 \usage{
12 12
 
13
-rfitness(g, c = 0.5, sd = 1, reference = "random", scale = NULL,
14
-         wt_is_1 = TRUE, log = FALSE, min_accessible_genotypes = 0,
15
-         accessible_th = 0)
13
+rfitness(g, c = 0.5, sd = 1, mu = 1, reference = "random", scale = NULL,
14
+         wt_is_1 = c("subtract", "divide", "force", "no"),
15
+         log = FALSE, min_accessible_genotypes = NULL,
16
+         accessible_th = 0, truncate_at_0 = TRUE)
16 17
 }
17 18
 
18 19
 
... ...
@@ -26,7 +27,11 @@ rfitness(g, c = 0.5, sd = 1, reference = "random", scale = NULL,
26 27
     in Hamming distance from the reference genotype (see \code{reference}).}
27 28
 
28 29
   \item{sd}{The standard deviation of the random component (a normal
29
-  distribution of mean 0 and standard deviation \code{sd}).}
30
+  distribution of mean \code{mu} and standard deviation \code{sd}).}
31
+
32
+\item{mu}{The mean of the random component (a normal distribution of
33
+mean \code{mu} and standard deviation \code{sd}).}
34
+
30 35
 
31 36
 \item{reference}{The reference genotype: for the deterministic, additive
32 37
   part, this is the genotype with maximal fitness, and all other
... ...
@@ -46,15 +51,36 @@ rfitness(g, c = 0.5, sd = 1, reference = "random", scale = NULL,
46 51
   two-element vector, fitness is re-scaled between \code{scale[1]} (the
47 52
   minimum) and \code{scale[2]} (the maximum).}
48 53
 
49
-\item{wt_is_1}{If TRUE, fitness will be scaled so that the wildtype (the
50
-  genotype with no mutations) has fitness of 1. This is applied after
51
-  \code{scale}, so if you specify both it is most likely that the final
52
-  fitness will not respect the limits in \code{scale}.}
54
+\item{wt_is_1}{If "divide" (the default) the fitness of all genotypes is
55
+  divided by the fitness of the wildtype (after possibly adding a value
56
+  to ensure no negative fitness) so that the wildtype (the genotype with
57
+  no mutations) has fitness 1. This is a case of scaling, and it is
58
+  applied after \code{scale}, so if you specify both
59
+  "wt_is_1 = 'divide'" and use an argument for \code{scale} it is most
60
+  likely that the final fitness will not respect the limits in
61
+  \code{scale}.
62
+
63
+  If "subtract" we shift all the fitness values (subtracting fitness of
64
+  the wildtype and adding 1) so that the wildtype ends up with a fitness
65
+  of 1. This is also applied after \code{scale}, so if you specify both
66
+  "wt_is_1 = 'subtract'" and use an argument for \code{scale} it is most
67
+  likely that the final fitness will not respect the limits in
68
+  \code{scale} (though the distortion might be simpler to see as just a
69
+  shift up or down).
70
+  
71
+  If "force" we simply set the fitness of the wildtype to 1, without any
72
+  divisions. This means that the \code{scale} argument would work (but
73
+  it is up to you to make sure that the range of the scale argument
74
+  includes 1 or you might not get what you want). Note that using this
75
+  option can easily lead to landscapes with no accessible genotypes
76
+  (unless you also use \code{scale}).
77
+
78
+  If "none", the fitness of the wildtype is not touched.  }
53 79
 
54 80
 
55 81
 \item{log}{If TRUE, log-transform fitness.}
56 82
 
57
-\item{min_accessible_genotypes}{If larger than 0, the minimum number of
83
+\item{min_accessible_genotypes}{If not NULL, the minimum number of
58 84
   accessible genotypes in the fitness landscape. A genotype is
59 85
   considered accessible if you can reach it from the wildtype by going
60 86
   through at least one path where all changes in fitness are larger or
... ...
@@ -69,6 +95,10 @@ rfitness(g, c = 0.5, sd = 1, reference = "random", scale = NULL,
69 95
   If the condition is not satisfied, we continue generating random
70 96
   fitness landscapes with the specified parameters until the condition
71 97
   is satisfied.
98
+
99
+  (Why check against NULL and not against zero? Because this allows you
100
+  to count accessible genotypes even if you do not want to ensure a
101
+  minimum number of accessible genotypes.)
72 102
 }
73 103
 
74 104
 \item{accessible_th}{The threshold for the minimal change in fitness at
... ...
@@ -78,6 +108,12 @@ rfitness(g, c = 0.5, sd = 1, reference = "random", scale = NULL,
78 108
   allow small decreases in fitness in successive steps, use a small
79 109
   negative value for \code{accessible_th}.  }
80 110
 
111
+\item{truncate_at_0}{If TRUE (the default) any fitness <= 0 is
112
+  substituted by a small positive constant (1e-9). Why? Because
113
+  MAGELLAN and some plotting routines can have trouble (especially if you
114
+  log) with values <=0. Or we might have trouble if we want to log the
115
+  fitness.}
116
+
81 117
 } 
82 118
 
83 119
 
... ...
@@ -90,14 +126,25 @@ rfitness(g, c = 0.5, sd = 1, reference = "random", scale = NULL,
90 126
 
91 127
   where \eqn{d(i, j)} is the Hamming distance between genotypes \eqn{i}
92 128
   and \eqn{j} (the number of positions that differ) and \eqn{x_i} is a
93
-  random variable (in this case, a normal deviate of mean 0 and standard
94
-  deviation \code{sd}).
129
+  random variable (in this case, a normal deviate of mean \code{mu}
130
+  and standard deviation \code{sd}).
95 131
 
96 132
   Setting \eqn{c = 0} we obtain a House of Cards model. Setting \eqn{sd
97 133
     = 0} fitness is given by the distance from the reference and if the
98 134
     reference is the genotype with all positions mutated, then we have a
99 135
     fully additive model (fitness increases linearly with the number of
100 136
     positions mutated).
137
+
138
+  For OncoSimulR, we often want the wildtype to have a mean of
139
+  1. Reasonable settings are \code{mu = 1} and \code{wt_is_1 =
140
+  'subtract'} so that we simulate from a distribution centered at 1, and
141
+  we make sure afterwards (via a simple shift) that the wildtype is
142
+  actually 1. The \code{sd} controls the standard deviation, with the
143
+  usual working and meaning as in a normal distribution, unless \code{c}
144
+  is different from zero. In this case, with \code{c} large, the range
145
+  of the data can be large, especially if \code{g} (the number of genes)
146
+  is large.
147
+  
101 148
 } 
102 149
 
103 150
 \value{
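The Rough Mount Fuji formula and the "subtract" shift documented in this commit can be illustrated with a small standalone sketch. This is not the package's R implementation: `rmf_fitness` and `shift_wildtype_to_one` are hypothetical helper names, written in Python only to show the arithmetic, and the reference defaults to the all-mutated genotype (the analogue of `reference = "max"`).

```python
import itertools
import random


def hamming(a, b):
    """Number of positions at which two genotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def rmf_fitness(g, c=0.5, sd=1.0, mu=1.0, reference=None, seed=None):
    """Hypothetical helper: f(i) = -c * d(i, reference) + x_i, x_i ~ N(mu, sd)."""
    rng = random.Random(seed)
    genotypes = list(itertools.product((0, 1), repeat=g))
    if reference is None:
        reference = (1,) * g  # assumption: mimic reference = "max"
    return {gt: -c * hamming(gt, reference) + rng.gauss(mu, sd)
            for gt in genotypes}


def shift_wildtype_to_one(fitness):
    """Analogue of wt_is_1 = 'subtract': subtract wildtype fitness, add 1."""
    g = len(next(iter(fitness)))
    wildtype_fitness = fitness[(0,) * g]
    return {gt: f - wildtype_fitness + 1.0 for gt, f in fitness.items()}


# With sd = 0 and the all-mutated reference the landscape is fully additive:
# each added mutation raises fitness by c.
landscape = shift_wildtype_to_one(rmf_fitness(g=3, c=0.5, sd=0.0, mu=1.0))
```

Setting `c = 0` in the same sketch leaves only the random term, which corresponds to the House of Cards case described in the details section.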
Browse code

2.3.17; vignette: typos, decreased size and time

git-svn-id: file:///home/git/hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/OncoSimulR@121246 bc3139a8-67e5-0310-9ffc-ced21a209358

Ramon Diaz-Uriarte authored on 22/09/2016 16:47:10
Showing 1 changed files
... ...
@@ -34,7 +34,13 @@ rfitness(g, c = 0.5, sd = 1, reference = "random", scale = NULL,
34 34
   distance from this reference. If "random" a genotype will be randomly
35 35
   chosen as the reference. If "max" the genotype with all positions
36 36
   mutated will be chosen as the reference. If you pass a vector (e.g.,
37
-  \code{reference = c(1, 0, 1, 0)}) that will be the reference genotype.}
37
+  \code{reference = c(1, 0, 1, 0)}) that will be the reference genotype.
38
+  If "random2" a genotype will be randomly chosen as the reference. In
39
+  contrast to "random", however, not all genotypes have the same
40
+  probability of being chosen; here, what is equal is the probability
41
+  that the reference genotype has 1, 2, ..., g, mutations (and, once a
42
+  number of mutations is chosen, all genotypes with that number of
43
+  mutations have equal probability of being the reference). }
38 44
 
39 45
 \item{scale}{Either NULL (nothing is done) or a two-element vector. If a
40 46
   two-element vector, fitness is re-scaled between \code{scale[1]} (the
Browse code

2.3.11;\n - evalAllGenotypes: order = FALSE by default; \n - Clarified difference plotFitnessEffects and plotFitnessLandscape.

git-svn-id: file:///home/git/hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/OncoSimulR@120020 bc3139a8-67e5-0310-9ffc-ced21a209358

Ramon Diaz-Uriarte authored on 10/08/2016 15:47:33
Showing 1 changed files
... ...
@@ -23,7 +23,7 @@ rfitness(g, c = 0.5, sd = 1, reference = "random", scale = NULL,
23 23
   \item{g}{Number of genes.}
24 24
 
25 25
   \item{c}{The decrease in fitness of a genotype per each unit increase
26
-    in Hamming distance from the reference genotype (\code{reference}).}
26
+    in Hamming distance from the reference genotype (see \code{reference}).}
27 27
 
28 28
   \item{sd}{The standard deviation of the random component (a normal
29 29
   distribution of mean 0 and standard deviation \code{sd}).}
... ...
@@ -34,7 +34,7 @@ rfitness(g, c = 0.5, sd = 1, reference = "random", scale = NULL,
34 34
   distance from this reference. If "random" a genotype will be randomly
35 35
   chosen as the reference. If "max" the genotype with all positions
36 36
   mutated will be chosen as the reference. If you pass a vector (e.g.,
37
-  \code{fittest = c(1, 0, 1, 0)}) that will be the reference genotype.}
37
+  \code{reference = c(1, 0, 1, 0)}) that will be the reference genotype.}
38 38
 
39 39
 \item{scale}{Either NULL (nothing is done) or a two-element vector. If a
40 40
   two-element vector, fitness is re-scaled between \code{scale[1]} (the
... ...
@@ -101,7 +101,12 @@ rfitness(g, c = 0.5, sd = 1, reference = "random", scale = NULL,
101 101
   column denotes gene mutated/not-mutated. (For ease of use in other
102 102
   functions, this matrix has class  "genotype_fitness_matrix".) 
103 103
 
104
+  If you have specified \code{min_accessible_genotypes > 0}, the return
105
+  object has added attributes \code{accessible_genotypes} and
106
+  \code{accessible_th} that show the number of accessible
107
+  genotypes under the specified  threshold.
104 108
 }
109
+  
105 110
 \references{
106 111
 
107 112
   Szendro I.~G. et al. (2013). Quantitative analyses of empirical
Browse code

v.2.3.9. accessible genotypes

git-svn-id: file:///home/git/hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/OncoSimulR@119231 bc3139a8-67e5-0310-9ffc-ced21a209358

Ramon Diaz-Uriarte authored on 09/07/2016 13:43:10
Showing 1 changed files
... ...
@@ -4,16 +4,15 @@
4 4
 
5 5
 \title{Generate random fitness.}
6 6
 
7
-\description{
8
-  Generate random fitness under a House of Cards, Rough Mount Fuji, or
9
-  additive model.
10
-}
7
+\description{ Generate random fitness landscapes under a House of Cards,
8
+  Rough Mount Fuji, or additive model.  }
11 9
 
12 10
 
13 11
 \usage{
14 12
 
15 13
 rfitness(g, c = 0.5, sd = 1, reference = "random", scale = NULL,
16
-         wt_is_1 = TRUE, log = FALSE)
14
+         wt_is_1 = TRUE, log = FALSE, min_accessible_genotypes = 0,
15
+         accessible_th = 0)
17 16
 }
18 17
 
19 18
 
... ...
@@ -48,7 +47,34 @@ rfitness(g, c = 0.5, sd = 1, reference = "random", scale = NULL,
48 47
 
49 48
 
50 49
 \item{log}{If TRUE, log-transform fitness.}
50
+
51
+\item{min_accessible_genotypes}{If larger than 0, the minimum number of
52
+  accessible genotypes in the fitness landscape. A genotype is
53
+  considered accessible if you can reach it from the wildtype by going
54
+  through at least one path where all changes in fitness are larger than or
55
+  equal to \code{accessible_th}. The changes in fitness are considered
56
+  at each mutational step, i.e., at each addition of one mutation we
57
+  compute the difference between the genotype with \code{k + 1}
58
+  mutations minus the ancestor genotype with \code{k} mutations. Thus, a
59
+  genotype is considered accessible if there is at least one path where
60
+  fitness increases at each mutational step by at least
61
+  \code{accessible_th}.
62
+
63
+  If the condition is not satisfied, we continue generating random
64
+  fitness landscapes with the specified parameters until the condition
65
+  is satisfied.
51 66
 }
67
+
68
+\item{accessible_th}{The threshold for the minimal change in fitness at
69
+  each mutation step (i.e., between successive genotypes) that allows a
70
+  genotype to be regarded as accessible. This only applies if
71
+  \code{min_accessible_genotypes} is larger than 0.  So if you want to
72
+  allow small decreases in fitness in successive steps, use a small
73
+  negative value for \code{accessible_th}.  }
74
+
75
+} 
76
+
77
+
52 78
 \details{
53 79
 
54 80
   The model used here follows the Rough Mount Fuji model in Szendro et
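The accessibility criterion that this commit documents for `min_accessible_genotypes` and `accessible_th` can be sketched as follows. This is an illustrative Python sketch, not the package's code: `count_accessible` is a hypothetical name, the landscape is assumed to be a dict mapping 0/1 tuples to fitness, and excluding the wildtype from the count is an assumption of the sketch.

```python
def count_accessible(fitness, accessible_th=0.0):
    """Count genotypes reachable from the wildtype through single-mutation
    steps whose fitness change is >= accessible_th (wildtype not counted)."""
    g = len(next(iter(fitness)))
    wildtype = (0,) * g
    accessible = {wildtype}
    # Visit genotypes by increasing number of mutations, so that every
    # possible ancestor (one mutation fewer) is classified before its
    # descendants.
    for gt in sorted(fitness, key=sum):
        if gt == wildtype:
            continue
        for pos in range(g):
            if gt[pos] == 1:
                parent = gt[:pos] + (0,) + gt[pos + 1:]
                step = fitness[gt] - fitness.get(parent, float("inf"))
                if parent in accessible and step >= accessible_th:
                    accessible.add(gt)
                    break
    return len(accessible) - 1


toy = {(0, 0): 1.0, (1, 0): 1.2, (0, 1): 0.8, (1, 1): 1.5}
n_strict = count_accessible(toy, accessible_th=0.0)
n_lenient = count_accessible(toy, accessible_th=-0.25)
```

In the toy landscape, the genotype `(0, 1)` only becomes accessible once a small negative threshold tolerates the drop from 1.0 to 0.8, which is the use case the `accessible_th` entry mentions.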
Browse code

v. 2.3.3.\n mutator, fitness landscapes, rfitness, and many other changes

git-svn-id: file:///home/git/hedgehog.fhcrc.org/bioconductor/trunk/madman/Rpacks/OncoSimulR@118909 bc3139a8-67e5-0310-9ffc-ced21a209358

Ramon Diaz-Uriarte authored on 23/06/2016 16:43:51
Showing 1 changed files
1 1
new file mode 100644
... ...
@@ -0,0 +1,114 @@
1
+\name{rfitness}
2
+\alias{rfitness}
3
+
4
+
5
+\title{Generate random fitness.}
6
+
7
+\description{
8
+  Generate random fitness under a House of Cards, Rough Mount Fuji, or
9
+  additive model.
10
+}
11
+
12
+
13
+\usage{
14
+
15
+rfitness(g, c = 0.5, sd = 1, reference = "random", scale = NULL,
16
+         wt_is_1 = TRUE, log = FALSE)
17
+}
18
+
19
+
20
+
21
+
22
+\arguments{
23
+
24
+  \item{g}{Number of genes.}
25
+
26
+  \item{c}{The decrease in fitness of a genotype per each unit increase
27
+    in Hamming distance from the reference genotype (\code{reference}).}
28
+
29
+  \item{sd}{The standard deviation of the random component (a normal
30
+  distribution of mean 0 and standard deviation \code{sd}).}
31
+
32
+\item{reference}{The reference genotype: for the deterministic, additive
33
+  part, this is the genotype with maximal fitness, and all other
34
+  genotypes decrease their fitness by \code{c} for every unit of Hamming
35
+  distance from this reference. If "random" a genotype will be randomly
36
+  chosen as the reference. If "max" the genotype with all positions
37
+  mutated will be chosen as the reference. If you pass a vector (e.g.,
38
+  \code{fittest = c(1, 0, 1, 0)}) that will be the reference genotype.}
39
+
40
+\item{scale}{Either NULL (nothing is done) or a two-element vector. If a
41
+  two-element vector, fitness is re-scaled between \code{scale[1]} (the
42
+  minimum) and \code{scale[2]} (the maximum).}
43
+
44
+\item{wt_is_1}{If TRUE, fitness will be scaled so that the wildtype (the
45
+  genotype with no mutations) has fitness of 1. This is applied after
46
+  \code{scale}, so if you specify both it is most likely that the final
47
+  fitness will not respect the limits in \code{scale}.}
48
+
49
+
50
+\item{log}{If TRUE, log-transform fitness.}
51
+}
52
+\details{
53
+
54
+  The model used here follows the Rough Mount Fuji model in Szendro et
55
+  al., 2013 or Franke et al., 2011. Fitness is given as
56
+
57
+  \deqn{f(i) = -c d(i, reference) + x_i}
58
+
59
+  where \eqn{d(i, j)} is the Hamming distance between genotypes \eqn{i}
60
+  and \eqn{j} (the number of positions that differ) and \eqn{x_i} is a
61
+  random variable (in this case, a normal deviate of mean 0 and standard
62
+  deviation \code{sd}).
63
+
64
+  Setting \eqn{c = 0} we obtain a House of Cards model. Setting \eqn{sd
65
+    = 0} fitness is given by the distance from the reference and if the
66
+    reference is the genotype with all positions mutated, then we have a
67
+    fully additive model (fitness increases linearly with the number of
68
+    positions mutated).
69
+} 
70
+
71
+\value{
72
+  
73
+  A matrix with \code{g + 1} columns. Each column corresponds to a
74
+  gene, except the last one that corresponds to fitness. 1/0 in a gene
75
+  column denotes gene mutated/not-mutated. (For ease of use in other
76
+  functions, this matrix has class  "genotype_fitness_matrix".) 
77
+
78
+}
79
+\references{
80
+
81
+  Szendro I.~G. et al. (2013). Quantitative analyses of empirical
82
+fitness landscapes. \emph{Journal of Statistical Mechanics: Theory and
83
+  Experiment\/}, \bold{01}, P01005.
84
+
85
+Franke, J. et al. (2011). Evolutionary accessibility of mutational
86
+pathways. \emph{PLoS Computational Biology\/}, \bold{7}(8), 1--9.
87
+
88
+}
89
+
90
+\author{ Ramon Diaz-Uriarte
91
+}
92
+
93
+\seealso{
94
+  
95
+  \code{\link{oncoSimulIndiv}},
96
+  \code{\link{plot.genotype_fitness_matrix}},
97
+  \code{\link{evalAllGenotypes}},
97
+  \code{\link{allFitnessEffects}},
99
+  \code{\link{plotFitnessLandscape}}
100