#### 2.99.4

ramon diaz-uriarte (at Phelsuma) authored on 17/12/2020 15:07:07
Showing 1 changed files
    @@ -97,7 +97,10 @@ diversityLOD(llod)
      descent in a given simulation. In v. 2.9.1 we also returned the LOD
      as explained above. Now we only return the LOD as defined above.
     
    -
    + Beware, however, that if you use multiple initial mutants the LOD
    +function will probably not do what you want. It is not even clear that
    +the LOD is well defined in this case. We are working on this.
    +
     }
     
     \value{

#### v. 2.99.3

ramon diaz-uriarte (at Phelsuma) authored on 13/12/2020 14:35:47
Showing 1 changed files
    @@ -168,7 +168,13 @@ pancr <- allFitnessEffects(data.frame(parent = c("Root", rep("KRAS", 4), "SMAD4"
     
     
     pancr1 <- oncoSimulIndiv(pancr, model = "Exp")
    -pancr8 <- oncoSimulPop(8, pancr, model = "Exp",
    +
    +RNGkind("L'Ecuyer-CMRG")
    +set.seed(3)
    +pancr8 <- oncoSimulPop(3, pancr, model = "Exp",
    + finalTime = 600,
    + onlyCancer = TRUE,
    + seed = NULL,
      mc.cores = 2)
     
     POM(pancr1)
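The `RNGkind("L'Ecuyer-CMRG")` and `set.seed(3)` lines added in this commit look like the usual base R idiom for making forked parallel runs reproducible. A minimal sketch of that idiom, using only the base `parallel` package rather than OncoSimulR; `draw_one` and `run_batch` are hypothetical stand-ins, and how `oncoSimulPop` consumes the seed internally is not shown here:

```r
library(parallel)

# Hypothetical stand-in for one stochastic simulation run.
draw_one <- function(i) runif(1)

# Forking is not available on Windows, where mc.cores must be 1.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

run_batch <- function(seed) {
  # Note: RNGkind() changes the generator for the whole R session.
  RNGkind("L'Ecuyer-CMRG")
  set.seed(seed)
  unlist(mclapply(1:3, draw_one, mc.cores = n_cores))
}

first  <- run_batch(3)
second <- run_batch(3)

# Compare the two batches; resetting the seed before each call
# should give the same draws.
identical(first, second)
```

With the default Mersenne-Twister generator, forked children are seeded independently of `set.seed`, which is presumably why the example switches generators first.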

#### 2.17.9: POM doc and adapt to stringsAsFactors = FALSE

ramon diaz-uriarte (at Phelsuma) authored on 17/03/2020 12:31:37
Showing 1 changed files
    @@ -46,7 +46,7 @@ diversityLOD(llod)
     \item{llod}{A list of LODs, as returned from \code{LOD} on an object of
      class \code{oncosimulpop}.}
     
    -\item{...}{Other arguments passed to methods (ignored now).}
    +% \item{...}{Other arguments passed to methods (ignored now).}
     }
     
     \details{

#### v. 2.9.2 - LOD: using only the strict Szendro et al. meaning. - POM: computed in C++. - Using fitness landscape directly when given as input (no conversion to epistasis)

ramon diaz-uriarte (at Phelsuma) authored on 24/11/2017 12:41:48
Showing 1 changed files
    @@ -34,8 +34,11 @@ diversityLOD(llod)
     
     \arguments{ \item{x}{An object of class \code{oncosimulpop} (version >=
      2, so simulations with the old poset specification will not work) or
    - class \code{oncosimul2} (a single simulation). For \code{LOD}
    - simulations must have been run with \code{keepPhylog = TRUE}.}
    + class \code{oncosimul2} (a single simulation). }
    +
    +%% \item{strict}{If TRUE, a single LOD as in Szendro et al. See Details.
    +%% If FALSE, simulations must have been run with \code{keepPhylog = TRUE}
    +%% to compute all possible LODs (see Details).}
     
     \item{lpom}{A list of POMs, as returned from \code{POM} on an object of
      class \code{oncosimulpop}.}
    @@ -49,35 +52,50 @@ diversityLOD(llod)
     \details{
     
      Lines of Descent (LOD) and Path of the Maximum (POM) were defined in
    - Szendro et al. (2013) and I follow those definitions here as closely
    - as possible, as applied to a process in continuous time with sampling
    - at user-specified periods.
    -
    - For POM, the results can depend strongly on how often we sample and
    - keep samples (i.e., the \code{sampleEvery} and \code{keepEvery}
    - arguments to \code{oncoSimulIndiv} and \code{oncoSimulPop}), since the
    - POM is computed from the values stored in the \code{pops.by.time}
    - matrix. This also explains why it is generally meaningless to use POM on
    - \code{oncoSimulSample} runs: these only keep the very last sample.
    -
    -
    - For LOD my implementation is not exactly identical to the definition
    - given in p. 572 of Szendro et al. (2013). First, in case this might be
    - useful, for each simulation I keep all the paths that
    - "(...) arrive at the most populated genotype at the final time" (first
    - paragraph in p. 572 of Szendro et al.), whereas they only keep one
    - (see second column of p. 572). However, I do provide a single LOD for
    - each run, too. This is the first path to arrive at the genotype that
    - eventually becomes the most populated genotype at the final time (and,
    - in this sense, agrees with the LOD of Szendro et al.). However, in
    - contrast to what is apparently done in Szendro
    - ("A given genotype may undergo several episodes of colonization and extinction that are stored by the algorithm, and the last episode before the colonization of the final state is used to construct the step."),
    - I do not check that this genotype (which is the one that will become
    - the most populated at final time) does not become extinct before the
    - final colonization. So there could be other paths (all in
    - \code{all_paths}) that are actually the one(s) that are colonizers of
    - the most populated genotype (with no extinction before the final
    - colonization).
    + Szendro et al. (2013) and I follow those definitions here, as applied
    + to a process in continuous time with sampling at user-specified
    + periods.
    +
    + For POM, the results can depend strongly on how often we sample (i.e.,
    + the \code{sampleEvery} argument to \code{oncoSimulIndiv} and
    + \code{oncoSimulPop}), since the POM is computed by finding the clone
    + with largest population size whenever we sample.%% from the values
    + %% stored in the \code{pops.by.time} matrix.
    + This also explains why
    + it is generally meaningless to use POM on \code{oncoSimulSample} runs:
    + these only keep the very last sample.
    +
    +
    + For LOD, %% when using \code{strict = TRUE},
    + a single LOD per simulation
    + is returned, with the same meaning as that in p. 572 of Szendro et
    + al. (2013). "A given genotype may undergo several episodes of colonization and extinction that are stored by the algorithm, and the last episode before the colonization of the final state is used to construct the step.",
    + and I check that this genotype (which is the one that will become the
    + most populated at final time) does not become extinct before the final
    + colonization.
    +
    + %% If \code{strict = FALSE}, and if you have run the simulations with
    + %% \code{keepPhylog = TRUE}, then a I return both \code{all_paths} and
    + %% \code{lod_single}, with meanings as follow. First, in case this might
    + %% be useful, for each simulation I keep all the paths that
    + %% "(...) arrive at the most populated genotype at the final time" (first
    + %% paragraph in p. 572 of Szendro et al.), and these are stored in
    + %% \code{all_paths}. When \code{strict = FALSE} I also provide another
    + %% single LOD for each run, too. This is the first path to arrive at the
    + %% genotype that eventually becomes the most populated genotype at the
    + %% final time (and, in this sense, agrees with the LOD of Szendro et
    + %% al.). However, in contrast to what is done in Szendro
    + %% ("A given genotype may undergo several episodes of colonization and extinction that are stored by the algorithm, and the last episode before the colonization of the final state is used to construct the step.")
    + %% and when \code{strict = TRUE}, I do not check that this genotype
    + %% (which is the one that will become the most populated at final time)
    + %% does not become extinct before the final colonization. So there could
    + %% be other paths (all in \code{all_paths}) that are actually the one(s)
    + %% that are colonizers of the most populated genotype (with no extinction
    + %% before the final colonization).
    +
    + Note \emph{breaking changes}: for LOD we used to return all lines of
    + descent in a given simulation. In v. 2.9.1 we also returned the LOD
    + as explained above. Now we only return the LOD as defined above.
     
     
     }
    @@ -89,18 +107,29 @@ diversityLOD(llod)
      the ordered set of genotypes that contain the largest subpopulation at
      the times of sampling.
     
    - For \code{LOD}, if \code{x} is a single simulation, a two-element
    - list. The first, \code{all_paths}, contains all paths to the
    - maximum. The second, \code{lod_single}, contain the single LOD which
    - is closest in meaning to the original definition of Szendro et
    - al. (See "Details"). If \code{x} is a list (population) of
    - simulations, then a list where each element is a two-element list, as
    - just explained. All the lists contain objects of class "igraph.vs" (an
    - igraph vertex sequence: see \code{\link[igraph]{vertex_attr}}).
    + For \code{LOD}, if \code{x} is a single simulation, the line of
    + descent as defined above (either an object of class "igraph.vs" (an
    + igraph vertex sequence: see \code{\link[igraph]{vertex_attr}}) or a
    + character vector if there were no descendants). If \code{x} is a list
    + (population) of simulations, then a list where each element is a list
    + as just explained.
    +
    + %% a two-element
    + %% list. If \code{strict = TRUE}, only \code{lod_single} is returned. If
    + %% \code{strict = FALSE} (and simulations were run with \code{keepPhylog
    + %% = TRUE}), \code{all_paths} contains all paths to the maximum, and
    + %% \code{lod_single} contains the single LOD which first arrives at the
    + %% maximum.
    +
    + %% If \code{x} is a list (population) of simulations, then a list
    + %% where each element is a two-element list, as just explained.
    + %% All the lists
    + %% contain objects of class "igraph.vs" (an igraph vertex sequence: see
    + %% \code{\link[igraph]{vertex_attr}}).
     
      For \code{diversityLOD} and \code{diversityPOM} a single element
    - vector with the Shannon's diversity (entropy) of the \code{lod_single}
    - (for \code{diversityLOD}) or of the POMs (for \code{diversityPOM}).
    + vector with the Shannon's diversity (entropy) of the LODs (for
    + \code{diversityLOD}) or of the POMs (for \code{diversityPOM}).
     
     }
     
    @@ -138,8 +167,8 @@ pancr <- allFitnessEffects(data.frame(parent = c("Root", rep("KRAS", 4), "SMAD4"
      typeDep = "MN"))
     
     
    -pancr1 <- oncoSimulIndiv(pancr, model = "Exp", keepPhylog = TRUE)
    -pancr8 <- oncoSimulPop(8, pancr, model = "Exp", keepPhylog = TRUE,
    +pancr1 <- oncoSimulIndiv(pancr, model = "Exp")
    +pancr8 <- oncoSimulPop(8, pancr, model = "Exp",
      mc.cores = 2)
     
     POM(pancr1)
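The `\value` text in this commit says `diversityLOD` and `diversityPOM` return the Shannon diversity (entropy) of a set of paths. As a rough illustration of what such a quantity measures, here is a sketch in plain R. It assumes each path is a character vector and uses the natural logarithm; `path_entropy` and `toy_paths` are hypothetical names, and the package's own implementation (including how it handles `igraph.vs` objects) may differ:

```r
# Shannon entropy (natural log) of a collection of paths.
# `paths` is a list of character vectors, one per simulation.
path_entropy <- function(paths) {
  # Collapse each path into a single string so identical paths compare equal.
  keys <- vapply(paths, paste, character(1), collapse = " -> ")
  # Relative frequency of each distinct path.
  p <- as.numeric(table(keys)) / length(keys)
  -sum(p * log(p))
}

# Made-up paths for illustration only (not simulation output).
toy_paths <- list(
  c("", "KRAS", "KRAS, SMAD4"),
  c("", "KRAS", "KRAS, SMAD4"),
  c("", "KRAS", "KRAS, TP53"),
  c("", "KRAS")
)

path_entropy(toy_paths)
```

If every simulation follows the same path the entropy is 0, and it grows as the simulations spread over more distinct paths, which is why the quantity is only meaningful for a set of simulations.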

#### v.2.5.2 - Lots and lots of addition to vignette including benchmarks. - Diversity of sampled genotypes. - Genotyping error can be added in samplePop. - LOD and POM (lines of descent, path of maximum, sensu Szendro et al.). - simOGraph can also out rT data frames. - Better (and better explained) estimates of simulation error for McFL.

    new file mode 100644
    @@ -0,0 +1,168 @@
    +\name{POM}
    +\alias{POM}
    +\alias{LOD}
    +\alias{diversityPOM}
    +\alias{diversityLOD}
    +\alias{POM.oncosimul2}
    +\alias{LOD.oncosimul2}
    +\alias{POM.oncosimulpop}
    +\alias{LOD.oncosimulpop}
    +
    +
    +\title{
    + Obtain Lines of Descent and Paths of the Maximum and their diversity from simulations.
    +}
    +
    +\description{
    +
    + Compute Lines of Descent (LOD) and Path of the Maximum (POM) for a
    + single simulation or a set of simulations (from \code{oncoSimulPop}).
    +
    + \code{diversityPOM} and \code{diversityLOD} return the Shannon's
    + diversity (entropy) of the POM and LOD, respectively, of a set of
    + simulations (it makes no sense to compute those from a single simulation).
    +
    +}
    +
    +\usage{
    +
    +POM(x)
    +LOD(x)
    +diversityPOM(lpom)
    +diversityLOD(llod)
    +}
    +
    +\arguments{ \item{x}{An object of class \code{oncosimulpop} (version >=
    + 2, so simulations with the old poset specification will not work) or
    + class \code{oncosimul2} (a single simulation). For \code{LOD}
    + simulations must have been run with \code{keepPhylog = TRUE}.}
    +
    +\item{lpom}{A list of POMs, as returned from \code{POM} on an object of
    + class \code{oncosimulpop}.}
    +
    +\item{llod}{A list of LODs, as returned from \code{LOD} on an object of
    + class \code{oncosimulpop}.}
    +
    +\item{...}{Other arguments passed to methods (ignored now).}
    +}
    +
    +\details{
    +
    + Lines of Descent (LOD) and Path of the Maximum (POM) were defined in
    + Szendro et al. (2013) and I follow those definitions here as closely
    + as possible, as applied to a process in continuous time with sampling
    + at user-specified periods.
    +
    + For POM, the results can depend strongly on how often we sample and
    + keep samples (i.e., the \code{sampleEvery} and \code{keepEvery}
    + arguments to \code{oncoSimulIndiv} and \code{oncoSimulPop}), since the
    + POM is computed from the values stored in the \code{pops.by.time}
    + matrix. This also explains why it is generally meaningless to use POM on
    + \code{oncoSimulSample} runs: these only keep the very last sample.
    +
    +
    + For LOD my implementation is not exactly identical to the definition
    + given in p. 572 of Szendro et al. (2013). First, in case this might be
    + useful, for each simulation I keep all the paths that
    + "(...) arrive at the most populated genotype at the final time" (first
    + paragraph in p. 572 of Szendro et al.), whereas they only keep one
    + (see second column of p. 572). However, I do provide a single LOD for
    + each run, too. This is the first path to arrive at the genotype that
    + eventually becomes the most populated genotype at the final time (and,
    + in this sense, agrees with the LOD of Szendro et al.). However, in
    + contrast to what is apparently done in Szendro
    + ("A given genotype may undergo several episodes of colonization and extinction that are stored by the algorithm, and the last episode before the colonization of the final state is used to construct the step."),
    + I do not check that this genotype (which is the one that will become
    + the most populated at final time) does not become extinct before the
    + final colonization. So there could be other paths (all in
    + \code{all_paths}) that are actually the one(s) that are colonizers of
    + the most populated genotype (with no extinction before the final
    + colonization).
    +
    +
    +}
    +
    +\value{
    +
    + For \code{POM} either a character vector (if \code{x} is a single
    + simulation) or a list of character vectors. Each character vector is
    + the ordered set of genotypes that contain the largest subpopulation at
    + the times of sampling.
    +
    + For \code{LOD}, if \code{x} is a single simulation, a two-element
    + list. The first, \code{all_paths}, contains all paths to the
    + maximum. The second, \code{lod_single}, contain the single LOD which
    + is closest in meaning to the original definition of Szendro et
    + al. (See "Details"). If \code{x} is a list (population) of
    + simulations, then a list where each element is a two-element list, as
    + just explained. All the lists contain objects of class "igraph.vs" (an
    + igraph vertex sequence: see \code{\link[igraph]{vertex_attr}}).
    +
    + For \code{diversityLOD} and \code{diversityPOM} a single element
    + vector with the Shannon's diversity (entropy) of the \code{lod_single}
    + (for \code{diversityLOD}) or of the POMs (for \code{diversityPOM}).
    +
    +}
    +
    +\references{
    +
    + Szendro, I. G., Franke, J., Visser, J. A. G. M. de, & Krug,
    + J. (2013). Predictability of evolution depends nonmonotonically on
    + population size. \emph{Proceedings of the National Academy of Sciences},
    + 110(2), 571-576. \url{https://doi.org/10.1073/pnas.1213613110}
    +
    +}
    +
    +\author{
    + Ramon Diaz-Uriarte
    +}
    +
    +\seealso{
    + \code{\link{oncoSimulPop}}, \code{\link{oncoSimulIndiv}}
    +
    +}
    +
    +\examples{
    +
    +######## Using a poset for pancreatic cancer from Gerstung et al.
    +### (s and sh are made up for the example; only the structure
    +### and names come from Gerstung et al.)
    +
    +pancr <- allFitnessEffects(data.frame(parent = c("Root", rep("KRAS", 4), "SMAD4", "CDNK2A",
    + "TP53", "TP53", "MLL3"),
    + child = c("KRAS","SMAD4", "CDNK2A",
    + "TP53", "MLL3",
    + rep("PXDN", 3), rep("TGFBR2", 2)),
    + s = 0.05,
    + sh = -0.3,
    + typeDep = "MN"))
    +
    +
    +pancr1 <- oncoSimulIndiv(pancr, model = "Exp", keepPhylog = TRUE)
    +pancr8 <- oncoSimulPop(8, pancr, model = "Exp", keepPhylog = TRUE,
    + mc.cores = 2)
    +
    +POM(pancr1)
    +LOD(pancr1)
    +
    +POM(pancr8)
    +LOD(pancr8)
    +
    +diversityPOM(POM(pancr8))
    +diversityLOD(LOD(pancr8))
    +
    +
    +
    +}
    +
    +\keyword{manip}
    +\keyword{univar}
    +
    +
    +
    +
    +
    +
    +
    +
    +
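This first version of the help page describes the POM as the ordered set of genotypes holding the largest subpopulation at the sampling times, computed from the `pops.by.time` matrix. A sketch of that idea on a toy matrix follows. The layout (time in the first column, one column per genotype), the collapsing of consecutive repeats, and the names `pops`, `genotype_labels` and `pom_sketch` are assumptions made for illustration, not the package's code:

```r
# Toy stand-in for a pops.by.time matrix: first column is time,
# remaining columns are population sizes of each genotype.
pops <- cbind(
  time = c(0, 1, 2, 3, 4),
  wt   = c(100, 90, 40, 10, 5),
  A    = c(0, 20, 80, 60, 30),
  AB   = c(0, 0, 5, 70, 200)
)
genotype_labels <- colnames(pops)[-1]

pom_sketch <- function(pops, labels) {
  # Index of the most abundant genotype at each sampling time.
  top <- apply(pops[, -1, drop = FALSE], 1, which.max)
  # Keep one entry per run of consecutive repeats.
  labels[rle(top)$values]
}

pom_sketch(pops, genotype_labels)
```

Dropping rows from `pops` (that is, sampling less often) can remove intermediate genotypes from the result, which is the dependence on sampling frequency that the Details section warns about.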