@@ -8,71 +8,31 @@
 1. Celda can simultaneously cluster genes into transcriptional states and cells into subpopulations
 2. Celda uses count-based Dirichlet-multinomial distributions so no additional normalization is required for 3' DGE single cell RNA-seq
 3. These types of models have shown good performance with sparse data.
+4. **Celda now includes DecontX, a computational algorithm for decontamination of droplet based scRNA-seq data.**
 
 
 ## Installation Instructions
 
-To install the most recent release of celda (used in the preprint version of the celda paper) via devtools:
+To install the most recent release of celda via devtools:
 ```
 library(devtools)
-install_github("campbio/celda@v0.6")
-```
-The most up-to-date (but potentially less stable) version of celda can similarly be installed with:
-```
-install_github("campbio/celda@devel")
+install_github("campbio/celda")
 ```
 
 **NOTE** On OSX, devtools::install_github() requires installation of **libgit2.** This can be installed via homebrew:
 ```
 brew install libgit2
 ```
-**NOTE** If you install celda in Rstudio and get an error:could not find tools necessary to compile a package, you can try this:
+**NOTE** If you are trying to install celda using Rstudio and get this error: "could not find tools necessary to compile a package", you can try this:
 ```
 options(buildtools.check = function(action) TRUE)
 ```
 
 ## Examples and vignettes
 
-Vignettes are available in the package.
-
-An analysis example using celda with RNASeq via vignette('celda-analysis')
-
-
-### Decontamination with DecontX
-Highly expressed genes from various cells clusters will be expressed at low levels in other clusters in droplet-based systems due to contamination. DecontX will decompose an observed count matrix into a decontaminated expression matrix and a contamination matrix. The only other parameter needed is a vector of cell cluster labels.
-
-To simulate two 300 (gene) x 100 (cell) count matrices from 3 different cell types with total reads per cell ranged from 5000 to 40000: one matrix being ture expression matrix (rmat), the other matrix being contamination count matrix (cmat)
-```
-sim.con = simulateContaminatedMatrix( C = 100, G = 300, K = 3, N.Range= c(5000, 40000), seed = 9124)
-true.contamination.percentage = colSums( sim.con$cmat ) / colSums( sim.con$cmat + sim.con$rmat )
-str(sim.con)
-# N.by.C: total transcripts per cell
-# z: cell type label
-
-```
-Use DecontX to decompose the observed (contaminated) count matrix back into true expression matrix and a contamination matrix with specified cell label
-```
-observedCounts = sim.con$observedCounts
-cell.label = sim.con$z
-new.counts = DecontX( counts = observedCounts, z = cell.label, max.iter = 200, seed = 123)
-str(new.counts)
-# Decontaminated matrix: new.counts$res.list$est.rmat
-# Percentage of contamination per cell: new.counts$res.list$est.conp
-
-```
-DecontX Performance check
-```
-estimated.contamination.percentage = new.counts$res.list$est.conp
-plot( true.contamination.percentage, estimated.contamination.percentage) ; abline(0,1)
-```
-
-
-
-## New Features and announcements
-The v0.4 release of celda represents a useable implementation of the various celda clustering models.
-Please submit any usability issues or bugs to the issue tracker at https://github.com/campbio/celda
+Uncompiled vignettes are available in the package.
 
-You can discuss celda, or ask the developers usage questions, in our [Google Group.](https://groups.google.com/forum/#!forum/celda-list)
+Examples of doing single-cell RNA-seq data analysis using celda and DecontX is available in files vignettes/celda-analysis.Rmd and vignettes/DecontX-analysis.Rmd.
 
 ## For developers
 Check out our [Wiki](https://github.com/campbio/celda/wiki) for [coding style guide](https://github.com/campbio/celda/wiki/Celda-Development-Coding-Style-Guide) if you want to contribute!