@@ -22,15 +22,8 @@ knitr::opts_chunk$set(
 )
 ```
 
-# Installation
 
-First of all we need to install PsiNorm:
-
-```{r, eval = FALSE}
-if(!requireNamespace("BiocManager", quietly = TRUE))
-    install.packages("BiocManager")
-BiocManager::install("PsiNorm")
-```
+# Introduction
 
 ```{r, message=FALSE, warning=FALSE}
 library(SingleCellExperiment)
@@ -40,14 +33,18 @@ library(cluster)
 library(scone)
 ```
 
-# Introduction
-
-PsiNorm is a new scalable between-sample normalization for single cell RNA-seq count data based on the power-law Pareto type I distribution. It can be demonstrated that the Pareto parameter is inversely proportional to the sequencing depth, it is sample specific and its estimate can be obtained for each cell independently. PsiNorm computes the shape parameter for each cellular sample and then uses it as multiplicative size factor to normalize the data. The final goal of the transformation is to align the gene expression distribution especially for those genes characterised by high expression. Note that, similar to other global scaling methods, our method does not remove batch effects, which can be dealt with downstream tools.
+PsiNorm is a scalable between-sample normalization for single cell RNA-seq count data based on the power-law Pareto type I distribution. It can be demonstrated that the Pareto parameter is inversely proportional to the sequencing depth, it is sample specific and its estimate can be obtained for each cell independently. PsiNorm computes the shape parameter for each cellular sample and then uses it as multiplicative size factor to normalize the data. The final goal of the transformation is to align the gene expression distribution especially for those genes characterised by high expression. Note that, similar to other global scaling methods, our method does not remove batch effects, which can be dealt with downstream tools.
 
 To evaluate the ability of PsiNorm to remove technical bias and reveal the true cell similarity structure, we used both an unsupervised and a supervised approach.
 We first simulate a scRNA-seq experiment with four known clusters using the _splatter_ Bioconductor package. Then in the unsupervised approach, we i) reduce dimentionality using PCA, ii) identify clusters using the _clara_ partitional method and then we iii) computed the Adjusted Rand Index (ARI) to compare the known and the estimated partition.
 
-In the supervised approach, we i) reduce dimentionality using PCA, and we ii) compute the silhouette index of the known partition in the reduced dimensional space.
+In the supervised approach, we i) reduce dimentionality using PCA, and we ii) compute the silhouette index of the known partition in the reduced dimensional space.
+
+# Citation
+
+If you use `PsiNorm` in publications, please cite the following article:
+
+Borella, M., Martello, G., Risso, D., & Romualdi, C. (2021). PsiNorm: a scalable normalization for single-cell RNA-seq data. bioRxiv. https://doi.org/10.1101/2021.04.07.438822.
 
 # Data Simulation
 
@@ -165,20 +162,15 @@ See Section 3.2 of the "Introduction to SCONE" vignette for an example on how to
 
 # Using PsiNorm with Seurat
 
-The PsiNorm normalization method can be used as a replacement for Seurat's default normalization methods. To do so, we need to first normalize the data stored in a `SingleCellExperiment` object and then coerce that object to a Seurat object. This can be done with the `as.Seurat` function provided in the `Seurat` package.
+The PsiNorm normalization method can be used as a replacement for Seurat's default normalization methods. To do so, we need to first normalize the data stored in a `SingleCellExperiment` object and then coerce that object to a Seurat object. This can be done with the `as.Seurat` function provided in the `Seurat` package (tested with Seurat 4.0.3).
 
-```{r seurat, message=FALSE, warning=FALSE}
+```{r seurat, eval=FALSE}
 library(Seurat)
+sce <- PsiNorm(sce)
+sce <- logNormCounts(sce)
 seu <- as.Seurat(sce)
 ```
 
-Note that the `data` slot of the Seurat object will contained the PsiNorm log-normalized data.
-
-```{r check}
-head(seu@assays$RNA@data[,1:3])
-head(logcounts(sce)[,1:3])
-```
-
 From this point on, one can continue the analysis with the recommended Seurat workflow, but using PsiNorm log-normalized data.
 
 # Using PsiNorm with HDF5 files