Name Mode Size
.github 040000
R 040000
man 040000
tests 040000
vignettes 040000
.BBSoptions 100644 0 kb
.Rbuildignore 100644 0 kb
DESCRIPTION 100644 1 kb
LICENSE 100644 0 kb
LICENSE.md 100644 1 kb
NAMESPACE 100644 1 kb
NEWS.md 100644 1 kb
README.md 100644 2 kb
configure 100755 0 kb
configure.win 100755 0 kb
README.md
<img src="man/figures/sketchR.png" align="right" alt="sketchR" width="150"/> <br> # sketchR <br> <!-- badges: start --> [![R-CMD-check](https://github.com/fmicompbio/sketchR/workflows/R-CMD-check/badge.svg)](https://github.com/fmicompbio/sketchR/actions) <!-- badges: end --> `sketchR` provides a simple interface to the [`geosketch`](https://github.com/brianhie/geosketch) and [`scSampler`](https://github.com/SONGDONGYUAN1994/scsampler) python packages, which implement subsampling algorithms described in [Hie et al (2019)](https://www.cell.com/cell-systems/fulltext/S2405-4712(19)30152-8) and [Song et al (2022)](https://www.biorxiv.org/content/10.1101/2022.01.15.476407v1), respectively. The implementation makes use of the [`basilisk`](https://bioconductor.org/packages/basilisk/) package for interaction between R and python. ## Installation You can install `sketchR` from Bioconductor (release 3.19 onwards) using: ``` r if (!require("BiocManager", quietly = TRUE)) install.packages("BiocManager") BiocManager::install("sketchR") ``` ## Example ``` r library(sketchR) ## Create an example data matrix. Rows represent "samples" (the unit of ## downsampling), columns represent features (e.g., principal components). mat <- matrix(rnorm(5000), nrow = 500) ## Run geosketch. The output is a vector of indices, which you can use ## to subset the rows of the input matrix. idx <- geosketch(mat, N = 100) ## Run scSampler. As for geosketch, the output is a vector of indices. idx2 <- scsampler(mat, N = 100) ```