---
output:
  html_document:
    keep_md: yes
---

# ncdfFlow: A package that provides HDF5 based storage for cytometry data.

This package extends the `flowCore` infrastructure by storing the large volume of event-level data on disk in HDF5 format and keeping only the file handle and metadata in memory. Thus the memory consumption is significantly reduced.

### INSTALLATION

```{r, echo=FALSE}
library(knitr)
opts_chunk$set(message = FALSE, warning = FALSE, fig.height= 3, fig.width= 5)
```

```{r, eval=FALSE}
# First, install it from Bioconductor so that all dependent packages are pulled in automatically
library(BiocInstaller)
biocLite("ncdfFlow")

# or install the latest version from GitHub using the devtools package
install.packages("devtools")
library(devtools) # load it
install_github("RGLab/ncdfFlow", ref="trunk")
```

### Unix/Linux/Mac users

To build the ncdfFlow package from source, make sure that the HDF5 library is present on your system.

If HDF5 is installed in a non-standard location, you may pass the configure argument `--with-hdf5` to point to the correct location of HDF5, for example:

```bash
# install from GitHub (run this line in an R session)
install_github('RGLab/ncdfFlow', ref='trunk', args='--configure-args="--with-hdf5=<path-to-hdf>"')

# or install from a locally downloaded tarball (run this line in a shell)
R CMD INSTALL ncdfFlow_x.y.z.tar.gz --configure-args="--with-hdf5='<path-to-hdf>'"
```

Under `<path-to-hdf>`, you should find `include` and `lib` sub-folders that contain the HDF5 headers and shared libraries.

Also, make sure to add the path of `libhdf5.so` (it should be the `lib` subfolder of `<path-to-hdf>`) to your `LD_LIBRARY_PATH` environment variable so that it can be found at runtime.
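The expected layout of the HDF5 location can be checked before building. The following is a minimal sketch and not part of ncdfFlow: `check_hdf5_prefix` is a hypothetical helper name, and the check assumes that the main header is `hdf5.h` and that the library files are named `libhdf5.*`.

```bash
# Hypothetical helper (not part of ncdfFlow): check that an HDF5 install
# prefix contains the "include" and "lib" sub-folders described above.
# Prints what is missing and returns non-zero on the first problem found.
check_hdf5_prefix() {
  if [ ! -d "$1/include" ]; then
    echo "missing: $1/include"
    return 1
  fi
  if [ ! -d "$1/lib" ]; then
    echo "missing: $1/lib"
    return 1
  fi
  # assumption: hdf5.h is the main public HDF5 header
  if [ ! -f "$1/include/hdf5.h" ]; then
    echo "missing: $1/include/hdf5.h"
    return 1
  fi
  # assumption: library files are named libhdf5.* (.so, .dylib or .a)
  if ! ls "$1"/lib/libhdf5.* >/dev/null 2>&1; then
    echo "missing: libhdf5 under $1/lib"
    return 1
  fi
  echo "ok: $1"
}
```

Calling `check_hdf5_prefix <path-to-hdf>` with the same location that is passed to `--with-hdf5` prints `ok: <path-to-hdf>` when all four checks pass. A passing check only confirms the folder layout; it does not guarantee that the HDF5 version is compatible with the package.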
```bash
# prepend the HDF5 lib directory so that libhdf5.so is found at runtime
export LD_LIBRARY_PATH=<path-to-hdf>/lib:$LD_LIBRARY_PATH
```

### Create `ncdfFlowSet` object

```{r}
library(ncdfFlow)

# read from FCS files
path <- system.file("extdata", "compdata", "data", package = "flowCore")
files <- list.files(path, full.names = TRUE)[1:3]
fs <- read.ncdfFlowSet(files = files) # equivalent to flowCore::read.flowSet

# or convert an existing flowSet into an ncdfFlowSet
data(GvHD)
fs <- GvHD[1:4]
fs <- ncdfFlowSet(fs)
fs
```

### Use it the same way as `flowSet` (except it is memory efficient and fast)

```{r}
pData(fs)
sampleNames(fs)
keyword(fs, "FILENAME")
colnames(fs)
length(fs)
fs[[1]]
fs[2:3]
```