---
output: 
  html_document: 
    keep_md: yes
---
# ncdfFlow: A package that provides HDF5 based storage for cytometry data.

This package extends the `flowCore` infrastructure by storing the large volume of event-level data on disk as `HDF` format and only keeps the file handler and meta data in memory. Thus the memory consumption is significantly reduced.

### INSTALLATION

```{r, echo=FALSE}
library(knitr)
opts_chunk$set(message = FALSE, warning = FALSE, fig.height= 3, fig.width= 5)
```

```{r, eval=FALSE}
# First, install it from bionconductor so that it will pull all the dependent packages automatically
library(BiocInstalller)
bicLite(ncdfFlow) 

# or install the latest version from github using devtools package 
install.packages("devtools") 
library(devtools) #load it
install_github("RGLab/ncdfFlow", ref="trunk")

```

### Unix/Linux/Mac users

To build the ncdfFlow package from source, make sure that HDF5 Library is present on your system:

If HDF5 is installed to some non-standard location, you may pass the environment variable --with-hdf5 to point to the correct location of HDF5, for example,
```bash
#install from github
install_github('RGLab/ncdfFlow', ref='trunk', args='--configure-args="--with-hdf5=<path-to-hdf>"')
 
#or install from locally downloaded tar ball
R CMD INSTALL ncdfFlow_x.y.z.tar.gz --configure-args="--with-hdf5='<path-to-hdf>'"
```
under '/path/to', you should find "include" and "lib" sub-folders that contain HDF5 headers and shared libraries. 

Also, make sure add the path of `libhdf5.so` (should be `lib` subfolder of `<path-to-hdf>`) to your environment variable `LD_LIBRARY_PATH` so that it can be found at runtime.

```bash
export LD_LIBRARY_PATH=<path-to-hdf>/lib:LD_LIBRARY_PATH
```

### Create `ncdfFlowSet` object

```{r}
library(ncdfFlow)

#read from FCS files
path <- system.file("extdata","compdata","data",package="flowCore")
files <- list.files(path,full.names=TRUE)[1:3]
fs <- read.ncdfFlowSet(files=files) #equivalent to flowCore::read.flowSet


#or convert the existing flowSet into ncdfFlowSet
data(GvHD)
fs <- GvHD[1:4]
fs <- ncdfFlowSet(fs)
fs
```

### Use it as the same way as `flowSet` (except it is memory efficient and fast)
```{r}

pData(fs)
sampleNames(fs)
keyword(fs,"FILENAME")
colnames(fs)
length(fs)
fs[[1]]
fs[2:3]

```