
Improvements to read.bismark()

- Now uses fread()'s native support of compressed files
- Supports fread()'s nThread argument
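
A minimal sketch of what the two points above rely on, assuming data.table 1.11.8 or later with the R.utils package installed: `fread()` reads a gzip-compressed file directly, and its thread-count argument (spelled `nThread` in data.table) limits the threads used for parsing. The file contents and column names are made up for illustration and are not taken from `read.bismark()` itself.

```r
# Illustrative only: the file layout below is hypothetical, not a real
# Bismark output format.
library(data.table)

# Write a tiny gzip-compressed, tab-separated file to a temporary path.
path <- tempfile(fileext = ".txt.gz")
con <- gzfile(path, open = "wt")
writeLines(c("chr\tpos\tM\tCov",
             "chr1\t100\t3\t5",
             "chr1\t200\t0\t4"), con)
close(con)

# fread() decompresses .gz input itself (via R.utils), so no manual
# gunzip step is needed; nThread caps the number of parsing threads.
dt <- fread(path, nThread = 2L)
print(dt)
```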

Peter Hickey authored on 07/10/2018 03:37:11
Showing 1 changed file
@@ -1,6 +1,6 @@
 test_that(".isHDF5ArrayBacked()", {
     matrix <- matrix(1:10, ncol = 2)
-    da <- bsseq:::.DelayedMatrix(matrix)
+    da <- DelayedArray(matrix)
     ha <- realize(da, BACKEND = "HDF5Array")
     ha_ha <- rbind(ha, ha)
     da_da <- rbind(da, da)

Work in progress: refactoring bsseq

- BSseq objects can once again use ordinary matrix objects as assays.
- Reimplement `BSmooth()` more-or-less from scratch:
    - Switch from 'parallel' to 'BiocParallel' for parallelization. This brings some notable improvements:
        - Smoothed results can now be written directly to an on-disk realization backend by the worker. This dramatically reduces memory usage compared to previous versions of 'bsseq' that required all results be retained in-memory.
        - Parallelization is now supported on Windows through the use of a 'SnowParam' object as the value of `BPPARAM`.
        - Improved error handling makes it possible to gracefully resume `BSmooth()` jobs that encountered errors by re-doing only the necessary tasks.
        - Detailed and extensive job logging facilities.
    - Fix bug in `BSmooth()` with the `maxGap` parameter.
- Re-factor BSseq() constructor and add fast, internal .BSseq() constructor.
- Re-factor collapseBSseq() and combine(). Should be much more performant.
- Use beachmat to implement fast validity checking of 'M' and 'Cov' matrices.
- Resave BS.chr22 (supplied data) using integer for storage mode of assays to reduce size.
- Switch from RUnit to testthat. testthat has better integration with code coverage tools that help when refactoring.
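
As a rough illustration of the `SnowParam` point above, the sketch below runs a trivial task through BiocParallel with a `SnowParam` back-end, which starts separate R worker processes and therefore does not depend on forking. The `square()` function is a hypothetical stand-in for real per-sample work; per the notes above, the same kind of object would be passed as `BPPARAM` to `BSmooth()`.

```r
library(BiocParallel)

# SnowParam launches independent R worker processes, so it also works on
# platforms (such as Windows) where fork-based parallelism is unavailable.
param <- SnowParam(workers = 2L)

# Hypothetical task standing in for real per-sample work.
square <- function(i) i^2

# Run the task over four inputs using the SnowParam back-end.
res <- bplapply(1:4, square, BPPARAM = param)
print(unlist(res))
```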

Peter Hickey authored on 28/05/2018 23:42:18
Showing 1 changed file
new file mode 100644
@@ -0,0 +1,32 @@
+test_that(".isHDF5ArrayBacked()", {
+    matrix <- matrix(1:10, ncol = 2)
+    da <- bsseq:::.DelayedMatrix(matrix)
+    ha <- realize(da, BACKEND = "HDF5Array")
+    ha_ha <- rbind(ha, ha)
+    da_da <- rbind(da, da)
+    ha_ha_ha_ha <- rbind(ha_ha, ha_ha)
+    da_da_da_da <- rbind(da_da, da_da)
+    da_da_ha_ha <- rbind(da_da, ha_ha)
+
+    expect_true(!bsseq:::.isHDF5ArrayBacked(matrix))
+    expect_true(bsseq:::.isHDF5ArrayBacked(ha))
+    expect_true(!bsseq:::.isHDF5ArrayBacked(da))
+    expect_true(bsseq:::.isHDF5ArrayBacked(ha_ha))
+    expect_true(!bsseq:::.isHDF5ArrayBacked(da_da))
+    expect_true(bsseq:::.isHDF5ArrayBacked(ha_ha_ha_ha))
+    expect_true(!bsseq:::.isHDF5ArrayBacked(da_da_da_da))
+    expect_true(bsseq:::.isHDF5ArrayBacked(da_da_ha_ha))
+
+    # A complicated DelayedArray with another DelayedArray as a seed
+    skip("TODO: Remove these tests if no longer required")
+    da_with_da_seed <- bsseq:::.collapseDelayedMatrix(x = da,
+                                                      sp = list(1:2, 3:5),
+                                                      MARGIN = 2)
+    ha_with_ha_seed <- bsseq:::.collapseDelayedMatrix(
+        x = ha,
+        sp = list(1:2, 3:5),
+        MARGIN = 2,
+        BACKEND = "HDF5Array")
+    expect_true(!bsseq:::.isHDF5ArrayBacked(da_with_da_seed))
+    expect_true(bsseq:::.isHDF5ArrayBacked(ha_with_ha_seed))
+})