<!--
%\VignetteEngine{knitr}
%\VignetteIndexEntry{Common use of ComplexHeatmap package}
-->

Common use of ComplexHeatmap package
========================================

**Author**: Zuguang Gu ( z.gu@dkfz.de )

**Date**: `r Sys.Date()`

-------------------------------------------------------------

This vignette covers only the most common use cases of the ComplexHeatmap package.
The package is highly flexible, and users can find the complete reference
in []().

```{r global_settings, echo = FALSE, message = FALSE}
library(markdown)
options(markdown.HTML.options = c(options('markdown.HTML.options')[[1]], "toc"))

library(knitr)
knitr::opts_chunk$set(
    error = FALSE,
    tidy  = FALSE,
    message = FALSE,
    fig.align = "center"
)
options(markdown.HTML.stylesheet = "custom.css")

options(width = 100)

library(circlize)
library(ComplexHeatmap)
```

First we load the circlize and ComplexHeatmap packages. The circlize
package is often used together with ComplexHeatmap to generate color mapping
functions.

```{r}
library(circlize)
library(ComplexHeatmap)
```

In this vignette, we demonstrate the ComplexHeatmap package with randomly
generated DNA methylation and gene expression data.

```{r}
load(system.file("extdata", "random_meth_expr_data.RData", package = "ComplexHeatmap"))
```

The data variables are:

- `mat_meth`: The methylation matrix for 1000 DMRs in 20 samples. The value in
  the matrix is the mean methylation of all CpGs in a DMR.
- `mat_expr`: The gene expression matrix. The $i^{th}$ row is the gene with
  the TSS closest to the $i^{th}$ DMR in `mat_meth`. The samples are the same as in `mat_meth`.

The annotation variables for samples are:

- `anno`: The annotation data frame. There are five annotations:
    - `type`: Whether the sample is a tumor sample or a control sample.
    - `gender`: Whether the patient is male or female. There are two `NA`
      values in it.
    - `age`: Age of the patient. It is a numeric annotation.
    - `mut1` and `mut2`: Whether the sample has a mutation in each of the two genes.
      The values are logical.
- `anno_col`: The color of annotations in `anno`.

```{r}
anno
anno_col
```

The annotation variables for DMRs or the associated genes are:

- `direction`: The direction of methylation, i.e. whether the DMR is
  hyper-methylated in tumor samples.
- `cor_pvalue`: The p-value for the correlation test between DMR methylation
  and gene expression.
- `gene_type`: Gene types, e.g. protein-coding gene, or lincRNA.
- `tss_dist`: The distance from DMR to the nearest TSS.
- `anno_gene`: Annotation to genes, e.g. TSS, intragenic, intergenic.
- `anno_states`: The proportion of each DMR that is covered by a given
  chromatin state. There are three chromatin states in this
  data frame: active TSS state, enhancer state and repressive state.

Note that we did not set colors for the annotations of DMRs or genes. We
will demonstrate how random colors are assigned to them when the colors are not
set.

In practice, this combination of data types is very common in epigenomic
studies, which often include DNA methylation, gene expression and
histone modification data, or a subset of them. We show here how the ComplexHeatmap
package helps with the integrative analysis of multiple datasets to find the
associations hidden in them.

## A Single Heatmap with annotations

A single heatmap is the most common way to visualize matrix-like data.

Drawing a heatmap is straightforward. The only mandatory argument is the
matrix. However, a heatmap can have several components: names or labels beside
the heatmap, dendrograms, annotations for rows or columns, and the
title of the heatmap. All these components can be added with the `Heatmap()`
function. `Heatmap()` has a large number of arguments that give
precise control over the heatmap components, and users can refer to ...

In the following example, we add column and row dendrograms, column annotations
and column names. Dendrograms and row/column names are added naturally: if
the matrix has row names or column names, they are added to the heatmap by
default, and if clustering is turned on, dendrograms are added to the
heatmap as well.

Annotations are a little more complex to configure. Because ComplexHeatmap
aims to provide flexible control over many types of annotations, the package
provides the `HeatmapAnnotation()` function to construct heatmap annotations.

In the following example, apart from `col`, `border` and `annotation_legend_param`,
all other arguments specify single annotations, which are combined into one global
heatmap annotation. The simplest annotation is a heatmap-like annotation, for which you
only need to specify a numeric or character vector (e.g. the `type` and
`gender` annotations). A heatmap-like annotation can also be a matrix (e.g. the
`mutation` annotation), in which case it is drawn as a multi-row
or multi-column annotation whose rows or columns share one color mapping. Moreover,
an annotation can be a so-called "complex annotation" that is defined by an
annotation function. An annotation function is defined by the user, who can
draw basically anything with it.

The ComplexHeatmap package already provides several pre-defined annotation functions.
In the following example, `anno_points()` generates an annotation function from the data and the
settings (check the value returned by `anno_points(1:10)`).
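As a minimal sketch of the three kinds of annotations described above, the toy example below builds one vector annotation, one matrix annotation and one annotation function, and inspects the value returned by `anno_points()`. The data and the names `group`, `flags` and `score` are made up for illustration and are not part of the vignette data.

```r
library(ComplexHeatmap)

set.seed(123)

# Hypothetical toy data for 10 columns (not part of the vignette dataset)
group = rep(c("a", "b"), times = 5)              # vector: simple annotation
flags = cbind(f1 = rep(c("yes", "no"), times = 5),
              f2 = rep(c("no", "yes"), each = 5)) # matrix: multi-row simple annotation
pt = anno_points(runif(10))                      # annotation function: complex annotation

# Inspect what anno_points() returns
class(pt)

# Combine the single annotations into one column annotation.
# No `col` is given here, so colors are assigned randomly.
ha = HeatmapAnnotation(group = group, flags = flags, score = pt)
names(ha)
```

The names of the arguments become the names of the single annotations, which is what `col` and `annotation_legend_param` refer to.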

Colors for the annotations and their legends are controlled by `col`. `col` only controls colors for
"simple annotations", which are specified by a vector or a matrix. The value of `col`
should be a named list whose names correspond to the names of
the annotations (e.g. `mutation` in the following example), because that is how
`col` is connected to the individual annotations. Discrete annotations (e.g. character vectors)
take their colors from a named vector, and continuous annotations take their colors from a
color mapping function generated by `circlize::colorRamp2()`. You can check
the value of `anno_col` for an example.
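The structure of such a color list can be sketched as follows. This is only an illustration: the annotation names `type` and `age`, the level names and the break values are assumptions, and the actual content of `anno_col` may differ.

```r
library(circlize)

# Continuous annotation: a color mapping function built from break values.
# The breaks and colors are arbitrary choices for illustration.
age_col_fun = colorRamp2(c(0, 40, 80), c("white", "orange", "red"))
age_col_fun(c(10, 40, 70))   # returns one color per input value

# Discrete annotation: a named vector with one color per level.
# The level names are hypothetical.
type_col = c("tumor" = "red", "control" = "blue")

# The names of the list must match the names of the simple annotations
# passed to HeatmapAnnotation().
col_list = list(type = type_col, age = age_col_fun)
str(col_list, max.level = 1)
```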

In the following code, we also customize the legend for the mutation annotation:
the levels in `mutation` are `TRUE` and `FALSE`, and we change the labels to `has mutation`
and `no mutation`.

```{r}
Heatmap(mat_meth, name = "methylation",
	top_annotation = HeatmapAnnotation(
		type = anno$type,
		gender = anno$gender,
		age = anno_points(anno$age, ylim = c(0, 80)),
		mutation = as.matrix(anno[, c("mut1", "mut2")]),
		col = anno_col,
		border = c(mutation = TRUE),
		annotation_legend_param = list(
			mutation = list(
				at = c("TRUE", "FALSE"),
				labels = c("has mutation", "no mutation")
		))
	), column_title = "Differential Methylated Regions")
```

As you may notice, the legends are arranged in two columns. The reason is
that we assume the matrix itself carries the major information,
especially when several heatmaps are added horizontally, while the column
annotations carry secondary information. However, if you want to merge the
heatmap legends and annotation legends, you need to draw the
heatmap explicitly with the `draw()` function and specify `merge_legends = TRUE`.

Also, as mentioned before, a heatmap has components on its four sides. We
can put the title to the left of the heatmap by setting `row_title`, and we can put
the annotation at the bottom of the heatmap by switching to
`bottom_annotation`. We can also move the annotation names to
the left by setting the `annotation_name_side` argument in
`HeatmapAnnotation()`.

```{r}
ht = Heatmap(mat_meth, name = "methylation",
	bottom_annotation = HeatmapAnnotation(
		df = anno,
		annotation_name_side = "left"
	), row_title = "Differential Methylated Regions")
draw(ht, merge_legends = TRUE)
```

We can also set left and right annotations, which work similarly to top and
bottom annotations. The main difference is that you need to use `rowAnnotation()`
or `HeatmapAnnotation(..., which = "row")` to construct row annotations.
For the `anno_*()` functions, if you specify them inside `rowAnnotation()`, you
don't need to ...

```{r}
Heatmap(mat_meth, name = "methylation",
	top_annotation = HeatmapAnnotation(
		type = anno$type,
		gender = anno$gender,
		age = anno_points(anno$age, ylim = c(0, 80)),
		col = anno_col
	), 
	right_annotation = rowAnnotation(
		anno_gene = anno_gene,
		tss_dist = anno_points(tss_dist, size = unit(0.5, "mm"), 
			width = unit(2, "cm"))
	),
	column_title = "Differential Methylated Regions")
```

The ComplexHeatmap package supports splitting heatmaps by rows and/or by columns.
The split can be made by k-means clustering, by cutting the dendrograms,
or by a categorical data frame. In the following example, we simply split the
columns into 2 groups and the rows into 4 groups by k-means clustering.

```{r, fig.width = 8}
ht = Heatmap(mat_meth, name = "methylation",
	right_annotation = rowAnnotation(
		direction = direction,
		pvalue = -log10(cor_pvalue),
		anno_gene = anno_gene,
		gene_type = gene_type,
		tss_dist = anno_points(tss_dist, size = unit(0.5, "mm"), 
			width = unit(2, "cm")),
		states = as.matrix(anno_states),
		col = list(
			pvalue = colorRamp2(c(0, 2, 4), c("green", "white", "red")),
			states = colorRamp2(c(0, 1), c("white", "orange"))),
		annotation_legend_param = list(
			pvalue = list(at = c(0, 2, 4), labels = c("1", "0.01", "0.0001")))
	),
	show_column_names = FALSE,
	column_title = "Differential Methylated Regions",
	column_km = 2, row_km = 4)
draw(ht)
```
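The legend labels for `pvalue` in the code above follow from the `-log10()` transformation: the breaks 0, 2 and 4 on the transformed scale correspond to p-values of 1, 0.01 and 0.0001. A quick check in plain R:

```r
# p-values and the positions they take on the -log10 scale
p = c(1, 0.01, 0.0001)
at = -log10(p)
at

# Going back from the transformed scale to p-values
10^(-at)
```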

When k-means splitting and data frame splitting are both provided, they are combined.

```{r, fig.width = 8}
draw(ht, row_km = 2, row_split = direction)
```
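As a rough sketch of splitting by a categorical data frame, the toy example below splits the rows by the combinations of two factors and then uses `row_order()` to list the row indices in each slice. The random matrix and the grouping variables `g1` and `g2` are made up for illustration and are not part of the vignette data.

```r
library(ComplexHeatmap)

set.seed(123)

# Hypothetical toy matrix with 12 rows and 6 columns
m = matrix(rnorm(72), nrow = 12)

# Two made-up categorical variables; every combination holds 3 rows
split_df = data.frame(
    g1 = rep(c("a", "b"), each = 6),
    g2 = rep(c("x", "y"), times = 6)
)

ht = Heatmap(m, name = "toy", row_split = split_df)
ht = draw(ht)   # draw() returns the object with the final layout

# With split rows, row_order() returns a list with one index vector per slice
ro = row_order(ht)
lengths(ro)
```

Each combination of levels in the data frame defines one row slice, so two variables with two levels each give four slices here.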

## Heatmap List

One unique advantage of ComplexHeatmap is that it supports concatenating a list of heatmaps and annotations.
The `+` operator adds them horizontally.


```{r}
meth_col_fun = colorRamp2(c(0, 0.5, 1), c("blue", "white", "red"))
expr_col_fun = colorRamp2(c(-2, 0, 2), c("green", "white", "red"))
ht_list = Heatmap(mat_meth, name = "methylation", col = meth_col_fun,
		column_title = "Methylation") + 
	Heatmap(mat_expr, name = "expression", col = expr_col_fun,
		column_title = "Expression")
draw(ht_list, row_km = 4)
```

As mentioned, row annotations can be attached to the heatmap by `left_annotation` or
by `right_annotation`. They can also be separated from the heatmap and added to the heatmap list,
so `Heatmap(..., right_annotation = rowAnnotation(...))` is similar to `Heatmap(...) + rowAnnotation(...)`.

```{r}
ht_list = Heatmap(mat_meth, name = "methylation", col = meth_col_fun,
		column_title = "Methylation") + 
	Heatmap(mat_expr, name = "expression", col = expr_col_fun,
		column_title = "Expression") +
	rowAnnotation(anno_gene = anno_gene,
		tss_dist = anno_points(tss_dist, size = unit(0.5, "mm"), 
			width = unit(2, "cm"))
	)
draw(ht_list, row_km = 4)
```

ComplexHeatmap also supports adding heatmaps vertically. You just need to change the `+` operator
to `%v%`.

```{r}
ht_list = Heatmap(mat_meth[1:40, ], name = "methylation", col = meth_col_fun,
		row_km = 2, row_title = "Methylation", show_column_names = FALSE) %v%
	Heatmap(mat_expr[1:40, ], name = "expression", col = expr_col_fun,
		row_km = 2, row_title = "Expression")
draw(ht_list, column_km = 2)
```

Similarly, column annotations can be separated from the heatmap and added to the list.

```{r}
ht_list = Heatmap(mat_meth[1:40, ], name = "methylation", col = meth_col_fun,
		row_km = 2, row_title = "Methylation", show_column_names = FALSE) %v%
	columnAnnotation(
		type = anno$type,
		gender = anno$gender,
		age = anno_points(anno$age, ylim = c(0, 80)),
		mutation = as.matrix(anno[, c("mut1", "mut2")]),
		col = anno_col,
		annotation_name_side = "left"
	) %v%
	Heatmap(mat_expr[1:40, ], name = "expression", col = expr_col_fun,
		row_km = 2, row_title = "Expression")
draw(ht_list, column_km = 2)
```

## Density as a heatmap

```{r}
densityHeatmap(mat_meth[1:40, ], ylab = "methylation values", 
	title = "Methylation distribution in samples")
```

```{r, fig.height = 10}
densityHeatmap(mat_meth[1:40, ], ylab = "methylation values", 
	show_column_names = FALSE,
	title = "Methylation distribution in samples",
	top_annotation = HeatmapAnnotation(type = anno$type, col = anno_col)) %v%
columnAnnotation(
	gender = anno$gender,
	age = anno_points(anno$age, ylim = c(0, 80)),
	col = anno_col
) %v%
Heatmap(mat_expr[1:40, ], name = "expression", col = expr_col_fun,
		row_km = 2, row_title = "Expression", heatmap_height = unit(6, "cm"))
```

## OncoPrint

```{r, fig.width = 10}
mat = readRDS(system.file("extdata", "tcga_lung_adenocarcinoma_provisional_ras_raf_mek_jnk_signalling.rds",
	package = "ComplexHeatmap"))
alter_fun = list(
    background = function(x, y, w, h) {
        grid.rect(x, y, w-unit(0.5, "mm"), h-unit(0.5, "mm"), gp = gpar(fill = "#CCCCCC", col = NA))
    },
    HOMDEL = function(x, y, w, h) {
        grid.rect(x, y, w-unit(0.5, "mm"), h-unit(0.5, "mm"), gp = gpar(fill = "blue", col = NA))
    },
    AMP = function(x, y, w, h) {
        grid.rect(x, y, w-unit(0.5, "mm"), h-unit(0.5, "mm"), gp = gpar(fill = "red", col = NA))
    },
    MUT = function(x, y, w, h) {
        grid.rect(x, y, w-unit(0.5, "mm"), h*0.33, gp = gpar(fill = "#008000", col = NA))
    }
)
col = c("MUT" = "#008000", "AMP" = "red", "HOMDEL" = "blue")
oncoPrint(mat, get_type = function(x) strsplit(x, ";")[[1]],
    alter_fun = alter_fun, col = col, 
    remove_empty_columns = TRUE, remove_empty_rows = TRUE,
    column_title = "OncoPrint for TCGA Lung Adenocarcinoma, genes in Ras Raf MEK JNK signalling",
    heatmap_legend_param = list(title = "Alterations", at = c("AMP", "HOMDEL", "MUT"),
        labels = c("Amplification", "Deep deletion", "Mutation")))
```
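In the `oncoPrint()` call above, `get_type` splits the string in each cell of the matrix into individual alteration types. A minimal illustration in plain R, using made-up cell values (the actual strings in `mat` may look different):

```r
# Same function as passed to `get_type` above
get_type = function(x) strsplit(x, ";")[[1]]

# Hypothetical cell value with two alterations for one gene in one sample
get_type("MUT;AMP")

# Hypothetical cell value with a single alteration
get_type("HOMDEL")
```

Each type returned by `get_type` is then drawn with the function of the same name in `alter_fun`.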



## Stacked plot

## Session Info

```{r}
sessionInfo()
```