vignettes/ggtree.Rmd
25ce9608
 ---
3780dc24
 title: "ggtree: tree visualization and annotation"
5ffee357
 author: "Guangchuang Yu\\
25ce9608
 
5ffee357
         School of Basic Medical Sciences, Southern Medical University"
25ce9608
 date: "`r Sys.Date()`"
f1b9814b
 output:
9c4b3c86
   prettydoc::html_pretty:
     toc: false
     theme: cayman
     highlight: github
245479db
   pdf_document:
25ce9608
     toc: true
 vignette: >
c69c6bd7
   %\VignetteIndexEntry{ggtree}
25ce9608
   %\VignetteEngine{knitr::rmarkdown}
   %\usepackage[utf8]{inputenc}
 ---
 
 ```{r style, echo=FALSE, results="asis", message=FALSE}
 knitr::opts_chunk$set(tidy = FALSE,
 		   message = FALSE)
 ```
 
 
3be56e90
 ```{r echo=FALSE}
 CRANpkg <- function (pkg) {
     cran <- "https://CRAN.R-project.org/package"
     fmt <- "[%s](%s=%s)"
     sprintf(fmt, pkg, cran, pkg)
 }
25ce9608
 
3be56e90
 Biocpkg <- function (pkg) {
     sprintf("[%s](http://bioconductor.org/packages/%s)", pkg, pkg)
 }
25ce9608
 
3be56e90
 ```
25ce9608
 
c69c6bd7
 # Vignette
 
 Please go to <https://yulab-smu.github.io/treedata-book/> for the full vignette.
25ce9608
 
 
 # Citation
6788b024
 
38554c25
 If you use `r Biocpkg('ggtree')` in published research, please cite the most appropriate paper(s) from this list:
25ce9608
 
3780dc24
 
 1. __G Yu__^\*^, TTY Lam, H Zhu, Y Guan^\*^. Two methods for mapping and visualizing associated data on phylogeny using ggtree. __*Molecular Biology and Evolution*__, 2018, 35(2):3041-3043.
 doi: [10.1093/molbev/msy194](https://doi.org/10.1093/molbev/msy194).
 2. __G Yu__, DK Smith, H Zhu, Y Guan, TTY Lam^\*^. ggtree: an R package for
3be56e90
 visualization and annotation of phylogenetic trees with their covariates and
 other associated data. __*Methods in Ecology and Evolution*__. 2017, 8(1):28-36.
 doi: [10.1111/2041-210X.12628](https://doi.org/10.1111/2041-210X.12628).
3780dc24
 
 
 
 
 # Need helps?
 
 
 If you have questions/issues, please visit
 [ggtree homepage](https://guangchuangyu.github.io/software/ggtree/) first.
 Your problems are mostly documented. 
 
 
 If you think you found a bug, please follow
 [the guide](https://guangchuangyu.github.io/2016/07/how-to-bug-author/) and
 provide a reproducible example to be posted
 on
 [github issue tracker](https://github.com/GuangchuangYu/ggtree/issues).
 For questions, please post
 to [google group](https://groups.google.com/forum/#!forum/bioc-ggtree). Users
 are highly recommended to subscribe to the [mailing list](https://groups.google.com/forum/#!forum/bioc-ggtree).
 
38554c25
 
5540ce8d
 
25ce9608
 
c69c6bd7
 <!--
 
25ce9608
 # Introduction
 
3be56e90
 This project arose from our needs to annotate nucleotide substitutions in the
 phylogenetic tree, and we found that there is no tree visualization software can
 do this easily. Existing tree viewers are designed for displaying phylogenetic
 tree, but not annotating it. Although some tree viewers can displaying bootstrap
 values in the tree, it is hard/impossible to display other information in the
 tree. Our first solution for displaying nucleotide substituitions in the tree is
 to add this information in the node/tip names and use traditional tree viewer to
 show it. We displayed the information in the tree successfully, but we believe
 this indirect approach is inefficient.
 
 Previously, phylogenetic trees were much smaller. Annotation of phylogenetic
 trees was not as necessary as nowadays much more data is becomming available. We
 want to associate our experimental data, for instance antigenic change, with the
 evolution relationship. Visualizing these associations in a phylogenetic tree
 can help us to identify evolution patterns. We believe we need a next generation
 tree viewer that should be programmable and extensible. It can view a
 phylogenetic tree easily as we did with classical software and support adding
 annotation data in a layer above the tree. This is the objective of developing
 the `r Biocpkg('ggtree')` [@yu_ggtree:_2017]. Common tasks of annotating a phylogenetic tree should
 be easy and complicated tasks can be possible to achieve by adding multiple
 layers of annotation.
 
 The `r Biocpkg('ggtree')` is designed by extending the `r CRANpkg('ggplot2')`
 [@wickham_ggplot2_2009] package. It is based on the grammar of graphics and
 takes all the good parts of `r CRANpkg('ggplot2')`. There are other R packages that implement
 tree viewer using `r CRANpkg('ggplot2')`, including `r CRANpkg('OutbreakTools')`,
 `r Biocpkg('phyloseq')` [@mcmurdie_phyloseq_2013]
 and [ggphylo](https://github.com/gjuggler/ggphylo); they mostly create complex
 tree view functions for their specific needs. Internally, these packages
 interpret a phylogenetic as a collection of lines, which makes it hard to
 annotate diverse user input that are related to node (taxa). The `r Biocpkg('ggtree')` is
 different to them by interpreting a tree as a collection of taxa and allowing
 general flexibilities of annotating phylogenetic tree with diverse types of user
 inputs.
 
 
 # Getting data into *R*
 
 Most of the tree viewer software (including *R* packages) focus on *Newick* and
 *Nexus* file format, while there are file formats from different evolution
 analysis software that contain supporting evidences within the file that are
 ready for annotating a phylogenetic tree. The `r Biocpkg('treeio')` package
 supports several file formats and software outputs. It brings analysis findings
 to *R* users for further analysis (*e.g.* summarization, visualization,
 comparison and test, *etc.*). It also allows external data to be mapped on the
 phylogeny. Please refer to the `r Biocpkg('treeio')` vignette
 for more details.
 
 Users can use the following command to open the vignette:
 
 ```r
 vignette("Importer", package="treeio")
 ```
25ce9608
 
3be56e90
 All the data parsed/integrated by `r Biocpkg('treeio')` package can be used to
 visualize or annotate phylogenetic tree in `r Biocpkg('ggtree')` [@yu_ggtree:_2017].
25ce9608
 
 
 # Tree Visualization and Annotation
 
68c3e3af
 Tree Visualization in `r Biocpkg('ggtree')` is easy, with one line of command
 `ggtree(tree_object)`. It supports several layouts, including *rectangular*,
 *slanted*, *circular* and *fan* for *phylogram* and *cladogram*, *equal_angle*
 and *daylight* for *unrooted* layout, time-scaled and two dimentional
 phylogenies. [Tree Visualization](treeVisualization.html) vignette describes
 these feature in details.
25ce9608
 
b681f0b4
 We implement several functions to manipulate a phylogenetic tree visually,
 including viewing selected clade to explore large tree, taxa clustering,
 rotating clade or tree, zoom out or collapsing clades *etc.*.
 
 
 
 ```{r treeman, echo=FALSE, out.extra='', message=FALSE}
 treeman <- matrix(c(
   "collapse", "collapse a selecting clade",
   "expand", "expand collapsed clade",
   "flip", "exchange position of 2 clades that share a parent node",
   "groupClade", "grouping clades",
   "groupOTU", "grouping OTUs by tracing back to most recent common ancestor",
   "identify", "interactive tree manipulation",
   "rotate", "rotating a selected clade by 180 degree",
   "rotate_tree", "rotating circular layout tree by specific angle",
   "scaleClade", "zoom in or zoom out selecting clade",
   "open_tree", "convert a tree to fan layout by specific open angle"
 ), ncol=2, byrow=TRUE)
 treeman <- as.data.frame(treeman)
 colnames(treeman) <- c("Function", "Descriptiotn")
 knitr::kable(treeman, caption = "Tree manipulation functions.", booktabs = T)
 ```
25ce9608
 
68c3e3af
 
25ce9608
 Details and examples can be found in [Tree Manipulation](treeManipulation.html) vignette.
 
 
68c3e3af
 Most of the phylogenetic trees are scaled by evolutionary distance
 (substitution/site), in `r Biocpkg('ggtree')` a phylogenetic tree can be
 re-scaled by any numerical variable inferred by evolutionary analysis (e.g.
 species divergence time, *d~N~/d~S~*, _etc_). Numerical and category variable can be
 used to color a phylogenetic tree.
 
 The `r Biocpkg('ggtree')` package provides several layers to annotate a
 phylogenetic tree. These layers are building blocks that can be freely combined
 together to create complex tree visualization.
 
 ```{r geoms, echo=FALSE, message=FALSE}
 geoms <- matrix(c(
   "geom_balance", "highlights the two direct descendant clades of an internal node",
   "geom_cladelabel", "annotate a clade with bar and text label",
   "geom_cladelabel2", "annotate a clade with bar and text label for unrooted layout",
   "geom_hilight", "highlight a clade with rectangle",
   "geom_hilight_encircle", "highlight a clade with xspline for unrooted layout",
   "geom_label2", "modified version of geom_label, with subsetting supported",
   "geom_nodelab", "layer for node labels, which can be text or image",
   "geom_nodepoint", "annotate internal nodes with symbolic points",
   "geom_point2", "modified version of geom_point, with subsetting supported",
   "geom_range", "bar layer to present uncertainty of evolutionary inference",
   "geom_rootpoint", "annotate root node with symbolic point",
   "geom_segment2", "modified version of geom_segment, with subsetting supported",
   "geom_strip", "annotate associated taxa with bar and (optional) text label",
   "geom_taxalink", "associate two related taxa by linking them with a curve",
   "geom_text2", "modified version of geom_text, with subsetting supported",
   "geom_tiplab", "layer of tip labels, which can be text or image",
   "geom_tiplab2", "layer of tip labels for circular layout",
   "geom_tippoint", "annotate external nodes with symbolic points",
   "geom_tree", "tree structure layer, with multiple layout supported",
   "geom_treescale", "tree branch scale legend"
 ), ncol=2, byrow=TRUE)
 geoms <- as.data.frame(geoms)
 colnames(geoms) <- c("Layer", "Description")
 knitr::kable(geoms, caption = "Geom layers defined in ggtree.", booktabs = T)
 ```
 
 `r Biocpkg('ggtree')` supports creating phylomoji using Emoji fonts, please
8dab5192
 refer to the
 [Phylomoji](https://guangchuangyu.github.io/software/ggtree/vignettes/phylomoji.html) vignette.
25ce9608
 
0f93c46a
 
68c3e3af
 `r Biocpkg('ggtree')` integrates [phylopic](http://phylopic.org/) database and silhouette images of organisms can
 be downloaded and used to annotate phylogenetic directly. `r Biocpkg('ggtree')` also supports
 using local or remote images to annotate a phylogenetic tree. For details,
 please refer to the `r CRANpkg('ggimage')` package vignette, which can be opened
 via the following command:
0f93c46a
 
25ce9608
 
68c3e3af
 ```r
 vignette("ggtree", package="ggimage")
 ```
25ce9608
 
 
68c3e3af
 Visualizing an annotated phylogenetic tree with numerical matrix (e.g. genotype
 table), multiple sequence alignment and subplots are also supported in `ggtree`.
 Examples of annotating phylogenetic trees can be found in
 the [Tree Annotation](treeAnnotation.html) vignette.
25ce9608
 
 
 # Vignette Entry
 
3be56e90
 + [Tree Data Import](https://bioconductor.org/packages/devel/bioc/vignettes/treeio/inst/doc/Importer.html)
25ce9608
 + [Tree Visualization](treeVisualization.html)
 + [Tree Manipulation](treeManipulation.html)
 + [Tree Annotation](treeAnnotation.html)
8dab5192
 + [Phylomoji](https://guangchuangyu.github.io/software/ggtree/vignettes/phylomoji.html)
7918c8b0
 + [Annotating phylogenetic tree with images](https://guangchuangyu.github.io/software/ggtree/vignettes/ggtree-ggimage.html)
 + [Annotate a phylogenetic tree with insets](https://guangchuangyu.github.io/software/ggtree/vignettes/ggtree-inset.html)
3be56e90
 
25ce9608
 
2ea8a3b5
 **ggtree homepage**: <https://guangchuangyu.github.io/software/ggtree> (contains more
2794a0e5
 information about the package, more documentation, a gallery of beautiful
 published images and links to related resources).
0ae9e98a
 
c69c6bd7
 -->
 
25ce9608