...
|
...
|
@@ -444,16 +444,16 @@ you should assume the latter to be correct.
|
444
|
444
|
|
445
|
445
|
\subsection{GenomeAxisTrack}
|
446
|
446
|
\Rclass{GenomeAxisTrack} objects can be used to add some reference to
|
447
|
|
-the currently displayed genomic location to a \mgg plot. In its most
|
448
|
|
-basic form it is really just a horizontal axis with genomic coordinate
|
449
|
|
-tickmarks. Using the \Rfunction{GenomeAxisTrack} constructor function
|
450
|
|
-is the recommended way to instantiate objects from the class. There is
|
451
|
|
-no need to know in advance about a particular genomic location when
|
452
|
|
-constructing the object. Instead, the displayed coordinates will be
|
453
|
|
-determined from the context, e.g., from the \Rfunarg{from} and
|
454
|
|
-\Rfunarg{to} arguments of the \Rfunction{plotTracks} function, or,
|
455
|
|
-when plotted together with other track objects, from their genomic
|
456
|
|
-locations.
|
|
447
|
+the currently displayed genomic location to a \mgg plot. In their most
|
|
448
|
+basic form they are really just a horizontal axis with genomic
|
|
449
|
+coordinate tickmarks. Using the \Rfunction{GenomeAxisTrack}
|
|
450
|
+constructor function is the recommended way to instantiate objects
|
|
451
|
+from the class. There is no need to know in advance about a particular
|
|
452
|
+genomic location when constructing the object. Instead, the displayed
|
|
453
|
+coordinates will be determined from the context, e.g., from the
|
|
454
|
+\Rfunarg{from} and \Rfunarg{to} arguments of the
|
|
455
|
+\Rfunction{plotTracks} function, or, when plotted together with other
|
|
456
|
+track objects, from their genomic locations.
|
457
|
457
|
|
458
|
458
|
<<GenomeAxisTrackClass1, fig=TRUE, width=7.5, height=0.5>>=
|
459
|
459
|
axisTrack <- GenomeAxisTrack()
|
...
|
...
|
@@ -464,7 +464,7 @@ As an optional feature one can highlight particular regions
|
464
|
464
|
on the axis, for instance to indicate stretches of N nucleotides or
|
465
|
465
|
gaps in genomic alignments. Such regions have to be supplied to the
|
466
|
466
|
optional \Rfunarg{range} argument of the constructor function as
|
467
|
|
-either an \Rclass{IRanges} or an \Rclass{IRanges} object.
|
|
467
|
+either a \Rclass{GRanges} or an \Rclass{IRanges} object.
|
468
|
468
|
|
469
|
469
|
<<GenomeAxisTrackClass2, fig=TRUE, width=7.5, height=0.5>>=
|
470
|
470
|
axisTrack <- GenomeAxisTrack(range=IRanges(start=c(2e6, 4e6), end=c(3e6, 7e6)))
|
...
|
...
|
@@ -512,20 +512,20 @@ plotTracks(axisTrack, from=1e6, to=9e6, labelPos="below")
|
512
|
512
|
Sometimes a full-blown axis is just too much, and all we really need
|
513
|
513
|
in the plot is a small scale indicator. We can change the appearance
|
514
|
514
|
of the \Rclass{GenomeAxisTrack} object to such a limited
|
515
|
|
-representation by setting the \Rfunarg{scale} parameter. Typically,
|
516
|
|
-this will be a numeric value between 0 and 1, which is interpreted as
|
517
|
|
-the fraction of the plotting region used for the scale. The plotting
|
518
|
|
-method will apply some rounding to come up with reasonable and
|
519
|
|
-human-readable values. For even more control we can pass in a value
|
520
|
|
-larger than 1, which is considered to be an absolute range length. In
|
521
|
|
-this case, the user is responsible for the scale to actually fit in
|
522
|
|
-the current plotting range.
|
|
515
|
+representation by setting the \Rfunarg{scale} display
|
|
516
|
+parameter. Typically, this will be a numeric value between 0 and 1,
|
|
517
|
+which is interpreted as the fraction of the plotting region used for
|
|
518
|
+the scale. The plotting method will apply some rounding to come up
|
|
519
|
+with reasonable and human-readable values. For even more control we
|
|
520
|
+can pass in a value larger than 1, which is considered to be an
|
|
521
|
+absolute range length. In this case, the user is responsible for the
|
|
522
|
+scale to actually fit in the current plotting range.
|
523
|
523
|
|
524
|
524
|
<<GenomeAxisTrackClass7, fig=TRUE, width=7.5, height=0.5>>=
|
525
|
525
|
plotTracks(axisTrack, from=1e6, to=9e6, scale=0.5)
|
526
|
526
|
@
|
527
|
527
|
|
528
|
|
-We still have control over the placement of the label via the
|
|
528
|
+We still have control over the placement of the scale label via the
|
529
|
529
|
\Rfunarg{labelPos} parameter, which now takes the values
|
530
|
530
|
\code{above}, \code{below} and \code{beside}.
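
The two ways of specifying \Rfunarg{scale} can be sketched in plain R. The helper \code{scaleLength} below is hypothetical and not part of the package. It only mirrors the rule described above, assuming that values up to 1 are read as fractions of the plotted range and larger values as absolute lengths, and it does not reproduce the rounding that the plotting method applies.

```r
# Hypothetical helper (not part of the package): translate a 'scale' value
# into the genomic length covered by the scale indicator.
scaleLength <- function(scale, from, to) {
    stopifnot(is.numeric(scale), length(scale) == 1, scale > 0, to > from)
    width <- to - from
    # Assumption: values up to 1 are a fraction of the plotted range,
    # larger values are an absolute length in base pairs.
    len <- if (scale <= 1) scale * width else scale
    if (len > width)
        warning("scale is longer than the current plotting range")
    len
}

scaleLength(0.5, from=1e6, to=9e6)   # fraction of the range
scaleLength(2e6, from=1e6, to=9e6)   # absolute length
```

The warning branch corresponds to the case in which the user has to make sure that an absolute scale actually fits into the current plotting range.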
|
531
|
531
|
|
...
|
...
|
@@ -544,18 +544,20 @@ addParTable("GenomeAxisTrack")
|
544
|
544
|
\subsection{IdeogramTrack}
|
545
|
545
|
While a genomic axis provides helpful points of reference to a plot,
|
546
|
546
|
it is sometimes important to show the currently displayed region in
|
547
|
|
-the broader context of a chromosme. Are we looking at distal regions,
|
548
|
|
-or somewhere close to the centromer? And how much of the complete
|
549
|
|
-chromosome is covered in our plot. To that end the \mgg package
|
550
|
|
-defines the \Rclass{IdeogramTrack} class, which is an idealized
|
551
|
|
-representation of a single chromosome. When plotted, these track
|
552
|
|
-objects will always show the whole chromosome, regardless of the
|
553
|
|
-selected genomic region. However, this selection is indicated by a
|
554
|
|
-box. The chromosomal data necessary to draw the ideogram is not part
|
555
|
|
-of the \mgg package itself, but it is rather downloaded from an online
|
556
|
|
-source (UCSC). Thus it is important to use both chromosome and genome
|
557
|
|
-names that are recognizable in the UCSC data base. You might want to
|
558
|
|
-consult their webpage (\url{http://genome.ucsc.edu/}) or use the
|
|
547
|
+the broader context of the whole chromosome. Are we looking at distal
|
|
548
|
+regions, or somewhere close to the centromere? And how much of the
|
|
549
|
+complete chromosome is covered in our plot? To that end, the \mgg
|
|
550
|
+package defines the \Rclass{IdeogramTrack} class, which is an
|
|
551
|
+idealized representation of a single chromosome. When plotted, these
|
|
552
|
+track objects will always show the whole chromosome, regardless of the
|
|
553
|
+selected genomic region. However, the displayed coordinates are
|
|
554
|
+indicated by a box that sits on the ideogram image. The chromosomal
|
|
555
|
+data necessary to draw the ideogram is not part of the \mgg package
|
|
556
|
+itself; instead it is downloaded from an online source (UCSC). Thus it
|
|
557
|
+is important to use both chromosome and genome names that are
|
|
558
|
+recognizable in the UCSC data base when dealing with
|
|
559
|
+\Rclass{IdeogramTrack} objects. You might want to consult the UCSC
|
|
560
|
+webpage (\url{http://genome.ucsc.edu/}) or use the
|
559
|
561
|
\Rfunction{ucscGenomes} function in the \Rpackage{rtracklayer} package
|
560
|
562
|
for a listing of available genomes.
|
561
|
563
|
|
...
|
...
|
@@ -581,6 +583,13 @@ if(hasUcscConnection){
|
581
|
583
|
plotTracks(ideoTrack, from=85e6, to=129e6)
|
582
|
584
|
@
|
583
|
585
|
|
|
586
|
+We can turn off the explicit plotting of the chromosome name by
|
|
587
|
+setting the \Rfunarg{showId} display parameter to \code{FALSE}.
|
|
588
|
+
|
|
589
|
+<<IdeogramTrackClass2, fig=TRUE, width=7.5, height=0.5>>=
|
|
590
|
+plotTracks(ideoTrack, from=85e6, to=129e6, showId=FALSE)
|
|
591
|
+@
|
|
592
|
+
|
584
|
593
|
\subsubsection*{Display parameters for IdeogramTrack objects}
|
585
|
594
|
|
586
|
595
|
For a complete listing of all the available display parameters please
|
...
|
...
|
@@ -606,16 +615,17 @@ group information. Thus the starting point for creating
|
606
|
615
|
the form of an \Rclass{IRanges} or \Rclass{GRanges} object, or
|
607
|
616
|
individually as start and end coordinates or widths. The second
|
608
|
617
|
ingredient is a numeric vector of the same length as the number of
|
609
|
|
-ranges, or a numeric matrix with the same number of columns. We can
|
610
|
|
-pass this information, along with the genome and the chromosome
|
611
|
|
-identifiers, to the \Rfunction{DataTrack} constructor function to
|
|
618
|
+ranges, or a numeric matrix with the same number of columns. Those may
|
|
619
|
+even already be part of the input \Rclass{GRanges} object as
|
|
620
|
+\code{elementMetadata} values. For a complete description of all the
|
|
621
|
+possible inputs please see the class' online documentation. We can
|
|
622
|
+pass all this information to the \Rfunction{DataTrack} constructor function to
|
612
|
623
|
instantiate an object. We will load our sample data from an
|
613
|
624
|
\Rclass{GRanges} object that comes as part of the \mgg package.
|
614
|
625
|
|
615
|
626
|
<<DataClass1, fig=TRUE, width=7.5, height=1.5>>=
|
616
|
627
|
data(twoGroups)
|
617
|
|
-dTrack <- DataTrack(twoGroups, data=t(as.data.frame(elementMetadata(twoGroups))), genome="hg19",
|
618
|
|
- chromosome="chrX", name="uniform")
|
|
628
|
+dTrack <- DataTrack(twoGroups, name="uniform")
|
619
|
629
|
plotTracks(dTrack)
|
620
|
630
|
@
|
621
|
631
|
|
...
|
...
|
@@ -656,7 +666,7 @@ names(dTrack) <- "uniform"
|
656
|
666
|
@
|
657
|
667
|
|
658
|
668
|
You will notice that some of the plot types work better for univariate
|
659
|
|
-data while others are clearly designed for multivariate data. The
|
|
669
|
+data while others are clearly designed for multivariate inputs. The
|
660
|
670
|
\Rfunarg{a} type for instance averages the values at each genomic
|
661
|
671
|
location before plotting the derived values as a line. The decision
|
662
|
672
|
for a particular plot type is totally up to the user, and one could
|
...
|
...
|
@@ -695,8 +705,8 @@ parameters that control the layout of both grouped and of ungrouped
|
695
|
705
|
\Rclass{DataTracks}. You may want to check the class' help page for
|
696
|
706
|
details.
|
697
|
707
|
|
698
|
|
-<<typeGroupedPlots, fig=TRUE, width=7.5, height=4.2, echo=FALSE, results=hide>>=
|
699
|
|
-pushViewport(viewport(layout=grid.layout(nrow=6, ncol=1)))
|
|
708
|
+<<typeGroupedPlots, fig=TRUE, width=7.5, height=4.9, echo=FALSE, results=hide>>=
|
|
709
|
+pushViewport(viewport(layout=grid.layout(nrow=7, ncol=1)))
|
700
|
710
|
i <- 1
|
701
|
711
|
for(t in c("a", "s", "smooth", "histogram", "boxplot", "heatmap"))
|
702
|
712
|
{
|
...
|
...
|
@@ -706,7 +716,10 @@ for(t in c("a", "s", "smooth", "histogram", "boxplot", "heatmap"))
|
706
|
716
|
i <- i+1
|
707
|
717
|
popViewport(1)
|
708
|
718
|
}
|
709
|
|
-popViewport(1)
|
|
719
|
+pushViewport(viewport(layout.pos.col=((7-1)%%1)+1, layout.pos.row=((7-1)%/%1)+1))
|
|
720
|
+names(dTrack) <- "hor. hist."
|
|
721
|
+plotTracks(dTrack, type="histogram", stackedBars=FALSE, add=TRUE, cex.title=0.8, groups=rep(1:2, each=3), margin=0.5)
|
|
722
|
+popViewport(2)
|
710
|
723
|
names(dTrack) <- "uniform"
|
711
|
724
|
@
|
712
|
725
|
|
...
|
...
|
@@ -732,21 +745,22 @@ plotTracks(dTrack.big, type="hist")
|
732
|
745
|
Since the available resolution on our screen is limited we can no
|
733
|
746
|
longer distinguish between individual coordinate ranges. The \mgg
|
734
|
747
|
package tries to avoid overplotting by collapsing overlapping ranges
|
735
|
|
-(assuming the \Rfunarg{collapseTracks} is set to
|
|
748
|
+(assuming the \Rfunarg{collapseTracks} parameter is set to
|
736
|
749
|
\code{TRUE}). However, it is often desirable to summarize the data,
|
737
|
|
-for instance by binning values into a fixed number of windows and
|
738
|
|
-subsequent calculation of a summary statistic. This can be archived by
|
739
|
|
-a combination of the \Rfunarg{window} and \Rfunarg{aggregation}
|
740
|
|
-display parameters. The former can be an integer value greater than
|
741
|
|
-zero giving the number of evenly-sized bins to aggregate the data
|
742
|
|
-in. The latter is supposed to be a user-supplied function that accepts
|
743
|
|
-a numeric vector as a single input parameter and returns a single
|
744
|
|
-aggregated numerical value. For simplicity, the most obvious
|
745
|
|
-aggregation functions can be selected by passing in a character scalar
|
746
|
|
-rather than a function. Possible values are \code{mean},
|
747
|
|
-\code{median}, \code{extreme}, \code{sum}, \code{min} and
|
748
|
|
-\code{max}. The default is to compute the mean value of all the binned
|
749
|
|
-data points.
|
|
750
|
+for instance by binning values into a fixed number of windows followed
|
|
751
|
+by the calculation of a meaningful summary statistic. This can be
|
|
752
|
+achieved by a combination of the \Rfunarg{window} and
|
|
753
|
+\Rfunarg{aggregation} display parameters. The former can be an integer
|
|
754
|
+value greater than zero giving the number of evenly-sized bins to
|
|
755
|
+aggregate the data in. The latter is supposed to be a user-supplied
|
|
756
|
+function that accepts a numeric vector as a single input parameter and
|
|
757
|
+returns a single aggregated numerical value. For simplicity, the most
|
|
758
|
+obvious aggregation functions can be selected by passing in a
|
|
759
|
+character scalar rather than a function. Possible values are
|
|
760
|
+\code{mean}, \code{median}, \code{extreme}, \code{sum}, \code{min} and
|
|
761
|
+\code{max}. These presets are also much faster because they have been
|
|
762
|
+optimized to operate on large numeric matrices. The default is to
|
|
763
|
+compute the mean value of all the binned data points.
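
The binning idea can be illustrated in base R without any track objects. This is only a sketch of the concept and not the package's internal code; the function name \code{binAggregate} and the toy data are assumptions made for the example.

```r
# Base R sketch of window-based aggregation (not the package's internals).
pos <- seq(1, 100000, by=20)   # toy genomic positions
val <- sin(pos / 1000)         # one toy value per position

binAggregate <- function(pos, val, window=50, aggregation=mean) {
    # Split the covered range into 'window' evenly-sized bins ...
    breaks <- seq(min(pos), max(pos), length.out=window + 1)
    bin <- cut(pos, breaks=breaks, include.lowest=TRUE, labels=FALSE)
    # ... and reduce the values in each bin to a single number.
    tapply(val, factor(bin, levels=seq_len(window)), aggregation)
}

binned <- binAggregate(pos, val, window=50)
length(binned)                                   # one value per window
binnedMax <- binAggregate(pos, val, window=10, aggregation=max)
```

Any function that maps a numeric vector to a single number can take the place of \code{mean} here, which is the same contract that the \Rfunarg{aggregation} parameter expects.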
|
750
|
764
|
|
751
|
765
|
<<aggregation, fig=TRUE, results=hide, width=7.5, height=1.5>>=
|
752
|
766
|
plotTracks(dTrack.big, type="hist", window=50)
|
...
|
...
|
@@ -770,7 +784,7 @@ plotTracks(dTrack.big, type="hist", window=-1, windowSize=2500)
|
770
|
784
|
In addition to transforming the data on the x-axis we can also apply
|
771
|
785
|
arbitrary transformation functions on the y-axis. One obvious use-case
|
772
|
786
|
would be to log-transform the data prior to plotting. The framework is
|
773
|
|
-flexible enough however to allow arbitrary transformation
|
|
787
|
+flexible enough however to allow for arbitrary transformation
|
774
|
788
|
operations. The mechanism works by providing a function as the
|
775
|
789
|
\Rfunarg{transformation} display parameter, which takes as input a
|
776
|
790
|
numeric vector and returns a transformed numeric vector of the same
|
...
|
...
|
@@ -781,6 +795,25 @@ values greater than zero.
|
781
|
795
|
plotTracks(dTrack.big, type="l", transformation=function(x){x[x<0] <- 0; x})
|
782
|
796
|
@
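
Before passing a function as \Rfunarg{transformation} it can be worth checking that it honours the contract described above. The following base R sketch does that; the helper \code{isValidTransformation} and the function \code{log2Shift} are hypothetical names introduced for this example only.

```r
# Candidate transformation functions: each takes a numeric vector and
# should return a numeric vector of the same length.
clipNegative <- function(x) { x[x < 0] <- 0; x }   # as in the chunk above
log2Shift    <- function(x) log2(x - min(x) + 1)   # shift to >= 1, then log

# Small check that a function honours the length contract.
isValidTransformation <- function(fun, x=c(-2, 0, 3.5, 10)) {
    y <- fun(x)
    is.numeric(y) && length(y) == length(x)
}

isValidTransformation(clipNegative)
isValidTransformation(log2Shift)
isValidTransformation(range)   # FALSE: always returns two values
```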
|
783
|
797
|
|
|
798
|
+As seen before, the \Rfunarg{a} type allows us to plot average values for
|
|
799
|
+each of the separate groups. There is however an additional parameter
|
|
800
|
+\Rfunarg{aggregateGroups} that generalizes group value
|
|
801
|
+aggregations. In the following example we display, for each group and
|
|
802
|
+at each position, the average values in the form of a dot-and-lines
|
|
803
|
+plot.
|
|
804
|
+
|
|
805
|
+<<groupingAv1, fig=TRUE, results=hide, width=7.5, height=1.5>>=
|
|
806
|
+plotTracks(dTrack, groups=rep(c("control", "treated"), each=3), type=c("b"), aggregateGroups=TRUE)
|
|
807
|
+@
|
|
808
|
+
|
|
809
|
+This functionality also relies on the setting of the
|
|
810
|
+\Rfunarg{aggregation} parameter, and we can easily change it to
|
|
811
|
+display the maximum group values instead.
|
|
812
|
+
|
|
813
|
+<<groupingAv2, fig=TRUE, results=hide, width=7.5, height=1.5>>=
|
|
814
|
+plotTracks(dTrack, groups=rep(c("control", "treated"), each=3), type=c("b"), aggregateGroups=TRUE, aggregation="max")
|
|
815
|
+@
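
What group aggregation computes can be sketched on a small matrix in base R. This is an illustration under the assumption that rows are samples and columns are genomic positions; the helper \code{aggregateByGroup} is hypothetical and is not how the package implements the feature.

```r
# Base R sketch of per-group aggregation (not the package's internals):
# rows are samples, columns are genomic positions.
dat <- matrix(c(1, 2, 3, 10, 20, 30,
                4, 5, 6, 40, 50, 60), nrow=6,
              dimnames=list(NULL, c("pos1", "pos2")))
groups <- rep(c("control", "treated"), each=3)

aggregateByGroup <- function(dat, groups, aggregation=mean) {
    # One row per group, one column per position.
    t(sapply(split(seq_len(nrow(dat)), groups),
             function(i) apply(dat[i, , drop=FALSE], 2, aggregation)))
}

aggregateByGroup(dat, groups)                   # group means
aggregateByGroup(dat, groups, aggregation=max)  # group maxima
```

Swapping \code{mean} for \code{max} in the sketch mirrors the switch from the default to \code{aggregation="max"} in the two chunks above.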
|
|
816
|
+
|
784
|
817
|
\subsubsection*{Display parameters for DataTrack objects}
|
785
|
818
|
|
786
|
819
|
For a complete listing of all the available display parameters please
|
...
|
...
|
@@ -811,8 +844,11 @@ we try to be flexible in the way this information can be passed to the
|
811
|
844
|
function, either in the form of separate function arguments, as
|
812
|
845
|
\Rclass{IRanges} or \Rclass{GRanges} objects. Optionally, we can pass
|
813
|
846
|
in the strand information for the annotation features and some useful
|
814
|
|
-identifiers. For the full details on the constructor function and the
|
815
|
|
-accepted arguments see \code{?AnnotationTrack}.
|
|
847
|
+identifiers. A somewhat special case is to build the object from a
|
|
848
|
+\Rclass{GRangesList} object, which will automatically preserve the
|
|
849
|
+element grouping information contained in the list structure. For the
|
|
850
|
+full details on the constructor function and the accepted arguments
|
|
851
|
+see \code{?AnnotationTrack}.
|
816
|
852
|
|
817
|
853
|
<<anntrack1, fig=TRUE, results=hide, width=7.5, height=0.5>>=
|
818
|
854
|
aTrack <- AnnotationTrack(start=c(10, 40, 120), width=15, chromosome="chrX",
|
...
|
...
|
@@ -858,10 +894,11 @@ plotTracks(aTrack.groups, showId=TRUE)
|
858
|
894
|
|
859
|
895
|
Arranging items on the plotting canvas is relatively straight forward
|
860
|
896
|
as long as there are no overlaps between individual regions or groups
|
861
|
|
-of regions. A logical solution to this problem is to stack overlapping
|
862
|
|
-items in separate horizontal lines, thus extending the height of the
|
863
|
|
-track to accomodate all of them. This involves some optimization, and
|
864
|
|
-the \mgg package automatically tries to come up with the most compact
|
|
897
|
+of regions. Those inevitably cause overplotting which could seriously
|
|
898
|
+obfuscate the information on the plot. A logical solution to this
|
|
899
|
+problem is to stack overlapping items in separate horizontal lines to
|
|
900
|
+accommodate all of them. This involves some optimization, and the \mgg
|
|
901
|
+package automatically tries to come up with the most compact
|
865
|
902
|
arrangement. Let's exemplify this feature with a slightly modified
|
866
|
903
|
\Rclass{AnnotationTrack} object.
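
To give an idea of what stacking involves, here is a simple greedy scheme in base R that places each range on the first horizontal line that still has room. It is meant as an illustration only; the function \code{stackRanges} is hypothetical, and the package's own optimization may well work differently.

```r
# Greedy interval stacking in base R (an illustration, not the
# package's own algorithm).
stackRanges <- function(start, end) {
    ord <- order(start, end)
    level <- integer(length(start))
    lastEnd <- numeric(0)            # right-most end on each line so far
    for (i in ord) {
        free <- which(lastEnd < start[i])
        if (length(free) > 0) {
            level[i] <- free[1]                # reuse the first line with room
        } else {
            level[i] <- length(lastEnd) + 1L   # open a new line
        }
        lastEnd[level[i]] <- end[i]
    }
    level
}

stackRanges(start=c(10, 20, 40, 45, 120), end=c(30, 35, 50, 60, 135))
```

The returned vector gives the stacking line for each range in input order, so overlapping ranges end up with different values.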
|
867
|
904
|
|
...
|
...
|
@@ -903,14 +940,14 @@ feature(aTrack.stacked)[1:4] <- c("foo", "bar", "bar", "bar")
|
903
|
940
|
@
|
904
|
941
|
|
905
|
942
|
Unless we tell the \mgg package how to deal with the respective
|
906
|
|
-feature types they will all be treated similar, i.e., they will be
|
907
|
|
-plotted using the default color as defined by the \Rfunarg{fill}
|
908
|
|
-display paramter. To define colors for individual feature types we
|
909
|
|
-simply have to add them as additional display parameters, where the
|
910
|
|
-parameter name matches to the feature type and its value is supposed
|
911
|
|
-to be a valid R color qualifier. Of course this implies that we can
|
912
|
|
-only use type names that are not already taken by other display
|
913
|
|
-parameters defined in the package.
|
|
943
|
+feature types they will all be treated in a similar fashion, i.e.,
|
|
944
|
+they will be plotted using the default color as defined by the
|
|
945
|
+\Rfunarg{fill} display parameter. To define colors for individual
|
|
946
|
+feature types we simply have to add them as additional display
|
|
947
|
+parameters, where the parameter name matches the feature type and
|
|
948
|
+its value is supposed to be a valid R color qualifier. Of course this
|
|
949
|
+implies that we can only use type names that are not already taken by
|
|
950
|
+other display parameters defined in the package.
|
914
|
951
|
|
915
|
952
|
|
916
|
953
|
<<featuresPlot, fig=TRUE, results=hide, width=7.5, height=0.5>>=
|
...
|
...
|
@@ -986,9 +1023,11 @@ plotTracks(ctrack, extend.left=1800)
|
986
|
1023
|
The first thing to notice is that for item \code{d} we do see the
|
987
|
1024
|
item identifier but not the range itself. This is due to the fact that
|
988
|
1025
|
the width of the item is smaller than a single pixel, and hence the
|
989
|
|
-graphics system can not display it. There are also the two items
|
990
|
|
-\code{e} and \code{f} which seem to overlay each other completely, and
|
991
|
|
-another two items which appear to be one joined item (\code{k} and
|
|
1026
|
+graphics system cannot display it (note that this is only true for
|
|
1027
|
+certain devices; the quartz device on the Mac seems to be a little
|
|
1028
|
+smarter about this). There are also the two items \code{e} and
|
|
1029
|
+\code{f} which seem to overlay each other completely, and another two
|
|
1030
|
+items which appear to be one joined item (\code{k} and
|
992
|
1031
|
\code{l}). Again, this is a resolution issue as their relative
|
993
|
1032
|
distance is smaller than a single pixel, so all we see is a single
|
994
|
1033
|
range and some ugly overplotted identifiers. We can control the first
|
...
|
...
|
@@ -1013,7 +1052,7 @@ of the identifiers for you, too. The merging operation is aware of the
|
1013
|
1052
|
grouping information, so no two groups were joined together. Sometimes
|
1014
|
1053
|
a single pixel width or a single pixel distance is not enough to get a
|
1015
|
1054
|
good visualization. In these cases one could decide to enforce even
|
1016
|
|
-larger vakues. We can do this not only for the minimum width, but
|
|
1055
|
+larger values. We can do this not only for the minimum width, but
|
1017
|
1056
|
also for the minimum distance by setting the \code{min.distance}
|
1018
|
1057
|
parameter.
|
1019
|
1058
|
|
...
|
...
|
@@ -1053,21 +1092,25 @@ addParTable("AnnotationTrack")
|
1053
|
1092
|
\Rclass{GeneRegionTrack} objects are in principle very similar to
|
1054
|
1093
|
\Rclass{AnnotationTrack} objects. The only difference is that they are
|
1055
|
1094
|
a little more gene/transcript centric, both in terms of plotting
|
1056
|
|
-layout and user interaction, and that they define a global start and
|
1057
|
|
-end position. The constructor function of the same same is a
|
|
1095
|
+layout and user interaction, and that they may define a global start
|
|
1096
|
+and end position (a feature which is not particularly relevant for the
|
|
1097
|
+normal user). The constructor function of the same name is a
|
1058
|
1098
|
convenient tool to instantiate the object from a variety of different
|
1059
|
|
-sources. In a nutshell we need to pass start and end positions (or the
|
1060
|
|
-width) of each annotation feature in the track and also supply the
|
|
1099
|
+sources. In a nutshell, we need to pass start and end positions (or
|
|
1100
|
+the width) of each annotation feature in the track and also supply the
|
1061
|
1101
|
exon, transcript and gene identifiers for each item which will be used
|
1062
|
|
-to create the transcript groupings. For more details about the
|
1063
|
|
-available options see the class's manual page
|
1064
|
|
-(\code{?GeneRegionTrack}). There are a number of accessor methods that
|
1065
|
|
-make it easy to query and replace for instance exon, transcript or
|
1066
|
|
-gene assignments. There is also some support for gene aliases or gene
|
1067
|
|
-symbols which are often times more useful than cryptic data base gene
|
1068
|
|
-identifiers. The following code that re-uses the
|
1069
|
|
-\Rclass{GeneRegionTrack} object from the first section exemplifies
|
1070
|
|
-some of these features.
|
|
1102
|
+to create the transcript groupings. A somewhat special case is to
|
|
1103
|
+build a \Rclass{GeneRegionTrack} object directly from one of the
|
|
1104
|
+popular \Rclass{TranscriptDb} objects, an option that is treated in
|
|
1105
|
+more detail below. For more information about the available options
|
|
1106
|
+see the class's manual page (\code{?GeneRegionTrack}).
|
|
1107
|
+
|
|
1108
|
+There are a number of accessor methods that make it easy to query and
|
|
1109
|
+replace for instance exon, transcript or gene assignments. There is
|
|
1110
|
+also some support for gene aliases or gene symbols which are often
|
|
1111
|
+times more useful than cryptic data base gene identifiers. The
|
|
1112
|
+following code that re-uses the \Rclass{GeneRegionTrack} object from
|
|
1113
|
+the first section exemplifies some of these features.
|
1071
|
1114
|
|
1072
|
1115
|
|
1073
|
1116
|
<<generegtrack, fig=TRUE, results=hide, width=7.5, height=1.5>>=
|
...
|
...
|
@@ -1077,13 +1120,54 @@ head(gene(grtrack))
|
1077
|
1120
|
head(transcript(grtrack))
|
1078
|
1121
|
head(exon(grtrack))
|
1079
|
1122
|
head(symbol(grtrack))
|
1080
|
|
-plotTracks(grtrack, showId=TRUE)
|
|
1123
|
+plotTracks(grtrack, extend.left=20000, showId=TRUE)
|
1081
|
1124
|
@
|
1082
|
1125
|
|
1083
|
1126
|
<<generegtrack2, fig=TRUE, results=hide, width=7.5, height=2.5>>=
|
1084
|
|
-plotTracks(grtrack, showId=TRUE, geneSymbols=FALSE)
|
|
1127
|
+plotTracks(grtrack, extend.left=20000, showId=TRUE, geneSymbols=FALSE)
|
|
1128
|
+@
|
|
1129
|
+
|
|
1130
|
+Since we have the gene level information as part of our
|
|
1131
|
+\Rclass{GeneRegionTrack} objects we can ask the package to collapse
|
|
1132
|
+all of our gene models from individual exons and transcripts down to
|
|
1133
|
+gene body locations by setting the \Rfunarg{collapseTranscripts}
|
|
1134
|
+display parameter to \code{TRUE}.
|
|
1135
|
+
|
|
1136
|
+<<generegtrack3, fig=TRUE, results=hide, width=7.5, height=1>>=
|
|
1137
|
+plotTracks(grtrack, collapseTranscripts=TRUE, shape="arrow", showId=TRUE, extend.left=20000)
|
1085
|
1138
|
@
|
1086
|
1139
|
|
|
1140
|
+\subsubsection*{Building GeneRegionTrack objects from TranscriptDbs}
|
|
1141
|
+
|
|
1142
|
+The \Rpackage{GenomicFeatures} package provides an elegant framework
|
|
1143
|
+to download gene model information from online sources and to store it
|
|
1144
|
+locally in a SQLite data base. Because these so-called
|
|
1145
|
+\Rclass{TranscriptDb} objects have become the de-facto standard for
|
|
1146
|
+genome annotation information in Bioconductor we tried to make it as
|
|
1147
|
+simple as possible to convert them into
|
|
1148
|
+\Rclass{GeneRegionTracks}. Essentially one only has to call the
|
|
1149
|
+constructor function with the \Rclass{TranscriptDb} object as a single
|
|
1150
|
+argument. We exemplify this on a small sample data set that comes with
|
|
1151
|
+the \Rpackage{GenomicFeatures} package.
|
|
1152
|
+
|
|
1153
|
+<<tdb2grt1>>=
|
|
1154
|
+library(GenomicFeatures)
|
|
1155
|
+samplefile <- system.file("extdata", "UCSC_knownGene_sample.sqlite", package="GenomicFeatures")
|
|
1156
|
+txdb <- loadDb(samplefile)
|
|
1157
|
+GeneRegionTrack(txdb)
|
|
1158
|
+@
|
|
1159
|
+
|
|
1160
|
+In this context, the constructor's \Rfunarg{chromosome},
|
|
1161
|
+\Rfunarg{start} and \Rfunarg{end} arguments take on a slightly different
|
|
1162
|
+meaning in that they can be used to subset the data that is fetched
|
|
1163
|
+from the \Rclass{TranscriptDb} object. Please note that while the
|
|
1164
|
+\Rfunarg{chromosome} alone can be supplied, providing \Rfunarg{start}
|
|
1165
|
+or \Rfunarg{end} without the chromosome information will not work.
|
|
1166
|
+
|
|
1167
|
+<<tdb2grt2>>=
|
|
1168
|
+GeneRegionTrack(txdb, chromosome="chr4", start=40000, end=60000)
|
|
1169
|
+@
|
|
1170
|
+
|
1087
|
1171
|
\subsubsection*{Display parameters for GeneRegionTrack objects}
|
1088
|
1172
|
|
1089
|
1173
|
For a complete listing of all the available display parameters please
|
...
|
...
|
@@ -1096,13 +1180,13 @@ addParTable("GeneRegionTrack")
|
1096
|
1180
|
|
1097
|
1181
|
\subsection{BiomartGeneRegionTrack}
|
1098
|
1182
|
|
1099
|
|
-It is often very useful to quickly download gene annotation
|
1100
|
|
-information from an online repositry rather than having to construct
|
1101
|
|
-it each time from scratch. To this end, the \mgg package defines the
|
1102
|
|
-\Rclass{BiomartGeneRegionTrack} class, which directly extends
|
1103
|
|
-\Rclass{GeneRegionTrack} but provides a direct interface to the
|
1104
|
|
-ENSEMBL Biomart service (yet another interface to the UCSC data base
|
1105
|
|
-content is highlighted in the next section). Rather than
|
|
1183
|
+As seen before it can be very useful to quickly download gene
|
|
1184
|
+annotation information from an online repository rather than having to
|
|
1185
|
+construct it each time from scratch. To this end, the \mgg package
|
|
1186
|
+also defines the \Rclass{BiomartGeneRegionTrack} class, which directly
|
|
1187
|
+extends \Rclass{GeneRegionTrack} but provides a direct interface to
|
|
1188
|
+the ENSEMBL Biomart service (yet another interface to the UCSC data
|
|
1189
|
+base content is highlighted in one of the next sections). Rather than
|
1106
|
1190
|
providing all the bits and pieces for the full gene model, we just
|
1107
|
1191
|
enter a genome, chromosome and a start and end position on this
|
1108
|
1192
|
chromosome, and the constructor function
|
...
|
...
|
@@ -1110,7 +1194,7 @@ chromosome, and the constructor function
|
1110
|
1194
|
fetch the necessary information and build the gene model on the
|
1111
|
1195
|
fly. Please note that you will need an internet connection for this to
|
1112
|
1196
|
work, and that contacting Biomart can take a significant amount of
|
1113
|
|
-time depending on usage and network traffic, and that the results are
|
|
1197
|
+time depending on usage and network traffic. Hence the results are
|
1114
|
1198
|
almost never going to be returned instantaneously.
|
1115
|
1199
|
|
1116
|
1200
|
<<BiomartGeneRegionTrackShow, eval=FALSE>>=
|
...
|
...
|
@@ -1272,7 +1356,7 @@ the collapsed ranges.
|
1272
|
1356
|
|
1273
|
1357
|
<<DetailsAnnotationTrack6>>=
|
1274
|
1358
|
detFun <- function(identifier, GdObject.original, ...){
|
1275
|
|
- plotTracks(list(GenomeAxisTrack(scale=0.3, labelPos="below", size=0.2, cex=0.7), GdObject.original[group(GdObject.original)==identifier]),
|
|
1359
|
+ plotTracks(list(GenomeAxisTrack(scale=0.3, size=0.2, cex=0.7), GdObject.original[group(GdObject.original)==identifier]),
|
1276
|
1360
|
add=TRUE, showTitle=FALSE)
|
1277
|
1361
|
}
|
1278
|
1362
|
@
|
...
|
...
|
@@ -1281,8 +1365,8 @@ Finally, we load some sample data, turn it into a \code{DetailsAnnotationTrack}
|
1281
|
1365
|
|
1282
|
1366
|
<<DetailsAnnotationTrack7, results=hide, width=7.5, height=2, fig=TRUE>>=
|
1283
|
1367
|
data(geneDetails)
|
1284
|
|
-deTrack2 <- AnnotationTrack(range=geneDetails, chromosome=chr, genome=gen, fun=detFun, selectFun=selFun,
|
1285
|
|
- groupDetails=TRUE, details.size=0.3, detailsConnector.cex=0.5, detailsConnector.lty="dotted",
|
|
1368
|
+deTrack2 <- AnnotationTrack(geneDetails, fun=detFun, selectFun=selFun,
|
|
1369
|
+ groupDetails=TRUE, details.size=0.5, detailsConnector.cex=0.5, detailsConnector.lty="dotted",
|
1286
|
1370
|
shape=c("smallArrow", "arrow"), showId=TRUE)
|
1287
|
1371
|
plotTracks(deTrack2, extend.left=90000)
|
1288
|
1372
|
@
|
...
|
...
|
@@ -1349,7 +1433,7 @@ to do for us.
|
1349
|
1433
|
\begin{figure}[htb]
|
1350
|
1434
|
\centering
|
1351
|
1435
|
\includegraphics{ucsc2.pdf}
|
1352
|
|
- \label{fig:UCSC1}
|
|
1436
|
+ \label{fig:UCSC2}
|
1353
|
1437
|
\caption{A screen shot of a UCSC table browser view on the UCSC Known Genes track.}
|
1354
|
1438
|
\end{figure}
|
1355
|
1439
|
|
...
|
...
|
@@ -1451,6 +1535,30 @@ plotTracks(list(idxTrack, axTrack, knownGenes, refGenes, ensGenes, cpgIslands,
|
1451
|
1535
|
@
|
1452
|
1536
|
|
1453
|
1537
|
\section{Bioconductor integration}
|
|
1538
|
+This short section is supposed to give a very brief overview of the
|
|
1539
|
+different track classes in the \mgg package and how those can be
|
|
1540
|
+constructed from the typical Bioconductor classes that deal with
|
|
1541
|
+genomic data. The list is by no means complete, and a closer look at
|
|
1542
|
+a track class' documentation should provide all the possible options.
|
|
1543
|
+
|
|
1544
|
+\begin{longtable}{ l | l | p{9.5cm} }
|
|
1545
|
+ \hline
|
|
1546
|
+ Gviz class & Bioconductor class & Method\\
|
|
1547
|
+ \hline
|
|
1548
|
+ AnnotationTrack & IRanges & Constructor + additional arguments \\
|
|
1549
|
+ & GRanges & Constructor or setAs method, additional data in elementMetadata \\
|
|
1550
|
+ & GRangesList & Constructor or setAs method \\
|
|
1551
|
+ \hline
|
|
1552
|
+  GeneRegionTrack  & IRanges & Constructor + additional arguments \\
|
|
1553
|
+ & GRanges & Constructor or setAs method, additional data in elementMetadata \\
|
|
1554
|
+ & GRangesList & Constructor or setAs method, additional data in elementMetadata \\
|
|
1555
|
+ & TranscriptDb & Constructor or setAs method \\
|
|
1556
|
+ \hline
|
|
1557
|
+ DataTrack & IRanges & Constructor + additional data matrix \\
|
|
1558
|
+ & GRanges & Constructor or setAs method, numeric data in elementMetadata \\
|
|
1559
|
+ \hline
|
|
1560
|
+\end{longtable}
|
|
1561
|
+
|
1454
|
1562
|
|
1455
|
1563
|
\section{Composite plots for multiple chromosomes}
|
1456
|
1564
|
|
...
|
...
|
@@ -1487,7 +1595,7 @@ mdTrack <- DataTrack(range=GRanges(seqnames=rep(chroms, c(10, 40, 20, 100)),
|
1487
|
1595
|
Now we also want a genome axis and an \Rclass{IdeogramTrack} object to indicate the genomic context.
|
1488
|
1596
|
|
1489
|
1597
|
<<multPlot2>>=
|
1490
|
|
-mgTrack <- GenomeAxisTrack(scale=0.5, labelPos="below")
|
|
1598
|
+mgTrack <- GenomeAxisTrack(scale=0.5, labelPos="below", exponent=3)
|
1491
|
1599
|
chromosome(itrack) <- "chr1"
|
1492
|
1600
|
@
|
1493
|
1601
|
|
...
|
...
|
@@ -1495,7 +1603,7 @@ Finaly, we build a layout in which the plots for each chromosome are
|
1495
|
1603
|
placed in a rectangular grid and repeatedly call
|
1496
|
1604
|
\Rfunction{plotTracks} for each chromosome.
|
1497
|
1605
|
|
1498
|
|
-<<multPlot3, fig=TRUE, results=hide, width=7.5, height=7.5>>=
|
|
1606
|
+<<multPlot3, fig=TRUE, results=hide, width=7.5, height=4>>=
|
1499
|
1607
|
ncols <- 2
|
1500
|
1608
|
nrows <- length(chroms)%/%ncols
|
1501
|
1609
|
grid.newpage()
|
...
|
...
|
@@ -1507,6 +1615,17 @@ for(i in seq_along(chroms)){
|
1507
|
1615
|
}
|
1508
|
1616
|
@
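
The index arithmetic behind such a rectangular layout can be checked in isolation. The sketch below uses the same integer division and modulo expressions that appear in the \code{typeGroupedPlots} chunk earlier in this vignette; the helper name \code{gridCell} is an assumption made for the example.

```r
# Map the i-th plot to a cell of a grid with 'ncols' columns,
# filling the grid row by row.
gridCell <- function(i, ncols) {
    c(row=((i - 1) %/% ncols) + 1, col=((i - 1) %% ncols) + 1)
}

ncols <- 2
t(sapply(1:4, gridCell, ncols=ncols))   # cells for four plots
```

Note that \code{length(chroms)\%/\%ncols} only yields enough rows when the number of chromosomes is a multiple of \code{ncols}; otherwise \code{ceiling(length(chroms)/ncols)} rows are needed.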
|
1509
|
1617
|
|
|
1618
|
+Maybe an even more compact version of this would be to use the lattice
|
|
1619
|
+package for building the actual trellis, with \Rfunction{plotTracks}
|
|
1620
|
+as the panel function.
|
|
1621
|
+
|
|
1622
|
+<<multPlot4, fig=TRUE, results=hide, width=7.5, height=4>>=
|
|
1623
|
+library(lattice)
|
|
1624
|
+chroms <- data.frame(chromosome=chroms)
|
|
1625
|
+xyplot(1~chromosome|chromosome, data=chroms, panel=function(x){plotTracks(list(itrack , maTrack, mdTrack, mgTrack), chromosome=x, add=TRUE, showId=FALSE)},
|
|
1626
|
+ scales=list(draw=FALSE), xlab=NULL, ylab=NULL)
|
|
1627
|
+@
|
|
1628
|
+
|
1510
|
1629
|
\clearpage
|
1511
|
1630
|
\section*{SessionInfo}
|
1512
|
1631
|
The following is the session info that generated this vignette:
|