Doc and vignette updates

Andrew McDavid authored on 21/06/2019 04:13:35
Showing 36 changed files

... ...
@@ -9,3 +9,5 @@ manuscript/
9 9
 ^_pkgdown\.yml$
10 10
 ^docs$
11 11
 ^pkgdown$
12
+.ignore
13
+README.Rmd
... ...
@@ -4,9 +4,11 @@ export(ContigCellDB)
4 4
 export(ContigCellDB_10XVDJ)
5 5
 export(canonicalize_by_chain)
6 6
 export(canonicalize_by_prevalence)
7
-export(canonicalize_by_subset)
7
+export(canonicalize_cell)
8
+export(canonicalize_cluster)
8 9
 export(cdhit)
9 10
 export(cdhit_ccdb)
11
+export(cluster_germline)
10 12
 export(cluster_permute_test)
11 13
 export(entropy)
12 14
 export(enumerate_pairing)
... ...
@@ -17,6 +19,7 @@ export(ig_chain_recode)
17 19
 export(modal_category)
18 20
 export(np)
19 21
 export(pairing_tables)
22
+export(purity)
20 23
 export(tcr_chain_recode)
21 24
 exportMethods("$")
22 25
 exportMethods("$<-")
... ...
@@ -1,6 +1,5 @@
1 1
 # To Document
2 2
 # primary_keys, primary_keys<- -- use `$` to access/replace. Table not checked for validity, but I think should be?
3
-# canonicalize contigs on cells
4 3
 # subset (subset a table, then equalize...)
5 4
 
6 5
 
... ...
@@ -69,12 +68,12 @@ ContigCellDB_10XVDJ = function(contig_tbl, contig_pk = c('barcode', 'contig_id')
69 68
     ContigCellDB(contig_tbl = contig_tbl, contig_pk = contig_pk, cell_pk = cell_pk)
70 69
 }
71 70
 
72
-#' Creat a method for ContigCellDB object to access its slots
71
+#' Access public members of ContigCellDB object
73 72
 #'
74 73
 #' @param x A ContigCellDB object
75 74
 #' @param name a slot of a ContigCellDB object (one of  `c('contig_tbl', 'cell_tbl', 'contig_pk', 'cell_pk', 'cluster_tbl', 'cluster_pk')`)
76 75
 #'
77
-#' @return Slots of ContigCellDB
76
+#' @return Slot of ContigCellDB
78 77
 #' @export
79 78
 #'
80 79
 #' @examples
... ...
@@ -89,7 +88,7 @@ setMethod("$", signature = c(x = 'ContigCellDB'), function(x, name){
89 88
     }
90 89
 })
91 90
 
92
-#' Create a function of ContigCellDB object to replace values of its slots
91
+#' Replace public members of ContigCellDB object
93 92
 #'
94 93
 #' @param x A ContigCellDB object
95 94
 #' @param name Name of a slot for a ContigCellDB object (one of  `c('contig_tbl', 'cell_tbl', 'contig_pk', 'cell_pk', 'cluster_tbl', 'cluster_pk')`)
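For readers less familiar with S4 operator methods, the `$` and `$<-` methods documented in this hunk can be sketched on a toy class. `ToyDB` and its two slots are hypothetical stand-ins, not the package's `ContigCellDB` definition, and the sketch assumes only the base `methods` package.

```r
library(methods)

# Hypothetical toy class with one table slot and one key slot
setClass("ToyDB", slots = c(contig_tbl = "data.frame", contig_pk = "character"))

# Accessor: db$name returns the slot called `name`
setMethod("$", signature = c(x = "ToyDB"), function(x, name) {
  if (name %in% slotNames(x)) {
    slot(x, name)
  } else {
    stop("Cannot access member ", name)
  }
})

# Replacement: db$name <- value overwrites the slot and re-validates the object
setMethod("$<-", signature = c(x = "ToyDB"), function(x, name, value) {
  slot(x, name) <- value
  validObject(x)
  x
})

db <- new("ToyDB",
          contig_tbl = data.frame(barcode = "cell1", contig_id = "c1"),
          contig_pk = c("barcode", "contig_id"))
db$contig_pk            # read a slot
db$contig_pk <- "barcode"  # replace a slot
```

Routing replacement through a method gives one place to run validity checks, which is the kind of check the "To Document" note above wonders about for primary keys.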
... ...
@@ -147,7 +146,7 @@ equalize_ccdb = function(x){
147 146
 
148 147
 replace_cluster_tbl = function(ccdb, cluster_tbl, contig_tbl, cluster_pk){
149 148
     if(nrow(ccdb$cluster_tbl)>0 && !missing(cluster_pk)){
150
-        warning("Replacing `cluster_tbl` with key ccdb$cluster_pk")
149
+        warning("Replacing `cluster_tbl` with ", paste(ccdb$cluster_pk, collapse = ', '), '.')
151 150
     }
152 151
     if(!missing(cluster_pk)) ccdb$cluster_pk = cluster_pk
153 152
     ccdb@cluster_tbl = cluster_tbl
154 153
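When a message is built from a character vector of key names such as `cluster_pk`, the `sep` and `collapse` arguments of `paste()` behave differently. The sketch below uses hypothetical key values and base R only: `sep` joins the separate arguments element by element, while `collapse` joins the elements of a single vector into one string.

```r
# Hypothetical key vectors for illustration
one_key  <- "DNA97"
two_keys <- c("v_gene", "j_gene")

# `sep` glues the *arguments* together, so the trailing '.' is treated as a second term
with_sep_one <- paste(one_key, sep = ", ", ".")   # "DNA97, ."
with_sep_two <- paste(two_keys, sep = ", ", ".")  # "v_gene, ." "j_gene, ."

# `collapse` glues the *elements* of one vector into a single string
collapsed_one <- paste(one_key, collapse = ", ")   # "DNA97"
collapsed_two <- paste(two_keys, collapse = ", ")  # "v_gene, j_gene"

# Appending the period afterwards gives one clean sentence fragment
msg <- paste0("Replacing `cluster_tbl` with ", collapsed_two, ".")
```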
new file mode 100644
... ...
@@ -0,0 +1,55 @@
1
+---
2
+output: github_document
3
+---
4
+
5
+<!-- README.md is generated from README.Rmd. Please edit that file -->
6
+
7
+```{r, echo = FALSE}
8
+knitr::opts_chunk$set(
9
+  collapse = TRUE,
10
+  comment = "#>",
11
+  fig.path = "man/figures/"
12
+)
13
+```
14
+
15
+# CellaRepertorium
16
+
17
+This package contains methods for clustering and analyzing single cell RepSeq data, especially as generated by [10X genomics VDJ solution](https://support.10xgenomics.com/single-cell-vdj).
18
+
19
+## Installation
20
+
21
+```
22
+devtools::install_github('amcdavid/CellaRepertorium')
23
+```
24
+
25
+Requires R >= 3.5.
26
+
27
+## Data requirements and package structure
28
+
29
+The fundamental unit this package operates on is the **contig**, which is a section of contiguously stitched reads from a single **cell**.  Each contig belongs to one (and only one) cell; however, a cell can generate multiple contigs.  
30
+
31
+```{r, echo = FALSE}
32
+knitr::include_graphics('vignettes/figure/contig_schematic.png')
33
+```
34
+
35
+Contigs can also belong to a **cluster**.  Because of these two many-to-one mappings, these data can be thought of as a series of ragged arrays.  The links between them mean they are relational data. A `ContigCellDB` object wraps each of these objects as a sequence of three `data.frames` (well, `tibbles`, actually).   `ContigCellDB` also tracks columns (the primary keys) that uniquely identify each row in each of these tables.  The `contig_tbl` is the `tibble` containing **contigs**, the `cell_tbl` contains the **cells**, and the `cluster_tbl` contains the **clusters**.  
36
+
37
+The `contig_pk`, `cell_pk` and `cluster_pk` identify the columns that identify a contig, cell and cluster, respectively.  These will serve as foreign keys that link the three tables together.
38
+The tables are kept in sync so that subsetting the contigs will subset the cells, and clusters, and vice-versa.
39
+
40
+```{r, echo = FALSE}
41
+knitr::include_graphics('vignettes/figure/table_schematic.png')
42
+```
43
+
44
+Of course, each of these tables can contain many other columns that will serve as covariates for various analyses, such as the CDR3 sequence, or the identity of the V, D and J regions.  Various derived quantities that describe cells and clusters can also be calculated, and added to these tables, such as the medoid of a cluster -- a contig that minimizes the average distance to all other contigs in that cluster.
45
+
46
+## Functions
47
+
48
+[a screencap of something interesting?]
49
+
50
+*  `cdhit_ccdb`: An R interface to CDhit, which was originally ported by Thomas Lin Pedersen.
51
+*  `fine_clustering`: clustering CDR3 by edit distances (possibly using empirical amino acid substitution matrices)
52
+*  `cluster_permute_test`: permutation tests of cluster statistics
53
+*  `pairing_tables`: Generate pairings of contigs within each cell in a way that they can be plotted
54
+
55
+
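The contig/cell relationship described in this README can be sketched with plain data frames. This is only an illustration under stated assumptions: the toy rows and the variable names `trb_contigs` and `trb_cells` are hypothetical, and the manual `merge()` stands in for the syncing that `ContigCellDB` is described as doing itself.

```r
# Hypothetical contigs: several contigs may share one cell barcode
contig_tbl <- data.frame(
  barcode   = c("cell1", "cell1", "cell2", "cell3"),
  contig_id = c("c1", "c2", "c1", "c1"),
  chain     = c("TRA", "TRB", "TRB", "TRA"),
  stringsAsFactors = FALSE
)
contig_pk <- c("barcode", "contig_id")  # uniquely identifies a contig
cell_pk   <- "barcode"                  # uniquely identifies a cell

# The primary key must not repeat within its table
stopifnot(!anyDuplicated(contig_tbl[, contig_pk]))

# One row per cell: the many-to-one mapping from contigs to cells
cell_tbl <- unique(contig_tbl[, cell_pk, drop = FALSE])

# Subsetting contigs, then keeping only the cells that still have a contig
trb_contigs <- contig_tbl[contig_tbl$chain == "TRB", ]
trb_cells   <- merge(cell_tbl, unique(trb_contigs[, cell_pk, drop = FALSE]), by = cell_pk)
```

Here `cell_pk` acts as the foreign key: it is the only column the two tables need to share for the subset to propagate.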
... ...
@@ -1,34 +1,60 @@
1
+
2
+<!-- README.md is generated from README.Rmd. Please edit that file -->
3
+
1 4
 # CellaRepertorium
2 5
 
3
-This package contains methods for clustering and analyzing single cell RepSeq data, especially as generated by [10X genomics VDJ solution](https://support.10xgenomics.com/single-cell-vdj).
6
+This package contains methods for clustering and analyzing single cell
7
+RepSeq data, especially as generated by [10X genomics VDJ
8
+solution](https://support.10xgenomics.com/single-cell-vdj).
4 9
 
5 10
 ## Installation
6 11
 
7
-```
8
-devtools::install_github('amcdavid/CellaRepertorium')
9
-```
12
+    devtools::install_github('amcdavid/CellaRepertorium')
10 13
 
11
-Requires R>=3.5.
14
+Requires R \>= 3.5.
12 15
 
13 16
 ## Data requirements and package structure
14 17
 
15
-The fundamental unit is the **contig**, which is a section of contiguously stitched reads from a single **cell**.  Each contig belongs to one (and only one) cell, however, cells generate multiple contigs.  Contigs can also belong to a **cluster**.  Because of these two many-to-one mappings, these data can be thought as a series of ragged arrays.  The links between them mean they are relational data.
16
-
17
-[A schematic of contigs and cells should go here]
18
-
19
-A `ContigCellDB` object wraps each of these objects as a sequence of three `data.frame`s (well, `tibble`s, actually).   `ContigCellDB` also tracks columns (keys) that unique identify each row in each of these tables.  The `contig_tbl` is the `tibble` containing **contigs**, the `cell_tbl` contains the **cells**, and the `cluster_tbl` contains the **clusters**.  The `contig_pk`, `cell_pk` and `cluster_pk` identify the columns that identify a contig, cell and cluster, respectively, and must be unique in each of the respective tables.
20
-The tables are kept in sync so that subsetting the contigs will subset the cells, and clusters, and vice-versa.
21
-
22
-[A schematic showing table relations should go here]
23
-
24
-Of course, each of these tables can contain many other columns that will serve as covariates for various analysis, such as the CDR3 sequence, or the identity of the V, D and J regions.  Various derived quantities that describe cells and clusters can also be calculated, and added to these tables, such as the medoid of a cluster.
18
+The fundamental unit this package operates on is the **contig**, which
19
+is a section of contiguously stitched reads from a single **cell**. Each
20
+contig belongs to one (and only one) cell; however, a cell can generate
21
+multiple contigs.
22
+
23
+![](vignettes/figure/contig_schematic.png)<!-- -->
24
+
25
+Contigs can also belong to a **cluster**. Because of these two
26
+many-to-one mappings, these data can be thought of as a series of ragged
27
+arrays. The links between them mean they are relational data. A
28
+`ContigCellDB` object wraps each of these objects as a sequence of three
29
+`data.frames` (well, `tibbles`, actually). `ContigCellDB` also tracks
30
+columns (the primary keys) that uniquely identify each row in each of
31
+these tables. The `contig_tbl` is the `tibble` containing **contigs**,
32
+the `cell_tbl` contains the **cells**, and the `cluster_tbl` contains
33
+the **clusters**.
34
+
35
+The `contig_pk`, `cell_pk` and `cluster_pk` identify the columns that
36
+identify a contig, cell and cluster, respectively. These will serve as
37
+foreign keys that link the three tables together. The tables are kept in
38
+sync so that subsetting the contigs will subset the cells, and clusters,
39
+and vice-versa.
40
+
41
+![](vignettes/figure/table_schematic.png)<!-- -->
42
+
43
+Of course, each of these tables can contain many other columns that will
44
+serve as covariates for various analyses, such as the CDR3 sequence, or
45
+the identity of the V, D and J regions. Various derived quantities that
46
+describe cells and clusters can also be calculated, and added to these
47
+tables, such as the medoid of a cluster – a contig that minimizes the
48
+average distance to all other contigs in that cluster.
25 49
 
26 50
 ## Functions
27 51
 
28
-[a screencap of something interesting?]
29
-
30
-*  `cdhit`: An R interface to CDhit, which was originally ported by Thomas Lin Pedersen.
31
-*  `fine_cluster`: clustering CDR3 by edit distances (possibly using empirical amino acid substitution matrices)
32
-*  `cluster_permute_test`: permutation tests of cluster statistics
33
-
52
+\[a screencap of something interesting?\]
34 53
 
54
+  - `cdhit_ccdb`: An R interface to CDhit, which was originally ported
55
+    by Thomas Lin Pedersen.
56
+  - `fine_clustering`: clustering CDR3 by edit distances (possibly using
57
+    empirical amino acid substitution matrices)
58
+  - `cluster_permute_test`: permutation tests of cluster statistics
59
+  - `pairing_tables`: Generate pairings of contigs within each cell in a
60
+    way that they can be plotted
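The medoid mentioned in the README can be illustrated in a few lines of base R. This is a sketch, not the package's `fine_clustering` implementation: the CDR3 strings are made up, and `utils::adist()` (generalized Levenshtein distance) is used as a stand-in for whatever distance the package applies.

```r
# Hypothetical CDR3 amino acid sequences belonging to one cluster
cdr3 <- c("CASSL", "CASSF", "CASRL", "CAWSL")

# Pairwise edit distances between all members of the cluster
d <- utils::adist(cdr3)

# Average distance from each member to the others (the diagonal is zero)
avg_dist <- rowSums(d) / (length(cdr3) - 1)

# The medoid is the member with the smallest average distance
medoid <- cdr3[which.min(avg_dist)]
medoid
#> [1] "CASSL"
```

Unlike a centroid, the medoid is always an observed sequence, so its other fields (chain, V and J genes) can be copied into the cluster table as representative values.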
... ...
@@ -1 +1,41 @@
1 1
 destination: docs
2
+
3
+reference:
4
+- title: Construction and modification
5
+  desc: "Functions to construct and modify a ContigCellDB"
6
+  contents:
7
+   - '`ContigCellDB`'
8
+   - '`$,ContigCellDB-method`'
9
+   - '`$<-,ContigCellDB-method`'
10
+- title: Clustering
11
+  desc: "Methods to cluster contigs"
12
+  contents:
13
+   - '`cluster_germline`'
14
+   - '`fine_clustering`'
15
+   - '`cdhit`'
16
+   - '`pairing_tables`'
17
+- title: Canonicalization
18
+  desc: "Methods to return single contigs for cells or clusters"
19
+  contents:
20
+   - '`canonicalize_cell`'
21
+   - '`canonicalize_cluster`'
22
+- title: Statistical testing
23
+  contents:
24
+   - '`cluster_permute_test`'
25
+   - '`purity`'
26
+- title: Datasets
27
+  contents:
28
+   - '`ccdb_ex`'
29
+   - '`contigs_qc`'
30
+   - '`canonicalize_by_prevalence`'
31
+- title: Internal or WIP
32
+  desc: "Functions that may be made internal, removed, or with interfaces subject to change."
33
+  contents:
34
+  - '`cluster_germline`'
35
+  - '`cluster_permute_test`'
36
+  - '`.cluster_permute_test`'
37
+  - '`entropy`'
38
+  - '`ig_chain_recode`'
39
+  - '`fancy_name_contigs`'
40
+  - '`fine_cluster_seqs`'
41
+  - '`get_canonical_representative`'
... ...
@@ -90,27 +90,26 @@
90 90
 
91 91
     
92 92
     
93
-<div class="sourceCode" id="cb1"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb1-1" data-line-number="1"><span class="co">#load_all()</span></a>
94
-<a class="sourceLine" id="cb1-2" data-line-number="2"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/library">library</a></span>(CellaRepertorium)</a>
95
-<a class="sourceLine" id="cb1-3" data-line-number="3"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/library">library</a></span>(dplyr)</a>
96
-<a class="sourceLine" id="cb1-4" data-line-number="4"><span class="co">#&gt; </span></a>
97
-<a class="sourceLine" id="cb1-5" data-line-number="5"><span class="co">#&gt; Attaching package: 'dplyr'</span></a>
98
-<a class="sourceLine" id="cb1-6" data-line-number="6"><span class="co">#&gt; The following objects are masked from 'package:stats':</span></a>
99
-<a class="sourceLine" id="cb1-7" data-line-number="7"><span class="co">#&gt; </span></a>
100
-<a class="sourceLine" id="cb1-8" data-line-number="8"><span class="co">#&gt;     filter, lag</span></a>
101
-<a class="sourceLine" id="cb1-9" data-line-number="9"><span class="co">#&gt; The following objects are masked from 'package:base':</span></a>
102
-<a class="sourceLine" id="cb1-10" data-line-number="10"><span class="co">#&gt; </span></a>
103
-<a class="sourceLine" id="cb1-11" data-line-number="11"><span class="co">#&gt;     intersect, setdiff, setequal, union</span></a>
104
-<a class="sourceLine" id="cb1-12" data-line-number="12"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/library">library</a></span>(ggplot2)</a>
105
-<a class="sourceLine" id="cb1-13" data-line-number="13"><span class="co">#&gt; Registered S3 methods overwritten by 'ggplot2':</span></a>
106
-<a class="sourceLine" id="cb1-14" data-line-number="14"><span class="co">#&gt;   method         from </span></a>
107
-<a class="sourceLine" id="cb1-15" data-line-number="15"><span class="co">#&gt;   [.quosures     rlang</span></a>
108
-<a class="sourceLine" id="cb1-16" data-line-number="16"><span class="co">#&gt;   c.quosures     rlang</span></a>
109
-<a class="sourceLine" id="cb1-17" data-line-number="17"><span class="co">#&gt;   print.quosures rlang</span></a>
110
-<a class="sourceLine" id="cb1-18" data-line-number="18"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/library">library</a></span>(readr)</a>
111
-<a class="sourceLine" id="cb1-19" data-line-number="19"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/library">library</a></span>(tidyr)</a>
112
-<a class="sourceLine" id="cb1-20" data-line-number="20"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/library">library</a></span>(stringr)</a>
113
-<a class="sourceLine" id="cb1-21" data-line-number="21"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/library">library</a></span>(purrr)</a></code></pre></div>
93
+<div class="sourceCode" id="cb1"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb1-1" data-line-number="1"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/library">library</a></span>(CellaRepertorium)</a>
94
+<a class="sourceLine" id="cb1-2" data-line-number="2"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/library">library</a></span>(dplyr)</a>
95
+<a class="sourceLine" id="cb1-3" data-line-number="3"><span class="co">#&gt; </span></a>
96
+<a class="sourceLine" id="cb1-4" data-line-number="4"><span class="co">#&gt; Attaching package: 'dplyr'</span></a>
97
+<a class="sourceLine" id="cb1-5" data-line-number="5"><span class="co">#&gt; The following objects are masked from 'package:stats':</span></a>
98
+<a class="sourceLine" id="cb1-6" data-line-number="6"><span class="co">#&gt; </span></a>
99
+<a class="sourceLine" id="cb1-7" data-line-number="7"><span class="co">#&gt;     filter, lag</span></a>
100
+<a class="sourceLine" id="cb1-8" data-line-number="8"><span class="co">#&gt; The following objects are masked from 'package:base':</span></a>
101
+<a class="sourceLine" id="cb1-9" data-line-number="9"><span class="co">#&gt; </span></a>
102
+<a class="sourceLine" id="cb1-10" data-line-number="10"><span class="co">#&gt;     intersect, setdiff, setequal, union</span></a>
103
+<a class="sourceLine" id="cb1-11" data-line-number="11"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/library">library</a></span>(ggplot2)</a>
104
+<a class="sourceLine" id="cb1-12" data-line-number="12"><span class="co">#&gt; Registered S3 methods overwritten by 'ggplot2':</span></a>
105
+<a class="sourceLine" id="cb1-13" data-line-number="13"><span class="co">#&gt;   method         from </span></a>
106
+<a class="sourceLine" id="cb1-14" data-line-number="14"><span class="co">#&gt;   [.quosures     rlang</span></a>
107
+<a class="sourceLine" id="cb1-15" data-line-number="15"><span class="co">#&gt;   c.quosures     rlang</span></a>
108
+<a class="sourceLine" id="cb1-16" data-line-number="16"><span class="co">#&gt;   print.quosures rlang</span></a>
109
+<a class="sourceLine" id="cb1-17" data-line-number="17"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/library">library</a></span>(readr)</a>
110
+<a class="sourceLine" id="cb1-18" data-line-number="18"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/library">library</a></span>(tidyr)</a>
111
+<a class="sourceLine" id="cb1-19" data-line-number="19"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/library">library</a></span>(stringr)</a>
112
+<a class="sourceLine" id="cb1-20" data-line-number="20"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/library">library</a></span>(purrr)</a></code></pre></div>
114 113
 <div id="load-filtered-contig-files" class="section level1">
115 114
 <h1 class="hasAnchor">
116 115
 <a href="#load-filtered-contig-files" class="anchor"></a>Load filtered contig files</h1>
... ...
@@ -160,10 +159,14 @@
160 159
 <a class="sourceLine" id="cb5-6" data-line-number="6"><span class="co">#&gt; `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.</span></a></code></pre></div>
161 160
 <p><img src="cdr3_clustering_files/figure-html/unnamed-chunk-5-1.png" width="700"></p>
162 161
 <p>We can also cluster by DNA identity.</p>
163
-<div class="sourceCode" id="cb6"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb6-1" data-line-number="1">germline_cluster =<span class="st"> </span>CellaRepertorium<span class="op">:::</span><span class="kw">cluster_germline</span>(cdb, <span class="dt">segment_keys =</span> <span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/c">c</a></span>(<span class="st">'v_gene'</span>, <span class="st">'j_gene'</span>, <span class="st">'chain'</span>), <span class="dt">cluster_name =</span> <span class="st">'segment_idx'</span>)</a>
162
+</div>
163
+<div id="cluster-by-v-j-identity" class="section level1">
164
+<h1 class="hasAnchor">
165
+<a href="#cluster-by-v-j-identity" class="anchor"></a>Cluster by V-J identity</h1>
166
+<div class="sourceCode" id="cb6"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb6-1" data-line-number="1">germline_cluster =<span class="st"> </span>CellaRepertorium<span class="op">:::</span><span class="kw"><a href="../reference/cluster_germline.html">cluster_germline</a></span>(cdb, <span class="dt">segment_keys =</span> <span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/c">c</a></span>(<span class="st">'v_gene'</span>, <span class="st">'j_gene'</span>, <span class="st">'chain'</span>), <span class="dt">cluster_name =</span> <span class="st">'segment_idx'</span>)</a>
164 167
 <a class="sourceLine" id="cb6-2" data-line-number="2"><span class="co">#&gt; Warning in replace_cluster_tbl(ccdb, cluster_tbl, cl_con_tbl, cluster_pk =</span></a>
165
-<a class="sourceLine" id="cb6-3" data-line-number="3"><span class="co">#&gt; cluster_name): Replacing `cluster_tbl` with key ccdb$cluster_pk</span></a></code></pre></div>
166
-<p>And by other features of the contigs. Here we cluster each contig based on the chain and V-J genes. This gives us the set of observed V-J pairings:</p>
168
+<a class="sourceLine" id="cb6-3" data-line-number="3"><span class="co">#&gt; cluster_name): Replacing `cluster_tbl` with DNA97, .</span></a></code></pre></div>
169
+<p>We can cluster by any other feature of the contigs. Here we cluster each contig based on the chain and V-J genes. This gives us the set of observed V-J pairings:</p>
167 170
 <div class="sourceCode" id="cb7"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb7-1" data-line-number="1">germline_cluster =<span class="st"> </span><span class="kw"><a href="../reference/fine_clustering.html">fine_clustering</a></span>(germline_cluster, <span class="dt">sequence_key =</span> <span class="st">'cdr3_nt'</span>, <span class="dt">type =</span> <span class="st">'DNA'</span>)</a>
168 171
 <a class="sourceLine" id="cb7-2" data-line-number="2"><span class="co">#&gt; Calculating intradistances on 700 clusters.</span></a>
169 172
 <a class="sourceLine" id="cb7-3" data-line-number="3"><span class="co">#&gt; Summarizing</span></a>
... ...
@@ -176,6 +179,10 @@
176 179
 <div class="sourceCode" id="cb8"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb8-1" data-line-number="1"><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/ggplot.html">ggplot</a></span>(germline_cluster<span class="op">$</span>cluster_tbl <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/filter.html">filter</a></span>(chain <span class="op">==</span><span class="st"> 'TRB'</span>), <span class="kw"><a href="https://ggplot2.tidyverse.org/reference/aes.html">aes</a></span>(<span class="dt">x =</span> v_gene, <span class="dt">y =</span> j_gene, <span class="dt">fill =</span> avg_distance)) <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/geom_tile.html">geom_tile</a></span>() <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/theme.html">theme</a></span>(<span class="dt">axis.text.x =</span> <span class="kw"><a href="https://ggplot2.tidyverse.org/reference/element.html">element_text</a></span>(<span class="dt">angle =</span> <span class="dv">90</span>))</a></code></pre></div>
177 180
 <p><img src="cdr3_clustering_files/figure-html/unnamed-chunk-8-1.png" width="700"></p>
178 181
 <p>Average Levenshtein distance of CDR3 within each pair</p>
182
+</div>
183
+<div id="some-simple-phylogenetic-relationship" class="section level1">
184
+<h1 class="hasAnchor">
185
+<a href="#some-simple-phylogenetic-relationship" class="anchor"></a>Some simple phylogenetic relationship</h1>
179 186
 <div class="sourceCode" id="cb9"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb9-1" data-line-number="1"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/library">library</a></span>(ggdendro)</a>
180 187
 <a class="sourceLine" id="cb9-2" data-line-number="2"></a>
181 188
 <a class="sourceLine" id="cb9-3" data-line-number="3"><span class="co"># This should be turned into a function in the package somehow</span></a>
... ...
@@ -205,7 +212,7 @@
205 212
 <pre><code>#&gt; 
206 213
 #&gt; [[4]]</code></pre>
207 214
 <p><img src="cdr3_clustering_files/figure-html/unnamed-chunk-9-4.png" width="700"></p>
208
-<div class="sourceCode" id="cb13"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb13-1" data-line-number="1">aa80 =<span class="st"> </span>CellaRepertorium<span class="op">:::</span><span class="kw">canonicalize_cluster</span>(aa80, <span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/c">c</a></span>(<span class="st">'cdr3'</span>), <span class="dt">contig_fields =</span> <span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/c">c</a></span>(<span class="st">'cdr3'</span>, <span class="st">'cdr3_nt'</span>, <span class="st">'chain'</span>, <span class="st">'v_gene'</span>, <span class="st">'d_gene'</span>, <span class="st">'j_gene'</span>))</a></code></pre></div>
215
+<div class="sourceCode" id="cb13"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb13-1" data-line-number="1">aa80 =<span class="st"> </span><span class="kw"><a href="../reference/canonicalize_cluster.html">canonicalize_cluster</a></span>(aa80, <span class="dt">representative =</span> <span class="st">'cdr3'</span>, <span class="dt">contig_fields =</span> <span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/c">c</a></span>(<span class="st">'cdr3'</span>, <span class="st">'cdr3_nt'</span>, <span class="st">'chain'</span>, <span class="st">'v_gene'</span>, <span class="st">'d_gene'</span>, <span class="st">'j_gene'</span>))</a></code></pre></div>
209 216
 <p>Pull the fields listed in <code>contig_fields</code> into the <code>cluster_tbl</code>, using the values found in the medoid contig</p>
210 217
 <div class="sourceCode" id="cb14"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb14-1" data-line-number="1">oligo_clusters =<span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/filter.html">filter</a></span>(aa80<span class="op">$</span>cluster_tbl, n_cluster <span class="op">&gt;=</span><span class="st"> </span>MIN_OLIGO)</a>
211 218
 <a class="sourceLine" id="cb14-2" data-line-number="2">oligo_contigs =<span class="st"> </span>aa80</a>
... ...
@@ -304,47 +311,59 @@
304 311
 <a class="sourceLine" id="cb18-9" data-line-number="9"><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/stopifnot">stopifnot</a></span>( <span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/all">all</a></span>(<span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/colSums">colSums</a></span>(<span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/with">with</a></span>(oligo_cluster_stat, <span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/table">table</a></span>(chain, cluster_idx)) <span class="op">&gt;</span><span class="st"> </span><span class="dv">0</span>) <span class="op">==</span><span class="st"> </span><span class="dv">1</span>))</a>
305 312
 <a class="sourceLine" id="cb18-10" data-line-number="10"></a>
306 313
 <a class="sourceLine" id="cb18-11" data-line-number="11">mm_out =<span class="st"> </span><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/warning">suppressWarnings</a></span>(oligo_cluster_stat <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/group_by.html">group_by</a></span>(cluster_idx, chain) <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/do.html">do</a></span>( <span class="kw"><a href="https://www.rdocumentation.org/packages/lme4/topics/glmer">glmer</a></span>(<span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/cbind">cbind</a></span>(n_cluster, total_cells) <span class="op">~</span><span class="st"> </span>pop <span class="op">+</span><span class="st"> </span>weeks_premature <span class="op">+</span><span class="st"> </span>(<span class="dv">1</span><span class="op">|</span>sample), <span class="dt">data =</span> ., <span class="dt">family =</span> <span class="st">'binomial'</span>) <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="https://www.rdocumentation.org/packages/broom/topics/reexports">tidy</a></span>(<span class="dt">conf.int =</span> <span class="ot">TRUE</span>)))</a></code></pre></div>
307
-<div class="sourceCode" id="cb19"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb19-1" data-line-number="1">mm_outj =<span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/filter.html">filter</a></span>(<span class="kw"><a href="https://dplyr.tidyverse.org/reference/join.html">left_join</a></span>(<span class="kw"><a href="https://dplyr.tidyverse.org/reference/group_by.html">ungroup</a></span>(mm_out), <span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/unique">unique</a></span>(oligo_clusters_all <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/select.html">select</a></span>(cdr3_representative, cluster_idx))), term <span class="op">%in%</span><span class="st"> </span><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/c">c</a></span>(<span class="st">'popCD31Pos'</span>, <span class="st">'weeks_premature'</span>)) <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/mutate.html">mutate</a></span>(<span class="dt">ci_lo =</span> AMmisc<span class="op">::</span><span class="kw"><a href="https://www.rdocumentation.org/packages/AMmisc/topics/clamp">clamp</a></span>(conf.low), <span class="dt">ci_hi =</span> AMmisc<span class="op">::</span><span class="kw"><a href="https://www.rdocumentation.org/packages/AMmisc/topics/clamp">clamp</a></span>(conf.high)) <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/arrange.html">arrange</a></span>(<span class="kw"><a href="https://dplyr.tidyverse.org/reference/desc.html">desc</a></span>(cdr3_representative))</a>
314
+<div class="sourceCode" id="cb19"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb19-1" data-line-number="1">mm_outj =<span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/filter.html">filter</a></span>(<span class="kw"><a href="https://dplyr.tidyverse.org/reference/join.html">left_join</a></span>(<span class="kw"><a href="https://dplyr.tidyverse.org/reference/group_by.html">ungroup</a></span>(mm_out), <span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/unique">unique</a></span>(oligo_clusters_all <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/select.html">select</a></span>(representative, cluster_idx))), term <span class="op">%in%</span><span class="st"> </span><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/c">c</a></span>(<span class="st">'popCD31Pos'</span>, <span class="st">'weeks_premature'</span>)) <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/mutate.html">mutate</a></span>(<span class="dt">ci_lo =</span> AMmisc<span class="op">::</span><span class="kw"><a href="https://www.rdocumentation.org/packages/AMmisc/topics/clamp">clamp</a></span>(conf.low), <span class="dt">ci_hi =</span> AMmisc<span class="op">::</span><span class="kw"><a href="https://www.rdocumentation.org/packages/AMmisc/topics/clamp">clamp</a></span>(conf.high)) <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/arrange.html">arrange</a></span>(<span class="kw"><a href="https://dplyr.tidyverse.org/reference/desc.html">desc</a></span>(representative))</a>
308 315
 <a class="sourceLine" id="cb19-2" data-line-number="2"></a>
309
-<a class="sourceLine" id="cb19-3" data-line-number="3"><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/ggplot.html">ggplot</a></span>(mm_outj, <span class="kw"><a href="https://ggplot2.tidyverse.org/reference/aes.html">aes</a></span>(<span class="dt">x =</span> cdr3_representative, <span class="dt">ymin =</span> ci_lo, <span class="dt">ymax =</span> ci_hi, <span class="dt">y =</span> <span class="kw">clamp</span>(estimate))) <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/geom_linerange.html">geom_pointrange</a></span>() <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/facet_wrap.html">facet_wrap</a></span>(<span class="op">~</span>term, <span class="dt">scales =</span> <span class="st">'free'</span>) <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/coord_flip.html">coord_flip</a></span>() <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/ggtheme.html">theme_minimal</a></span>() <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/geom_abline.html">geom_hline</a></span>(<span class="dt">yintercept =</span> <span class="dv">0</span>, <span class="dt">lty =</span> <span class="dv">2</span>) <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/labs.html">xlab</a></span>(<span class="st">"Isomorph"</span>) <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/labs.html">ylab</a></span>(<span class="st">"log odds of isomorph"</span>)</a></code></pre></div>
316
+<a class="sourceLine" id="cb19-3" data-line-number="3"><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/ggplot.html">ggplot</a></span>(mm_outj, <span class="kw"><a href="https://ggplot2.tidyverse.org/reference/aes.html">aes</a></span>(<span class="dt">x =</span> representative, <span class="dt">ymin =</span> ci_lo, <span class="dt">ymax =</span> ci_hi, <span class="dt">y =</span> <span class="kw">clamp</span>(estimate))) <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/geom_linerange.html">geom_pointrange</a></span>() <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/facet_wrap.html">facet_wrap</a></span>(<span class="op">~</span>term, <span class="dt">scales =</span> <span class="st">'free'</span>) <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/coord_flip.html">coord_flip</a></span>() <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/ggtheme.html">theme_minimal</a></span>() <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/geom_abline.html">geom_hline</a></span>(<span class="dt">yintercept =</span> <span class="dv">0</span>, <span class="dt">lty =</span> <span class="dv">2</span>) <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/labs.html">xlab</a></span>(<span class="st">"Isomorph"</span>) <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/labs.html">ylab</a></span>(<span class="st">"log odds of isomorph"</span>)</a></code></pre></div>
310 317
 <p>For each clone, we test whether the binomial rate of clone expression differs by CD31 status (CD31+/-) or by term.</p>
311 318
 </div>
312 319
 <div id="clonal-pairs" class="section level1">
313 320
 <h1 class="hasAnchor">
314 321
 <a href="#clonal-pairs" class="anchor"></a>Clonal pairs</h1>
315
-<div class="sourceCode" id="cb20"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb20-1" data-line-number="1">class_colors =<span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/reexports.html">data_frame</a></span>(<span class="dt">chain =</span>  <span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/unique">unique</a></span>(aa80<span class="op">$</span>cluster_tbl<span class="op">$</span>chain)) <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/mutate.html">mutate</a></span>(<span class="dt">class_color =</span>  RColorBrewer<span class="op">::</span><span class="kw"><a href="https://www.rdocumentation.org/packages/RColorBrewer/topics/ColorBrewer">brewer.pal</a></span>(<span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/length">length</a></span>(chain),<span class="st">"Set1"</span>)[<span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/seq">seq_along</a></span>(chain)])</a>
316
-<a class="sourceLine" id="cb20-2" data-line-number="2"><span class="co">#&gt; Warning: `data_frame()` is deprecated, use `tibble()`.</span></a>
317
-<a class="sourceLine" id="cb20-3" data-line-number="3"><span class="co">#&gt; This warning is displayed once per session.</span></a>
318
-<a class="sourceLine" id="cb20-4" data-line-number="4"><span class="co">#&gt; Warning in RColorBrewer::brewer.pal(length(chain), "Set1"): minimal value for n is 3, returning requested palette with 3 different levels</span></a>
319
-<a class="sourceLine" id="cb20-5" data-line-number="5"></a>
320
-<a class="sourceLine" id="cb20-6" data-line-number="6">aa80<span class="op">$</span>cluster_pk =<span class="st"> 'representative'</span></a>
321
-<a class="sourceLine" id="cb20-7" data-line-number="7">pairing_list =<span class="st"> </span><span class="kw"><a href="../reference/pairing_tables.html">pairing_tables</a></span>(aa80, <span class="dt">table_order =</span> <span class="dv">2</span>, <span class="dt">orphan_level =</span> <span class="dv">1</span>, <span class="dt">min_expansion =</span> <span class="dv">2</span>, <span class="dt">cluster_keys =</span> <span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/c">c</a></span>(<span class="st">'cdr3'</span>, <span class="st">'representative'</span>, <span class="st">'chain'</span>, <span class="st">'v_gene'</span>, <span class="st">'j_gene'</span>, <span class="st">'avg_distance'</span>))</a>
322
-<a class="sourceLine" id="cb20-8" data-line-number="8"><span class="co">#&gt; Warning: Factor `cluster_idx.2` contains implicit NA, consider using</span></a>
323
-<a class="sourceLine" id="cb20-9" data-line-number="9"><span class="co">#&gt; `forcats::fct_explicit_na`</span></a>
324
-<a class="sourceLine" id="cb20-10" data-line-number="10"><span class="co">#&gt; Warning: Column `representative` joining factors with different levels,</span></a>
325
-<a class="sourceLine" id="cb20-11" data-line-number="11"><span class="co">#&gt; coercing to character vector</span></a></code></pre></div>
326
-<div class="sourceCode" id="cb21"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb21-1" data-line-number="1">pairs_plt =<span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/ggplot.html">ggplot</a></span>(pairing_list<span class="op">$</span>cell_tbl, <span class="kw"><a href="https://ggplot2.tidyverse.org/reference/aes.html">aes</a></span>(<span class="dt">x =</span> cluster_idx<span class="fl">.1</span>_fct, <span class="dt">y =</span> cluster_idx<span class="fl">.2</span>_fct, <span class="dt">color =</span> sample, <span class="dt">shape =</span> pop)) <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/geom_jitter.html">geom_jitter</a></span>(<span class="dt">width =</span> <span class="fl">.3</span>, <span class="dt">height =</span> <span class="fl">.3</span>) <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/ggtheme.html">theme_minimal</a></span>()</a>
322
+<div class="sourceCode" id="cb20"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb20-1" data-line-number="1">class_colors =<span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/reexports.html">tibble</a></span>(<span class="dt">chain =</span>  <span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/unique">unique</a></span>(aa80<span class="op">$</span>cluster_tbl<span class="op">$</span>chain)) <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/mutate.html">mutate</a></span>(<span class="dt">class_color =</span>  RColorBrewer<span class="op">::</span><span class="kw"><a href="https://www.rdocumentation.org/packages/RColorBrewer/topics/ColorBrewer">brewer.pal</a></span>(<span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/length">length</a></span>(chain),<span class="st">"Set1"</span>)[<span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/seq">seq_along</a></span>(chain)])</a>
323
+<a class="sourceLine" id="cb20-2" data-line-number="2"><span class="co">#&gt; Warning in RColorBrewer::brewer.pal(length(chain), "Set1"): minimal value for n is 3, returning requested palette with 3 different levels</span></a>
324
+<a class="sourceLine" id="cb20-3" data-line-number="3"></a>
325
+<a class="sourceLine" id="cb20-4" data-line-number="4">aa80<span class="op">$</span>cluster_pk =<span class="st"> 'representative'</span></a>
326
+<a class="sourceLine" id="cb20-5" data-line-number="5">pairing_list =<span class="st"> </span><span class="kw"><a href="../reference/pairing_tables.html">pairing_tables</a></span>(aa80, <span class="dt">table_order =</span> <span class="dv">2</span>, <span class="dt">orphan_level =</span> <span class="dv">1</span>, <span class="dt">min_expansion =</span> <span class="dv">3</span>, <span class="dt">cluster_keys =</span> <span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/c">c</a></span>(<span class="st">'cdr3'</span>, <span class="st">'representative'</span>, <span class="st">'chain'</span>, <span class="st">'v_gene'</span>, <span class="st">'j_gene'</span>, <span class="st">'avg_distance'</span>))</a>
327
+<a class="sourceLine" id="cb20-6" data-line-number="6"><span class="co">#&gt; Warning: Factor `cluster_idx.2` contains implicit NA, consider using</span></a>
328
+<a class="sourceLine" id="cb20-7" data-line-number="7"><span class="co">#&gt; `forcats::fct_explicit_na`</span></a>
329
+<a class="sourceLine" id="cb20-8" data-line-number="8"><span class="co">#&gt; Warning: Column `representative` joining factors with different levels,</span></a>
330
+<a class="sourceLine" id="cb20-9" data-line-number="9"><span class="co">#&gt; coercing to character vector</span></a></code></pre></div>
331
+<div class="sourceCode" id="cb21"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb21-1" data-line-number="1">pairs_plt =<span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/ggplot.html">ggplot</a></span>(pairing_list<span class="op">$</span>cell_tbl, <span class="kw"><a href="https://ggplot2.tidyverse.org/reference/aes.html">aes</a></span>(<span class="dt">x =</span> cluster_idx<span class="fl">.1</span>_fct, <span class="dt">y =</span> cluster_idx<span class="fl">.2</span>_fct, <span class="dt">color =</span> sample, <span class="dt">shape =</span> pop)) <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/geom_jitter.html">geom_jitter</a></span>(<span class="dt">width =</span> <span class="fl">.2</span>, <span class="dt">height =</span> <span class="fl">.2</span>) <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/ggtheme.html">theme_minimal</a></span>() <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/labs.html">xlab</a></span>(<span class="st">'TRB'</span>) <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/labs.html">ylab</a></span>(<span class="st">'TRA'</span>)</a>
327 332
 <a class="sourceLine" id="cb21-2" data-line-number="2"></a>
328
-<a class="sourceLine" id="cb21-3" data-line-number="3">ylab =<span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/reexports.html">data_frame</a></span>(<span class="dt">cdr3_representative =</span>  <span class="kw"><a href="https://ggplot2.tidyverse.org/reference/ggplot_build.html">ggplot_build</a></span>(pairs_plt)<span class="op">$</span>layout<span class="op">$</span>panel_params[[<span class="dv">1</span>]]<span class="op">$</span>y.label) <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/join.html">left_join</a></span>(feature_tbl) <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/mutate.html">mutate</a></span>(<span class="dt">class_color =</span> <span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/ifelse">ifelse</a></span>(<span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/NA">is.na</a></span>(class_color), <span class="st">'#E41A1C'</span>, class_color))</a>
329
-<a class="sourceLine" id="cb21-4" data-line-number="4"></a>
330
-<a class="sourceLine" id="cb21-5" data-line-number="5">xlab =<span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/reexports.html">data_frame</a></span>(<span class="dt">cdr3_representative =</span>  <span class="kw"><a href="https://ggplot2.tidyverse.org/reference/ggplot_build.html">ggplot_build</a></span>(pairs_plt)<span class="op">$</span>layout<span class="op">$</span>panel_params[[<span class="dv">1</span>]]<span class="op">$</span>x.label) <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/join.html">left_join</a></span>(feature_tbl) <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/mutate.html">mutate</a></span>(<span class="dt">class_color =</span> <span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/ifelse">ifelse</a></span>(<span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/NA">is.na</a></span>(class_color), <span class="st">'#E41A1C'</span>, class_color))</a>
331
-<a class="sourceLine" id="cb21-6" data-line-number="6"></a>
332
-<a class="sourceLine" id="cb21-7" data-line-number="7">pairs_plt =<span class="st"> </span>pairs_plt <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/theme.html">theme</a></span>(<span class="dt">axis.text.x =</span> <span class="kw"><a href="https://ggplot2.tidyverse.org/reference/element.html">element_text</a></span>(<span class="dt">angle =</span> <span class="dv">90</span>, <span class="dt">color =</span> xlab<span class="op">$</span>class_color, <span class="dt">size =</span> <span class="dv">8</span>), <span class="dt">axis.text.y =</span> <span class="kw"><a href="https://ggplot2.tidyverse.org/reference/element.html">element_text</a></span>(<span class="dt">color =</span> ylab<span class="op">$</span>class_color, <span class="dt">size =</span> <span class="dv">8</span>))</a>
333
-<a class="sourceLine" id="cb21-8" data-line-number="8"></a>
334
-<a class="sourceLine" id="cb21-9" data-line-number="9">pairs_plt</a></code></pre></div>
333
+<a class="sourceLine" id="cb21-3" data-line-number="3">feature_tbl =<span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/join.html">left_join</a></span>(<span class="kw"><a href="https://dplyr.tidyverse.org/reference/bind.html">bind_rows</a></span>(pairing_list<span class="op">$</span>idx1_tbl, pairing_list<span class="op">$</span>idx2_tbl), class_colors)</a>
334
+<a class="sourceLine" id="cb21-4" data-line-number="4"><span class="co">#&gt; Warning in bind_rows_(x, .id): binding factor and character vector,</span></a>
335
+<a class="sourceLine" id="cb21-5" data-line-number="5"><span class="co">#&gt; coercing into character vector</span></a>
336
+<a class="sourceLine" id="cb21-6" data-line-number="6"><span class="co">#&gt; Warning in bind_rows_(x, .id): binding character and factor vector,</span></a>
337
+<a class="sourceLine" id="cb21-7" data-line-number="7"><span class="co">#&gt; coercing into character vector</span></a>
338
+<a class="sourceLine" id="cb21-8" data-line-number="8"><span class="co">#&gt; Joining, by = "chain"</span></a>
339
+<a class="sourceLine" id="cb21-9" data-line-number="9"></a>
340
+<a class="sourceLine" id="cb21-10" data-line-number="10">ylab =<span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/reexports.html">data_frame</a></span>(<span class="dt">representative =</span>  <span class="kw"><a href="https://ggplot2.tidyverse.org/reference/ggplot_build.html">ggplot_build</a></span>(pairs_plt)<span class="op">$</span>layout<span class="op">$</span>panel_params[[<span class="dv">1</span>]]<span class="op">$</span>y.label) <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/join.html">left_join</a></span>(feature_tbl) <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/mutate.html">mutate</a></span>(<span class="dt">class_color =</span> <span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/ifelse">ifelse</a></span>(<span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/NA">is.na</a></span>(class_color), <span class="st">'#E41A1C'</span>, class_color))</a>
341
+<a class="sourceLine" id="cb21-11" data-line-number="11"><span class="co">#&gt; Warning: `data_frame()` is deprecated, use `tibble()`.</span></a>
342
+<a class="sourceLine" id="cb21-12" data-line-number="12"><span class="co">#&gt; This warning is displayed once per session.</span></a>
343
+<a class="sourceLine" id="cb21-13" data-line-number="13"><span class="co">#&gt; Joining, by = "representative"</span></a>
344
+<a class="sourceLine" id="cb21-14" data-line-number="14"></a>
345
+<a class="sourceLine" id="cb21-15" data-line-number="15">xlab =<span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/reexports.html">data_frame</a></span>(<span class="dt">representative =</span>  <span class="kw"><a href="https://ggplot2.tidyverse.org/reference/ggplot_build.html">ggplot_build</a></span>(pairs_plt)<span class="op">$</span>layout<span class="op">$</span>panel_params[[<span class="dv">1</span>]]<span class="op">$</span>x.label) <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/join.html">left_join</a></span>(feature_tbl) <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/mutate.html">mutate</a></span>(<span class="dt">class_color =</span> <span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/ifelse">ifelse</a></span>(<span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/NA">is.na</a></span>(class_color), <span class="st">'#E41A1C'</span>, class_color))</a>
346
+<a class="sourceLine" id="cb21-16" data-line-number="16"><span class="co">#&gt; Joining, by = "representative"</span></a>
347
+<a class="sourceLine" id="cb21-17" data-line-number="17"></a>
348
+<a class="sourceLine" id="cb21-18" data-line-number="18">pairs_plt =<span class="st"> </span>pairs_plt <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/theme.html">theme</a></span>(<span class="dt">axis.text.x =</span> <span class="kw"><a href="https://ggplot2.tidyverse.org/reference/element.html">element_text</a></span>(<span class="dt">angle =</span> <span class="dv">90</span>, <span class="dt">color =</span> xlab<span class="op">$</span>class_color, <span class="dt">size =</span> <span class="dv">8</span>), <span class="dt">axis.text.y =</span> <span class="kw"><a href="https://ggplot2.tidyverse.org/reference/element.html">element_text</a></span>(<span class="dt">color =</span> ylab<span class="op">$</span>class_color, <span class="dt">size =</span> <span class="dv">8</span>))</a>
349
+<a class="sourceLine" id="cb21-19" data-line-number="19"></a>
350
+<a class="sourceLine" id="cb21-20" data-line-number="20">pairs_plt</a></code></pre></div>
351
+<p><img src="cdr3_clustering_files/figure-html/plot_expanded-1.png" width="700"></p>
335 352
 <div id="expanded-clones" class="section level2">
336 353
 <h2 class="hasAnchor">
337 354
 <a href="#expanded-clones" class="anchor"></a>Expanded clones</h2>
338
-<div class="sourceCode" id="cb22"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb22-1" data-line-number="1">pairing_list =<span class="st"> </span><span class="kw"><a href="../reference/pairing_tables.html">pairing_tables</a></span>(oligo_clusters_all <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/select.html">select</a></span>(cdr3_representative, dataset, barcode, chain, umis, reads), <span class="dt">cluster_idx =</span> <span class="st">'cdr3_representative'</span>, <span class="dt">cell_identifiers =</span> <span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/c">c</a></span>(<span class="st">'dataset'</span>, <span class="st">'barcode'</span>), <span class="dt">canonicalize_fun =</span> canonicalize_by_prevalence, <span class="dt">table_order =</span> <span class="dv">2</span>, <span class="dt">orphan_level =</span> <span class="dv">1</span>, <span class="dt">min_expansion =</span> <span class="dv">4</span>, <span class="dt">feature_tbl =</span> feature_tbl, <span class="dt">cell_tbl =</span> good_cells, <span class="dt">cluster_whitelist =</span> <span class="kw"><a href="https://dplyr.tidyverse.org/reference/filter.html">filter</a></span>(oligo_clusters, n_cluster<span class="op">&gt;</span><span class="dv">8</span>) <span class="op">%&gt;%</span><span class="st"> </span>dplyr<span class="op">::</span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/select.html">select</a></span>(<span class="dt">cluster_idx.1 =</span> cdr3_representative) <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/unique">unique</a></span>())</a>
339
-<a class="sourceLine" id="cb22-2" data-line-number="2">pairs_plt =<span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/ggplot.html">ggplot</a></span>(pairing_list<span class="op">$</span>cell_tbl, <span class="kw"><a href="https://ggplot2.tidyverse.org/reference/aes.html">aes</a></span>(<span class="dt">x =</span> cluster_idx<span class="fl">.1</span>_fct, <span class="dt">y =</span> cluster_idx<span class="fl">.2</span>_fct, <span class="dt">color =</span> sample, <span class="dt">shape =</span> pop)) <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/geom_jitter.html">geom_jitter</a></span>(<span class="dt">width =</span> <span class="fl">.3</span>, <span class="dt">height =</span> <span class="fl">.3</span>) <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/ggtheme.html">theme_minimal</a></span>()</a>
355
+<div class="sourceCode" id="cb22"><pre class="sourceCode r"><code class="sourceCode r"><a class="sourceLine" id="cb22-1" data-line-number="1">pairing_list =<span class="st"> </span><span class="kw"><a href="../reference/pairing_tables.html">pairing_tables</a></span>(oligo_clusters_all <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/select.html">select</a></span>(representative, dataset, barcode, chain, umis, reads), <span class="dt">cluster_idx =</span> <span class="st">'representative'</span>, <span class="dt">cell_identifiers =</span> <span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/c">c</a></span>(<span class="st">'dataset'</span>, <span class="st">'barcode'</span>), <span class="dt">canonicalize_fun =</span> canonicalize_by_prevalence, <span class="dt">table_order =</span> <span class="dv">2</span>, <span class="dt">orphan_level =</span> <span class="dv">1</span>, <span class="dt">min_expansion =</span> <span class="dv">4</span>, <span class="dt">feature_tbl =</span> feature_tbl, <span class="dt">cell_tbl =</span> good_cells, <span class="dt">cluster_whitelist =</span> <span class="kw"><a href="https://dplyr.tidyverse.org/reference/filter.html">filter</a></span>(oligo_clusters, n_cluster<span class="op">&gt;</span><span class="dv">8</span>) <span class="op">%&gt;%</span><span class="st"> </span>dplyr<span class="op">::</span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/select.html">select</a></span>(<span class="dt">cluster_idx.1 =</span> representative) <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/unique">unique</a></span>())</a>
356
+<a class="sourceLine" id="cb22-2" data-line-number="2">pairs_plt =<span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/ggplot.html">ggplot</a></span>(pairing_list<span class="op">$</span>cell_tbl, <span class="kw"><a href="https://ggplot2.tidyverse.org/reference/aes.html">aes</a></span>(<span class="dt">x =</span> cluster_idx<span class="fl">.1</span>_fct, <span class="dt">y =</span> cluster_idx<span class="fl">.2</span>_fct, <span class="dt">color =</span> sample, <span class="dt">shape =</span> pop)) <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/geom_jitter.html">geom_jitter</a></span>(<span class="dt">width =</span> <span class="fl">.2</span>, <span class="dt">height =</span> <span class="fl">.2</span>) <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/ggtheme.html">theme_minimal</a></span>() <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/labs.html">xlab</a></span>(<span class="st">'TRB'</span>) <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/labs.html">ylab</a></span>(<span class="st">'TRA'</span>)</a>
340 357
 <a class="sourceLine" id="cb22-3" data-line-number="3"></a>
341
-<a class="sourceLine" id="cb22-4" data-line-number="4">ylab =<span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/reexports.html">data_frame</a></span>(<span class="dt">cdr3_representative =</span>  <span class="kw"><a href="https://ggplot2.tidyverse.org/reference/ggplot_build.html">ggplot_build</a></span>(pairs_plt)<span class="op">$</span>layout<span class="op">$</span>panel_params[[<span class="dv">1</span>]]<span class="op">$</span>y.label) <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/join.html">left_join</a></span>(feature_tbl) <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/mutate.html">mutate</a></span>(<span class="dt">class_color =</span> <span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/ifelse">ifelse</a></span>(<span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/NA">is.na</a></span>(class_color), <span class="st">'#E41A1C'</span>, class_color))</a>
358
+<a class="sourceLine" id="cb22-4" data-line-number="4">feature_tbl =<span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/join.html">left_join</a></span>(<span class="kw"><a href="https://dplyr.tidyverse.org/reference/bind.html">bind_rows</a></span>(pairing_list<span class="op">$</span>idx1_tbl, pairing_list<span class="op">$</span>idx2_tbl), class_colors)</a>
342 359
 <a class="sourceLine" id="cb22-5" data-line-number="5"></a>
343
-<a class="sourceLine" id="cb22-6" data-line-number="6">xlab =<span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/reexports.html">data_frame</a></span>(<span class="dt">cdr3_representative =</span>  <span class="kw"><a href="https://ggplot2.tidyverse.org/reference/ggplot_build.html">ggplot_build</a></span>(pairs_plt)<span class="op">$</span>layout<span class="op">$</span>panel_params[[<span class="dv">1</span>]]<span class="op">$</span>x.label) <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/join.html">left_join</a></span>(feature_tbl) <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/mutate.html">mutate</a></span>(<span class="dt">class_color =</span> <span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/ifelse">ifelse</a></span>(<span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/NA">is.na</a></span>(class_color), <span class="st">'#E41A1C'</span>, class_color))</a>
360
+<a class="sourceLine" id="cb22-6" data-line-number="6">ylab =<span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/reexports.html">data_frame</a></span>(<span class="dt">representative =</span>  <span class="kw"><a href="https://ggplot2.tidyverse.org/reference/ggplot_build.html">ggplot_build</a></span>(pairs_plt)<span class="op">$</span>layout<span class="op">$</span>panel_params[[<span class="dv">1</span>]]<span class="op">$</span>y.label) <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/join.html">left_join</a></span>(feature_tbl) <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/mutate.html">mutate</a></span>(<span class="dt">class_color =</span> <span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/ifelse">ifelse</a></span>(<span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/NA">is.na</a></span>(class_color), <span class="st">'#E41A1C'</span>, class_color))</a>
344 361
 <a class="sourceLine" id="cb22-7" data-line-number="7"></a>
345
-<a class="sourceLine" id="cb22-8" data-line-number="8">pairs_plt =<span class="st"> </span>pairs_plt <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/theme.html">theme</a></span>(<span class="dt">axis.text.x =</span> <span class="kw"><a href="https://ggplot2.tidyverse.org/reference/element.html">element_text</a></span>(<span class="dt">angle =</span> <span class="dv">90</span>, <span class="dt">color =</span> xlab<span class="op">$</span>class_color, <span class="dt">size =</span> <span class="dv">8</span>), <span class="dt">axis.text.y =</span> <span class="kw"><a href="https://ggplot2.tidyverse.org/reference/element.html">element_text</a></span>(<span class="dt">color =</span> ylab<span class="op">$</span>class_color, <span class="dt">size =</span> <span class="dv">8</span>))</a>
362
+<a class="sourceLine" id="cb22-8" data-line-number="8">xlab =<span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/reexports.html">data_frame</a></span>(<span class="dt">representative =</span>  <span class="kw"><a href="https://ggplot2.tidyverse.org/reference/ggplot_build.html">ggplot_build</a></span>(pairs_plt)<span class="op">$</span>layout<span class="op">$</span>panel_params[[<span class="dv">1</span>]]<span class="op">$</span>x.label) <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/join.html">left_join</a></span>(feature_tbl) <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/mutate.html">mutate</a></span>(<span class="dt">class_color =</span> <span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/ifelse">ifelse</a></span>(<span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/NA">is.na</a></span>(class_color), <span class="st">'#E41A1C'</span>, class_color))</a>
346 363
 <a class="sourceLine" id="cb22-9" data-line-number="9"></a>
347
-<a class="sourceLine" id="cb22-10" data-line-number="10">pairs_plt</a></code></pre></div>
364
+<a class="sourceLine" id="cb22-10" data-line-number="10">pairs_plt =<span class="st"> </span>pairs_plt <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/theme.html">theme</a></span>(<span class="dt">axis.text.x =</span> <span class="kw"><a href="https://ggplot2.tidyverse.org/reference/element.html">element_text</a></span>(<span class="dt">angle =</span> <span class="dv">90</span>, <span class="dt">color =</span> xlab<span class="op">$</span>class_color, <span class="dt">size =</span> <span class="dv">8</span>), <span class="dt">axis.text.y =</span> <span class="kw"><a href="https://ggplot2.tidyverse.org/reference/element.html">element_text</a></span>(<span class="dt">color =</span> ylab<span class="op">$</span>class_color, <span class="dt">size =</span> <span class="dv">8</span>))</a>
365
+<a class="sourceLine" id="cb22-11" data-line-number="11"></a>
366
+<a class="sourceLine" id="cb22-12" data-line-number="12">pairs_plt</a></code></pre></div>
348 367
 </div>
349 368
 </div>
350 369
 <div id="length-of-cdr3" class="section level1">
... ...
@@ -368,7 +387,7 @@
368 387
 <a class="sourceLine" id="cb24-12" data-line-number="12"><span class="co">#&gt; coercing into character vector</span></a>
369 388
 <a class="sourceLine" id="cb24-13" data-line-number="13"><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/ggplot.html">ggplot</a></span>(cdr_len <span class="op">%&gt;%</span><span class="st"> </span><span class="kw"><a href="https://dplyr.tidyverse.org/reference/filter.html">filter</a></span>(group <span class="op">==</span><span class="st"> 'fixed'</span>, term <span class="op">!=</span><span class="st"> '(Intercept)'</span>), <span class="kw"><a href="https://ggplot2.tidyverse.org/reference/aes.html">aes</a></span>(<span class="dt">x =</span> <span class="kw"><a href="https://www.rdocumentation.org/packages/base/topics/interaction">interaction</a></span>(chain, term), <span class="dt">y =</span> estimate, <span class="dt">ymin =</span> conf.low, <span class="dt">ymax =</span> conf.high)) <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/geom_linerange.html">geom_pointrange</a></span>() <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/ggtheme.html">theme_minimal</a></span>() <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/coord_flip.html">coord_flip</a></span>() <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/labs.html">ylab</a></span>(<span class="st">'Length(CDR3 Nt)'</span>) <span class="op">+</span><span class="st"> </span><span class="kw"><a href="https://ggplot2.tidyverse.org/reference/labs.html">xlab</a></span>(<span class="st">'Term/Chain'</span>)</a></code></pre></div>
370 389
 <p><img src="cdr3_clustering_files/figure-html/cdr3_len-1.png" width="288"></p>
371
-<p>As was suggested by the histogram, there doesn’t seem to be an obvious <code>pop</code> effect.</p>
390
+<p>We end up with a convergence warning. This is not a surprise, because the <code>samples</code> aren’t actually replicates – they are just subsamples drawn for illustrative purposes. On average, the Balbc mice have 0.5 fewer nucleotides per contig, and this difference is not significant.</p>
372 391
 </div>
373 392
   </div>
374 393
 
... ...
@@ -381,6 +400,8 @@
381 400
       <li><a href="#chain-pairings">Chain pairings</a></li>
382 401
       <li><a href="#cluster-cdr3-protein-sequences">Cluster CDR3 protein sequences</a></li>
383 402
       <li><a href="#cluster-cdr3-dna-sequences">Cluster CDR3 DNA sequences</a></li>
403
+      <li><a href="#cluster-by-v-j-identity">Cluster by V-J identity</a></li>
404
+      <li><a href="#some-simple-phylogenetic-relationship">Some simple phylogenetic relationship</a></li>
384 405
       <li><a href="#oligo-clusters">Oligo clusters</a></li>
385 406
       <li><a href="#formal-testing-for-frequency-differences">Formal testing for frequency differences</a></li>
386 407
       <li>
387 408
new file mode 100644
388 409
Binary files /dev/null and b/docs/articles/cdr3_clustering_files/figure-html/plot_expanded-1.png differ
389 410
Binary files a/docs/articles/mouse_tcell_qc_files/figure-html/unnamed-chunk-5-1.png and b/docs/articles/mouse_tcell_qc_files/figure-html/unnamed-chunk-5-1.png differ
390 411
Binary files a/docs/articles/mouse_tcell_qc_files/figure-html/unnamed-chunk-5-2.png and b/docs/articles/mouse_tcell_qc_files/figure-html/unnamed-chunk-5-2.png differ
... ...
@@ -25,7 +25,7 @@
25 25
 <![endif]-->
26 26
 </head>
27 27
 <body>
28
-    <div class="container template-home">
28
+    <div class="container template-article">
29 29
       <header><div class="navbar navbar-default navbar-fixed-top" role="navigation">
30 30
   <div class="container">
31 31
     <div class="navbar-header">
... ...
@@ -85,7 +85,12 @@
85 85
 
86 86
       
87 87
       </header><div class="row">
88
-  <div class="contents col-md-9">
88
+  <div class="col-md-9 contents">
89
+    
90
+
91
+    
92
+    
93
+<!-- README.md is generated from README.Rmd. Please edit that file -->
89 94
 <div id="cellarepertorium" class="section level1">
90 95
 <div class="page-header"><h1 class="hasAnchor">
91 96
 <a href="#cellarepertorium" class="anchor"></a>CellaRepertorium</h1></div>
... ...
@@ -94,16 +99,17 @@
94 99
 <h2 class="hasAnchor">
95 100
 <a href="#installation" class="anchor"></a>Installation</h2>
96 101
 <pre><code><a href="https://www.rdocumentation.org/packages/devtools/topics/reexports">devtools::install_github('amcdavid/CellaRepertorium')</a></code></pre>
97
-<p>Requires R&gt;=3.5.</p>
102
+<p>Requires R &gt;= 3.5.</p>
98 103
 </div>
99 104
 <div id="data-requirements-and-package-structure" class="section level2">
100 105
 <h2 class="hasAnchor">
101 106
 <a href="#data-requirements-and-package-structure" class="anchor"></a>Data requirements and package structure</h2>
102
-<p>The fundamental unit is the <strong>contig</strong>, which is a section of contiguously stitched reads from a single <strong>cell</strong>. Each contig belongs to one (and only one) cell, however, cells generate multiple contigs. Contigs can also belong to a <strong>cluster</strong>. Because of these two many-to-one mappings, these data can be thought as a series of ragged arrays. The links between them mean they are relational data.</p>
103
-<p>[A schematic of contigs and cells should go here]</p>
104
-<p>A <code>ContigCellDB</code> object wraps each of these objects as a sequence of three <code>data.frame</code>s (well, <code>tibble</code>s, actually). <code>ContigCellDB</code> also tracks columns (keys) that unique identify each row in each of these tables. The <code>contig_tbl</code> is the <code>tibble</code> containing <strong>contigs</strong>, the <code>cell_tbl</code> contains the <strong>cells</strong>, and the <code>cluster_tbl</code> contains the <strong>clusters</strong>. The <code>contig_pk</code>, <code>cell_pk</code> and <code>cluster_pk</code> identify the columns that identify a contig, cell and cluster, respectively, and must be unique in each of the respective tables. The tables are kept in sync so that subsetting the contigs will subset the cells, and clusters, and vice-versa.</p>
105
-<p>[A schematic showing table relations should go here]</p>
106
-<p>Of course, each of these tables can contain many other columns that will serve as covariates for various analysis, such as the CDR3 sequence, or the identity of the V, D and J regions. Various derived quantities that describe cells and clusters can also be calculated, and added to these tables, such as the medoid of a cluster.</p>
107
+<p>The fundamental unit this package operates on is the <strong>contig</strong>, which is a section of contiguously stitched reads from a single <strong>cell</strong>. Each contig belongs to one (and only one) cell; however, a cell can generate multiple contigs.</p>
108
+<p><img src="../../../../Box%20Sync/research/scRNAseq/CellaRepertorium/vignettes/figure/contig_schematic.png"><!-- --></p>
109
+<p>Contigs can also belong to a <strong>cluster</strong>. Because of these two many-to-one mappings, these data can be thought of as a series of ragged arrays. The links between them mean they are relational data. A <code>ContigCellDB</code> object wraps each of these objects as a sequence of three <code>data.frames</code> (well, <code>tibbles</code>, actually). <code>ContigCellDB</code> also tracks columns (the primary keys) that uniquely identify each row in each of these tables. The <code>contig_tbl</code> is the <code>tibble</code> containing <strong>contigs</strong>, the <code>cell_tbl</code> contains the <strong>cells</strong>, and the <code>cluster_tbl</code> contains the <strong>clusters</strong>.</p>
110
+<p>The <code>contig_pk</code>, <code>cell_pk</code> and <code>cluster_pk</code> name the columns that identify a contig, cell and cluster, respectively. These serve as foreign keys that link the three tables together. The tables are kept in sync so that subsetting the contigs will subset the cells and clusters, and vice versa.</p>
111
+<p><img src="../../../../Box%20Sync/research/scRNAseq/CellaRepertorium/vignettes/figure/table_schematic.png"><!-- --></p>
112
+<p>Of course, each of these tables can contain many other columns that will serve as covariates for various analyses, such as the CDR3 sequence, or the identity of the V, D and J regions. Various derived quantities that describe cells and clusters can also be calculated and added to these tables, such as the medoid of a cluster – a contig that minimizes the average distance to all other contigs in that cluster.</p>
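The key relationships described in this section can be illustrated with plain `dplyr`. This is only a sketch of the idea, not the package's implementation: the toy tables, the column values and the use of `semi_join` are assumptions made for illustration, and `ContigCellDB` itself is not called.

```r
library(dplyr)

# Toy contig table (hypothetical values): each contig belongs to exactly
# one cell, identified here by the `barcode` column.
contig_tbl <- tibble(
  barcode   = c("cell1", "cell1", "cell2", "cell3"),
  contig_id = c("c1", "c2", "c3", "c4"),
  chain     = c("TRA", "TRB", "TRB", "TRA")
)

# One row per cell, keyed by the cell primary key.
cell_tbl <- distinct(contig_tbl, barcode)

# Subset the contigs, then keep only the cells that still own a contig.
# This mimics the "tables are kept in sync" behaviour by hand.
trb_contigs <- filter(contig_tbl, chain == "TRB")
trb_cells   <- semi_join(cell_tbl, trb_contigs, by = "barcode")
```

Here `trb_cells` retains only the cells that have at least one TRB contig; a `ContigCellDB` object is described above as doing this bookkeeping automatically.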
107 113
 </div>
108 114
 <div id="functions" class="section level2">
109 115
 <h2 class="hasAnchor">
... ...
@@ -111,11 +117,13 @@
111 117
 <p>[a screencap of something interesting?]</p>
112 118
 <ul>
113 119
 <li>
114
-<code>cdhit</code>: An R interface to CDhit, which was originally ported by Thomas Lin Pedersen.</li>
120
+<code>cdhit_ccdb</code>: An R interface to CDhit, which was originally ported by Thomas Lin Pedersen.</li>
115 121
 <li>
116
-<code>fine_cluster</code>: clustering CDR3 by edit distances (possibly using empirical amino acid substitution matrices)</li>
122
+<code>fine_clustering</code>: clustering CDR3 by edit distances (possibly using empirical amino acid substitution matrices)</li>
117 123
 <li>
118 124
 <code>cluster_permute_test</code>: permutation tests of cluster statistics</li>
125
+<li>
126
+<code>pairing_tables</code>: Generate pairings of contigs within each cell in a way that they can be plotted</li>
119 127
 </ul>
120 128
 </div>
121 129
 </div>
... ...
@@ -145,9 +153,11 @@
145 153
 </ul>
146 154
 </div>
147 155
 
148
-  </div>
156
+      </div>
157
+
149 158
 </div>
150 159
 
160
+
151 161
       <footer><div class="copyright">
152 162
   <p>Developed by Andrew McDavid, Yu Gu.</p>
153 163
 </div>
154 164
new file mode 100644
... ...
@@ -0,0 +1,229 @@
1
+<!-- Generated by pkgdown: do not edit by hand -->
2
+<!DOCTYPE html>
3
+<html lang="en">
4
+  <head>
5
+  <meta charset="utf-8">
6
+<meta http-equiv="X-UA-Compatible" content="IE=edge">
7
+<meta name="viewport" content="width=device-width, initial-scale=1.0">
8
+
9
+<title>Find a canonical contig to represent a cell — canonicalize_cell • CellaRepertorium</title>
10
+
11
+<!-- jquery -->
12
+<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js" integrity="sha256-FgpCb/KJQlLNfOu91ta32o/NMZxltwRo8QtmkMRdAu8=" crossorigin="anonymous"></script>
13
+<!-- Bootstrap -->
14
+
15
+<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/3.3.7/css/bootstrap.min.css" integrity="sha256-916EbMg70RQy9LHiGkXzG8hSg9EdNy97GazNG/aiY1w=" crossorigin="anonymous" />
16
+<script src="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/3.3.7/js/bootstrap.min.js" integrity="sha256-U5ZEeKfGNOja007MMD3YBI0A3OSZOQbeG6z2f2Y0hu8=" crossorigin="anonymous"></script>
17
+
18
+<!-- Font Awesome icons -->
19
+<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0/css/font-awesome.min.css" integrity="sha256-eZrrJcwDc/3uDhsdt61sL2oOBY362qM3lon1gyExkL0=" crossorigin="anonymous" />
20
+
21
+<!-- clipboard.js -->
22
+<script src="https://cdnjs.cloudflare.com/ajax/libs/clipboard.js/2.0.4/clipboard.min.js" integrity="sha256-FiZwavyI2V6+EXO1U+xzLG3IKldpiTFf3153ea9zikQ=" crossorigin="anonymous"></script>
23
+
24
+<!-- sticky kit -->
25
+<script src="https://cdnjs.cloudflare.com/ajax/libs/sticky-kit/1.1.3/sticky-kit.min.js" integrity="sha256-c4Rlo1ZozqTPE2RLuvbusY3+SU1pQaJC0TjuhygMipw=" crossorigin="anonymous"></script>
26
+
27
+<!-- pkgdown -->
28
+<link href="../pkgdown.css" rel="stylesheet">
29
+<script src="../pkgdown.js"></script>
30
+
31
+
32
+
33
+<meta property="og:title" content="Find a canonical contig to represent a cell — canonicalize_cell" />
34
+
35
+<meta property="og:description" content="Using filtering in `contig_filter_args` and sorting in `tie_break_keys` and `order`, find a
36
+single, canonical contig to represent each cell.
37
+Fields in `contig_fields` will be copied over to the `cell_tbl`." />
38
+<meta name="twitter:card" content="summary" />
39
+
40
+
41
+
42
+<!-- mathjax -->
43
+<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js" integrity="sha256-nvJJv9wWKEm88qvoQl9ekL2J+k/RWIsaSScxxlsrv8k=" crossorigin="anonymous"></script>
44
+<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/config/TeX-AMS-MML_HTMLorMML.js" integrity="sha256-84DKXVJXs0/F8OTMzX4UR909+jtl4G7SPypPavF+GfA=" crossorigin="anonymous"></script>
45
+
46
+<!--[if lt IE 9]>
47
+<script src="https://oss.maxcdn.com/html5shiv/3.7.3/html5shiv.min.js"></script>
48
+<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
49
+<![endif]-->
50
+
51
+
52
+  </head>
53
+
54
+  <body>
55
+    <div class="container template-reference-topic">
56
+      <header>
57
+      <div class="navbar navbar-default navbar-fixed-top" role="navigation">
58
+  <div class="container">
59
+    <div class="navbar-header">
60
+      <button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#navbar" aria-expanded="false">
61
+        <span class="sr-only">Toggle navigation</span>
62
+        <span class="icon-bar"></span>
63
+        <span class="icon-bar"></span>
64
+        <span class="icon-bar"></span>
65
+      </button>
66
+      <span class="navbar-brand">
67
+        <a class="navbar-link" href="../index.html">CellaRepertorium</a>
68
+        <span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Released version">0.3.1</span>
69
+      </span>
70
+    </div>
71
+
72
+    <div id="navbar" class="navbar-collapse collapse">
73
+      <ul class="nav navbar-nav">
74
+        <li>
75
+  <a href="../index.html">
76
+    <span class="fa fa-home fa-lg"></span>
77
+     
78
+  </a>
79
+</li>
80
+<li>
81
+  <a href="../reference/index.html">Reference</a>
82
+</li>
83
+<li class="dropdown">
84
+  <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-expanded="false">
85
+    Articles
86
+     
87
+    <span class="caret"></span>
88
+  </a>
89
+  <ul class="dropdown-menu" role="menu">
90
+    <li>
91
+      <a href="../articles/cdr3_clustering.html">Clustering repertoire via CDR3 sequences</a>
92
+    </li>
93
+    <li>
94
+      <a href="../articles/mouse_tcell_qc.html">Quality control and Exploration of UMI-based repertoire data</a>
95
+    </li>
96
+  </ul>
97
+</li>
98
+      </ul>
99
+      
100
+      <ul class="nav navbar-nav navbar-right">
101
+        <li>
102
+  <a href="https://github.com/amcdavid/CellaRepertorium">
103
+    <span class="fa fa-github fa-lg"></span>
104
+     
105
+  </a>
106
+</li>
107
+      </ul>
108
+      
109
+    </div><!--/.nav-collapse -->
110
+  </div><!--/.container -->
111
+</div><!--/.navbar -->
112
+
113
+      
114
+      </header>
115
+
116
+<div class="row">
117
+  <div class="col-md-9 contents">
118
+    <div class="page-header">
119
+    <h1>Find a canonical contig to represent a cell</h1>
120
+    <small class="dont-index">Source: <a href='https://github.com/amcdavid/CellaRepertorium/blob/master/R/pairing-methods.R'><code>R/pairing-methods.R</code></a></small>
121
+    <div class="hidden name"><code>canonicalize_cell.Rd</code></div>
122
+    </div>
123
+
124
+    <div class="ref-description">
125
+    
126
+    <p>Using filtering in `contig_filter_args` and sorting in `tie_break_keys` and `order`, find a
127
+single, canonical contig to represent each cell.
128
+Fields in `contig_fields` will be copied over to the `cell_tbl`.</p>
129
+    
130
+    </div>
131
+
132
+    <pre class="usage"><span class='fu'>canonicalize_cell</span>(<span class='no'>ccdb</span>, <span class='no'>contig_filter_args</span>, <span class='kw'>tie_break_keys</span> <span class='kw'>=</span> <span class='fu'><a href='https://www.rdocumentation.org/packages/base/topics/c'>c</a></span>(<span class='st'>"umis"</span>,
133
+  <span class='st'>"reads"</span>), <span class='kw'>contig_fields</span> <span class='kw'>=</span> <span class='no'>tie_break_keys</span>, <span class='kw'>order</span> <span class='kw'>=</span> <span class='fl'>1</span>)</pre>
134
+    
135
+    <h2 class="hasAnchor" id="arguments"><a class="anchor" href="#arguments"></a>Arguments</h2>
136
+    <table class="ref-arguments">
137
+    <colgroup><col class="name" /><col class="desc" /></colgroup>
138
+    <tr>
139
+      <th>ccdb</th>
140
+      <td><p>`ContigCellDB`</p></td>
141
+    </tr>
142
+    <tr>
143
+      <th>contig_filter_args</th>
144
+      <td><p>an expression passed to dplyr::filter that acts on `ccdb$contig_tbl`.
145
+Unlike `filter`, multiple criteria must be combined with `&amp;` rather than separated by commas.</p></td>
146
+    </tr>
147
+    <tr>
148
+      <th>tie_break_keys</th>
149
+      <td><p>(optional) `character` naming fields in `contig_tbl`
150
+that are used to sort the contig table in descending order.
151
+Used to break ties if `contig_filter_args` does not return a unique contig
152
+for each cell.</p></td>
153
+    </tr>
154
+    <tr>
155
+      <th>contig_fields</th>
156
+      <td><p>Optional fields from `contig_tbl` that will be copied into
157
+the `cell_tbl` from the canonical contig.</p></td>
158
+    </tr>
159
+    <tr>
160
+      <th>order</th>
161
+      <td><p>The rank, based on `tie_break_keys`, of the contig
162
+to return.</p></td>
163
+    </tr>
164
+    </table>
165
+    
166
+    <h2 class="hasAnchor" id="value"><a class="anchor" href="#value"></a>Value</h2>
167
+
168
+    <p>`ContigCellDB` with additional fields in `cell_tbl`</p>
169
+    
170
+    <h2 class="hasAnchor" id="see-also"><a class="anchor" href="#see-also"></a>See also</h2>
171
+
172
+    <div class='dont-index'><p>canonicalize_cluster</p></div>
173
+    
174
+
175
+    <h2 class="hasAnchor" id="examples"><a class="anchor" href="#examples"></a>Examples</h2>
176
+    <pre class="examples"><div class='input'><span class='co'># Report beta chain with highest umi-count, breaking ties with reads</span>
177
+<span class='no'>beta</span> <span class='kw'>=</span> <span class='fu'>canonicalize_cell</span>(<span class='no'>ccdb_ex</span>, <span class='no'>chain</span> <span class='kw'>==</span> <span class='st'>'TRB'</span>,
178
+<span class='kw'>tie_break_keys</span> <span class='kw'>=</span> <span class='fu'><a href='https://www.rdocumentation.org/packages/base/topics/c'>c</a></span>(<span class='st'>'umis'</span>, <span class='st'>'reads'</span>),
179
+<span class='kw'>contig_fields</span> <span class='kw'>=</span> <span class='fu'><a href='https://www.rdocumentation.org/packages/base/topics/c'>c</a></span>(<span class='st'>'umis'</span>, <span class='st'>'reads'</span>, <span class='st'>'chain'</span>, <span class='st'>'v_gene'</span>, <span class='st'>'d_gene'</span>, <span class='st'>'j_gene'</span>))
180
+<span class='fu'><a href='https://www.rdocumentation.org/packages/utils/topics/head'>head</a></span>(<span class='no'>beta</span>$<span class='no'>cell_tbl</span>)</div><div class='output co'>#&gt; <span style='color: #555555;'># A tibble: 6 x 9</span><span>
181
+#&gt; </span><span style='color: #555555;'># Groups:   pop, sample, barcode [6]</span><span>
182
+#&gt;    umis reads chain v_gene d_gene j_gene  pop   sample barcode           
183
+#&gt;   </span><span style='color: #555555;font-style: italic;'>&lt;dbl&gt;</span><span> </span><span style='color: #555555;font-style: italic;'>&lt;dbl&gt;</span><span> </span><span style='color: #555555;font-style: italic;'>&lt;chr&gt;</span><span> </span><span style='color: #555555;font-style: italic;'>&lt;chr&gt;</span><span>  </span><span style='color: #555555;font-style: italic;'>&lt;chr&gt;</span><span>  </span><span style='color: #555555;font-style: italic;'>&lt;chr&gt;</span><span>   </span><span style='color: #555555;font-style: italic;'>&lt;chr&gt;</span><span> </span><span style='color: #555555;font-style: italic;'>&lt;chr&gt;</span><span>  </span><span style='color: #555555;font-style: italic;'>&lt;chr&gt;</span><span>             
184
+#&gt; </span><span style='color: #555555;'>1</span><span>     6 </span><span style='text-decoration: underline;'>37</span><span>898 TRB   TRBV31 None   TRBJ1-5 b6    4      AAAGTAGTCGCGCCAA-1
185
+#&gt; </span><span style='color: #555555;'>2</span><span>    </span><span style='color: #BB0000;'>NA</span><span>    </span><span style='color: #BB0000;'>NA</span><span> </span><span style='color: #BB0000;'>NA</span><span>    </span><span style='color: #BB0000;'>NA</span><span>     </span><span style='color: #BB0000;'>NA</span><span>     </span><span style='color: #BB0000;'>NA</span><span>      b6    4      AACCATGCATTTGCCC-1
186
+#&gt; </span><span style='color: #555555;'>3</span><span>     6 </span><span style='text-decoration: underline;'>33</span><span>548 TRB   TRBV5  TRBD2  TRBJ2-2 b6    4      AACTGGTGTCTGATCA-1
187
+#&gt; </span><span style='color: #555555;'>4</span><span>     6 </span><span style='text-decoration: underline;'>26</span><span>928 TRB   TRBV4  TRBD2  TRBJ2-4 b6    4      AAGCCGCAGTAAGTAC-1
188
+#&gt; </span><span style='color: #555555;'>5</span><span>     4 </span><span style='text-decoration: underline;'>18</span><span>435 TRB   TRBV1  None   TRBJ2-4 b6    4      AAGTCTGGTTCAACCA-1
189
+#&gt; </span><span style='color: #555555;'>6</span><span>     9 </span><span style='text-decoration: underline;'>46</span><span>156 TRB   TRBV5  TRBD2  TRBJ2-7 b6    4      ACACCAAAGTCCAGGA-1</div><div class='input'>
190
+<span class='co'># Only adds fields to `cell_tbl`</span>
191
+<span class='fu'><a href='https://www.rdocumentation.org/packages/base/topics/stopifnot'>stopifnot</a></span>(<span class='fu'><a href='https://www.rdocumentation.org/packages/base/topics/all.equal'>all.equal</a></span>(<span class='no'>beta</span>$<span class='no'>cell_tbl</span>[<span class='no'>ccdb_ex</span>$<span class='no'>cell_pk</span>],
192
+<span class='no'>ccdb_ex</span>$<span class='no'>cell_tbl</span>[<span class='no'>ccdb_ex</span>$<span class='no'>cell_pk</span>]))
193
+
194
+<span class='co'>#Report cdr3 with highest UMI count, but only when &gt; 5 UMIs support it</span>
195
+<span class='no'>umi5</span> <span class='kw'>=</span> <span class='fu'>canonicalize_cell</span>(<span class='no'>ccdb_ex</span>, <span class='no'>umis</span> <span class='kw'>&gt;</span> <span class='fl'>5</span>,
196
+<span class='kw'>tie_break_keys</span> <span class='kw'>=</span> <span class='fu'><a href='https://www.rdocumentation.org/packages/base/topics/c'>c</a></span>(<span class='st'>'umis'</span>, <span class='st'>'reads'</span>), <span class='kw'>contig_fields</span> <span class='kw'>=</span> <span class='fu'><a href='https://www.rdocumentation.org/packages/base/topics/c'>c</a></span>(<span class='st'>'umis'</span>, <span class='st'>'cdr3'</span>))
197
+<span class='fu'><a href='https://www.rdocumentation.org/packages/base/topics/stopifnot'>stopifnot</a></span>(<span class='fu'><a href='https://www.rdocumentation.org/packages/base/topics/all'>all</a></span>(<span class='no'>umi5</span>$<span class='no'>cell_tbl</span>$<span class='no'>umis</span> <span class='kw'>&gt;</span> <span class='fl'>5</span>, <span class='kw'>na.rm</span> <span class='kw'>=</span> <span class='fl'>TRUE</span>))</div></span></pre>
198
+  </div>
199
+  <div class="col-md-3 hidden-xs hidden-sm" id="sidebar">
200
+    <h2>Contents</h2>
201
+    <ul class="nav nav-pills nav-stacked">
202
+      <li><a href="#arguments">Arguments</a></li>
203
+      
204
+      <li><a href="#value">Value</a></li>
205
+
206
+      <li><a href="#see-also">See also</a></li>
207
+      
208
+      <li><a href="#examples">Examples</a></li>
209
+    </ul>
210
+
211
+  </div>
212
+</div>
213
+
214
+      <footer>
215
+      <div class="copyright">
216
+  <p>Developed by Andrew McDavid, Yu Gu.</p>
217
+</div>
218
+
219
+<div class="pkgdown">
220
+  <p>Site built with <a href="https://pkgdown.r-lib.org/">pkgdown</a> 1.3.0.</p>
221
+</div>
222
+      </footer>
223
+   </div>
224
+
225
+  
226
+
227
+  </body>
228
+</html>
229
+
0 230
new file mode 100644
... ...
@@ -0,0 +1,238 @@
1
+<!-- Generated by pkgdown: do not edit by hand -->
2
+<!DOCTYPE html>
3
+<html lang="en">
4
+  <head>
5
+  <meta charset="utf-8">
6
+<meta http-equiv="X-UA-Compatible" content="IE=edge">
7
+<meta name="viewport" content="width=device-width, initial-scale=1.0">
8
+
9
+<title>Find a canonical contig to represent a cluster — canonicalize_cluster • CellaRepertorium</title>
10
+
11
+<!-- jquery -->
12
+<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js" integrity="sha256-FgpCb/KJQlLNfOu91ta32o/NMZxltwRo8QtmkMRdAu8=" crossorigin="anonymous"></script>
13
+<!-- Bootstrap -->
14
+
15
+<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/3.3.7/css/bootstrap.min.css" integrity="sha256-916EbMg70RQy9LHiGkXzG8hSg9EdNy97GazNG/aiY1w=" crossorigin="anonymous" />
16
+<script src="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/3.3.7/js/bootstrap.min.js" integrity="sha256-U5ZEeKfGNOja007MMD3YBI0A3OSZOQbeG6z2f2Y0hu8=" crossorigin="anonymous"></script>
17
+
18
+<!-- Font Awesome icons -->
19
+<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0/css/font-awesome.min.css" integrity="sha256-eZrrJcwDc/3uDhsdt61sL2oOBY362qM3lon1gyExkL0=" crossorigin="anonymous" />
20
+
21
+<!-- clipboard.js -->
22
+<script src="https://cdnjs.cloudflare.com/ajax/libs/clipboard.js/2.0.4/clipboard.min.js" integrity="sha256-FiZwavyI2V6+EXO1U+xzLG3IKldpiTFf3153ea9zikQ=" crossorigin="anonymous"></script>
23
+
24
+<!-- sticky kit -->
25
+<script src="https://cdnjs.cloudflare.com/ajax/libs/sticky-kit/1.1.3/sticky-kit.min.js" integrity="sha256-c4Rlo1ZozqTPE2RLuvbusY3+SU1pQaJC0TjuhygMipw=" crossorigin="anonymous"></script>
26
+
27
+<!-- pkgdown -->
28
+<link href="../pkgdown.css" rel="stylesheet">
29
+<script src="../pkgdown.js"></script>
30
+
31
+
32
+
33
+<meta property="og:title" content="Find a canonical contig to represent a cluster — canonicalize_cluster" />
34
+
35
+<meta property="og:description" content="Find a canonical contig to represent a cluster" />
36
+<meta name="twitter:card" content="summary" />
37
+
38
+
39
+
40
+<!-- mathjax -->
41
+<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js" integrity="sha256-nvJJv9wWKEm88qvoQl9ekL2J+k/RWIsaSScxxlsrv8k=" crossorigin="anonymous"></script>
42
+<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/config/TeX-AMS-MML_HTMLorMML.js" integrity="sha256-84DKXVJXs0/F8OTMzX4UR909+jtl4G7SPypPavF+GfA=" crossorigin="anonymous"></script>
43
+
44
+<!--[if lt IE 9]>
45
+<script src="https://oss.maxcdn.com/html5shiv/3.7.3/html5shiv.min.js"></script>
46
+<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
47
+<![endif]-->
48
+
49
+
50
+  </head>
51
+
52
+  <body>
53
+    <div class="container template-reference-topic">
54
+      <header>
55
+      <div class="navbar navbar-default navbar-fixed-top" role="navigation">
56
+  <div class="container">
57
+    <div class="navbar-header">
58
+      <button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#navbar" aria-expanded="false">
59
+        <span class="sr-only">Toggle navigation</span>
60
+        <span class="icon-bar"></span>
61
+        <span class="icon-bar"></span>
62
+        <span class="icon-bar"></span>
63
+      </button>
64
+      <span class="navbar-brand">
65
+        <a class="navbar-link" href="../index.html">CellaRepertorium</a>
66
+        <span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Released version">0.3.1</span>
67
+      </span>
68
+    </div>
69
+
70
+    <div id="navbar" class="navbar-collapse collapse">
71
+      <ul class="nav navbar-nav">
72
+        <li>
73
+  <a href="../index.html">
74
+    <span class="fa fa-home fa-lg"></span>
75
+     
76
+  </a>
77
+</li>
78
+<li>
79
+  <a href="../reference/index.html">Reference</a>
80
+</li>
81
+<li class="dropdown">
82
+  <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-expanded="false">
83
+    Articles
84
+     
85
+    <span class="caret"></span>
86
+  </a>
87
+  <ul class="dropdown-menu" role="menu">
88
+    <li>
89
+      <a href="../articles/cdr3_clustering.html">Clustering repertoire via CDR3 sequences</a>
90
+    </li>
91
+    <li>
92
+      <a href="../articles/mouse_tcell_qc.html">Quality control and Exploration of UMI-based repertoire data</a>
93
+    </li>
94
+  </ul>
95
+</li>
96
+      </ul>
97
+      
98
+      <ul class="nav navbar-nav navbar-right">
99
+        <li>
100
+  <a href="https://github.com/amcdavid/CellaRepertorium">
101
+    <span class="fa fa-github fa-lg"></span>
102
+     
103
+  </a>
104
+</li>
105
+      </ul>
106
+      
107
+    </div><!--/.nav-collapse -->
108
+  </div><!--/.container -->
109
+</div><!--/.navbar -->
110
+
111
+      
112
+      </header>
113
+
114
+<div class="row">
115
+  <div class="col-md-9 contents">
116
+    <div class="page-header">
117
+    <h1>Find a canonical contig to represent a cluster</h1>
118
+    <small class="dont-index">Source: <a href='https://github.com/amcdavid/CellaRepertorium/blob/master/R/clustering-methods.R'><code>R/clustering-methods.R</code></a></small>
119
+    <div class="hidden name"><code>canonicalize_cluster.Rd</code></div>
120
+    </div>
121
+
122
+    <div class="ref-description">
123
+    
124
+    <p>Find a canonical contig to represent a cluster</p>
125
+    
126
+    </div>
127
+
128
+    <pre class="usage"><span class='fu'>canonicalize_cluster</span>(<span class='no'>ccdb</span>, <span class='kw'>contig_filter_args</span> <span class='kw'>=</span> <span class='no'>is_medoid</span>,
129
+  <span class='kw'>tie_break_keys</span> <span class='kw'>=</span> <span class='fu'><a href='https://www.rdocumentation.org/packages/base/topics/character'>character</a></span>(), <span class='kw'>order</span> <span class='kw'>=</span> <span class='fl'>1</span>,
130
+  <span class='kw'>representative</span> <span class='kw'>=</span> <span class='no'>ccdb</span>$<span class='no'>cluster_pk</span>[<span class='fl'>1</span>], <span class='kw'>contig_fields</span> <span class='kw'>=</span> <span class='fu'><a href='https://www.rdocumentation.org/packages/base/topics/c'>c</a></span>(<span class='st'>"cdr3"</span>,
131
+  <span class='st'>"cdr3_nt"</span>, <span class='st'>"chain"</span>, <span class='st'>"v_gene"</span>, <span class='st'>"d_gene"</span>, <span class='st'>"j_gene"</span>))</pre>
132
+    
133
+    <h2 class="hasAnchor" id="arguments"><a class="anchor" href="#arguments"></a>Arguments</h2>
134
+    <table class="ref-arguments">
135
+    <colgroup><col class="name" /><col class="desc" /></colgroup>
136
+    <tr>
137
+      <th>ccdb</th>
138
+      <td><p>`ContigCellDB`</p></td>
139
+    </tr>
140
+    <tr>
141
+      <th>contig_filter_args</th>
142
+      <td><p>an expression passed to dplyr::filter that acts on `ccdb$contig_tbl`.
143
+Unlike `filter`, multiple criteria must be combined with `&amp;` rather than separated by commas.</p></td>
144
+    </tr>
145
+    <tr>
146
+      <th>tie_break_keys</th>
147
+      <td><p>(optional) `character` naming fields in `contig_tbl`
148
+that are used to sort the contig table in descending order.
149
+Used to break ties if `contig_filter_args` does not return a unique contig
150
+for each cluster</p></td>
151
+    </tr>
152
+    <tr>
153
+      <th>order</th>
154
+      <td><p>The rank, based on `tie_break_keys`, of the contig
155
+to return.</p></td>
156
+    </tr>
157
+    <tr>
158
+      <th>representative</th>
159
+      <td><p>an optional field from `contig_tbl` that will be made
160
+unique. Serves as a surrogate `cluster_pk`.</p></td>
161
+    </tr>
162
+    <tr>
163
+      <th>contig_fields</th>
164
+      <td><p>Optional fields from `contig_tbl` that will be copied into
165
+the `cluster_tbl` from the canonical contig.</p></td>
166
+    </tr>
167
+    </table>
168
+    
169
+    <h2 class="hasAnchor" id="value"><a class="anchor" href="#value"></a>Value</h2>
170
+
171
+    <p>`ContigCellDB`</p>
172
+    
173
+    <h2 class="hasAnchor" id="see-also"><a class="anchor" href="#see-also"></a>See also</h2>
174
+
175
+    <div class='dont-index'><p>canonicalize_cell</p></div>
176
+    
177
+
178
+    <h2 class="hasAnchor" id="examples"><a class="anchor" href="#examples"></a>Examples</h2>
179
+    <pre class="examples"><div class='input'><span class='fu'><a href='https://www.rdocumentation.org/packages/base/topics/library'>library</a></span>(<span class='no'>dplyr</span>)</div><div class='output co'>#&gt; <span class='message'></span>
180
+#&gt; <span class='message'>Attaching package: ‘dplyr’</span></div><div class='output co'>#&gt; <span class='message'>The following object is masked from ‘package:testthat’:</span>
181
+#&gt; <span class='message'></span>
182
+#&gt; <span class='message'>    matches</span></div><div class='output co'>#&gt; <span class='message'>The following objects are masked from ‘package:stats’:</span>
183
+#&gt; <span class='message'></span>
184
+#&gt; <span class='message'>    filter, lag</span></div><div class='output co'>#&gt; <span class='message'>The following objects are masked from ‘package:base’:</span>
185
+#&gt; <span class='message'></span>
186
+#&gt; <span class='message'>    intersect, setdiff, setequal, union</span></div><div class='input'><span class='no'>ccdb_ex_small</span> <span class='kw'>=</span> <span class='no'>ccdb_ex</span>
187
+<span class='no'>ccdb_ex_small</span>$<span class='no'>cell_tbl</span> <span class='kw'>=</span> <span class='no'>ccdb_ex_small</span>$<span class='no'>cell_tbl</span>[<span class='fl'>1</span>:<span class='fl'>200</span>,]
188
+<span class='no'>ccdb_ex_small</span> <span class='kw'>=</span> <span class='fu'><a href='cdhit.html'>cdhit_ccdb</a></span>(<span class='no'>ccdb_ex_small</span>,
189
+<span class='kw'>sequence_key</span> <span class='kw'>=</span> <span class='st'>'cdr3_nt'</span>, <span class='kw'>type</span> <span class='kw'>=</span> <span class='st'>'DNA'</span>, <span class='kw'>cluster_name</span> <span class='kw'>=</span> <span class='st'>'DNA97'</span>,
190
+<span class='kw'>identity</span> <span class='kw'>=</span> <span class='fl'>.965</span>, <span class='kw'>min_length</span> <span class='kw'>=</span> <span class='fl'>12</span>, <span class='kw'>G</span> <span class='kw'>=</span> <span class='fl'>1</span>)
191
+<span class='no'>ccdb_ex_small</span> <span class='kw'>=</span> <span class='fu'><a href='fine_clustering.html'>fine_clustering</a></span>(<span class='no'>ccdb_ex_small</span>, <span class='kw'>sequence_key</span> <span class='kw'>=</span> <span class='st'>'cdr3_nt'</span>, <span class='kw'>type</span> <span class='kw'>=</span> <span class='st'>'DNA'</span>)</div><div class='output co'>#&gt; <span class='message'>Calculating intradistances on 329 clusters.</span></div><div class='output co'>#&gt; <span class='message'>Summarizing</span></div><div class='input'>
192
+<span class='co'># Canonicalizing with the medoid contig is probably the most common choice</span>
193
+<span class='no'>ccdb_medoid</span> <span class='kw'>=</span> <span class='fu'>canonicalize_cluster</span>(<span class='no'>ccdb_ex_small</span>)
194
+
195
+<span class='co'># But there are other possibilities.</span>
196
+<span class='co'># To pass multiple "AND" filter criteria, combine them with &amp;</span>
197
+<span class='no'>ccdb_umi</span> <span class='kw'>=</span> <span class='fu'>canonicalize_cluster</span>(<span class='no'>ccdb_ex_small</span>,
198
+<span class='kw'>contig_filter_args</span> <span class='kw'>=</span> <span class='no'>chain</span> <span class='kw'>==</span> <span class='st'>'TRA'</span> <span class='kw'>&amp;</span> <span class='no'>length</span> <span class='kw'>&gt;</span> <span class='fl'>500</span>, <span class='kw'>tie_break_keys</span> <span class='kw'>=</span> <span class='st'>'umis'</span>,
199
+<span class='kw'>contig_fields</span> <span class='kw'>=</span> <span class='fu'><a href='https://www.rdocumentation.org/packages/base/topics/c'>c</a></span>(<span class='st'>'chain'</span>, <span class='st'>'length'</span>))</div><div class='output co'>#&gt; <span class='message'>Subset of `contig_tbl` has 157 rows for 329 clusters. Filling missing values and breaking ties </span></div><div class='output co'>#&gt; <span class='message'>with umis.</span></div><div class='input'><span class='no'>ccdb_umi</span>$<span class='no'>cluster_tbl</span> <span class='kw'>%&gt;%</span> <span class='kw pkg'>dplyr</span><span class='kw ns'>::</span><span class='fu'><a href='https://dplyr.tidyverse.org/reference/select.html'>select</a></span>(<span class='no'>chain</span>, <span class='no'>length</span>) <span class='kw'>%&gt;%</span> <span class='fu'><a href='https://www.rdocumentation.org/packages/base/topics/summary'>summary</a></span>()</div><div class='output co'>#&gt;     chain               length      
200
+#&gt;  Length:329         Min.   : 503.0  
201
+#&gt;  Class :character   1st Qu.: 558.0  
202
+#&gt;  Mode  :character   Median : 607.0  
203
+#&gt;                     Mean   : 620.0  
204
+#&gt;                     3rd Qu.: 665.5  
205
+#&gt;                     Max.   :1006.0  
206
+#&gt;                     NA's   :186     </div></pre>
207
+  </div>
208
+  <div class="col-md-3 hidden-xs hidden-sm" id="sidebar">
209
+    <h2>Contents</h2>
210
+    <ul class="nav nav-pills nav-stacked">
211
+      <li><a href="#arguments">Arguments</a></li>
212
+      
213
+      <li><a href="#value">Value</a></li>
214
+
215
+      <li><a href="#see-also">See also</a></li>
216
+      
217
+      <li><a href="#examples">Examples</a></li>
218
+    </ul>
219
+
220
+  </div>
221
+</div>
222
+
223
+      <footer>
224
+      <div class="copyright">
225
+  <p>Developed by Andrew McDavid, Yu Gu.</p>
226
+</div>
227
+
228
+<div class="pkgdown">
229
+  <p>Site built with <a href="https://pkgdown.r-lib.org/">pkgdown</a> 1.3.0.</p>
230
+</div>
231
+      </footer>
232
+   </div>
233
+
234
+  
235
+
236
+  </body>
237
+</html>
238
+
... ...
@@ -6,7 +6,7 @@
6 6
 <meta http-equiv="X-UA-Compatible" content="IE=edge">
7 7
 <meta name="viewport" content="width=device-width, initial-scale=1.0">
8 8
 
9
-<title>Creat a method for ContigCellDB object to access its slots — $,ContigCellDB-method • CellaRepertorium</title>
9
+<title>Access public members of ContigCellDB object — $,ContigCellDB-method • CellaRepertorium</title>
10 10
 
11 11
 <!-- jquery -->
12 12
 <script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js" integrity="sha256-FgpCb/KJQlLNfOu91ta32o/NMZxltwRo8QtmkMRdAu8=" crossorigin="anonymous"></script>
... ...
@@ -30,9 +30,9 @@
30 30
 
31 31
 
32 32
 
33
-<meta property="og:title" content="Creat a method for ContigCellDB object to access its slots — $,ContigCellDB-method" />
33
+<meta property="og:title" content="Access public members of ContigCellDB object — $,ContigCellDB-method" />
34 34
 
35
-<meta property="og:description" content="Creat a method for ContigCellDB object to access its slots" />
35
+<meta property="og:description" content="Access public members of ContigCellDB object" />
36 36
 <meta name="twitter:card" content="summary" />
37 37
 
38 38
 
... ...
@@ -114,14 +114,14 @@
114 114
 <div class="row">
115 115
   <div class="col-md-9 contents">
116 116
     <div class="page-header">
117
-    <h1>Creat a method for ContigCellDB object to access its slots</h1>
117
+    <h1>Access public members of ContigCellDB object</h1>
118 118
     <small class="dont-index">Source: <a href='https://github.com/amcdavid/CellaRepertorium/blob/master/R/ContigCellDB-methods.R'><code>R/ContigCellDB-methods.R</code></a></small>
119 119
     <div class="hidden name"><code>cash-ContigCellDB-method.Rd</code></div>
120 120
     </div>
121 121
 
122 122
     <div class="ref-description">
123 123
     
124
-    <p>Creat a method for ContigCellDB object to access its slots</p>
124
+    <p>Access public members of ContigCellDB object</p>
125 125
     
126 126
     </div>
127 127
 
... ...
@@ -143,7 +143,7 @@ $(x, name)</pre>
143 143
     
144 144
     <h2 class="hasAnchor" id="value"><a class="anchor" href="#value"></a>Value</h2>
145 145
 
146
-    <p>Slots of ContigCellDB</p>
146
+    <p>Slot of ContigCellDB</p>
147 147
     
148 148
 
149 149
     <h2 class="hasAnchor" id="examples"><a class="anchor" href="#examples"></a>Examples</h2>
... ...
@@ -6,7 +6,7 @@
6 6
 <meta http-equiv="X-UA-Compatible" content="IE=edge">
7 7
 <meta name="viewport" content="width=device-width, initial-scale=1.0">
8 8
 
9
-<title>Create a function of ContigCellDB object to replace values of its slots — $&lt;-,ContigCellDB-method • CellaRepertorium</title>
9
+<title>Access public members of ContigCellDB object — $&lt;-,ContigCellDB-method • CellaRepertorium</title>
10 10
 
11 11
 <!-- jquery -->
12 12
 <script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js" integrity="sha256-FgpCb/KJQlLNfOu91ta32o/NMZxltwRo8QtmkMRdAu8=" crossorigin="anonymous"></script>
... ...
@@ -30,9 +30,9 @@
30 30
 
31 31
 
32 32
 
33
-<meta property="og:title" content="Create a function of ContigCellDB object to replace values of its slots — $&lt;-,ContigCellDB-method" />
33
+<meta property="og:title" content="Access public members of ContigCellDB object — $&lt;-,ContigCellDB-method" />
34 34
 
35
-<meta property="og:description" content="Create a function of ContigCellDB object to replace values of its slots" />
35
+<meta property="og:description" content="Access public members of ContigCellDB object" />
36 36
 <meta name="twitter:card" content="summary" />
37 37
 
38 38
 
... ...
@@ -114,14 +114,14 @@
114 114
 <div class="row">
115 115
   <div class="col-md-9 contents">
116 116
     <div class="page-header">
117
-    <h1>Create a function of ContigCellDB object to replace values of its slots</h1>
117
+    <h1>Access public members of ContigCellDB object</h1>
118 118
     <small class="dont-index">Source: <a href='https://github.com/amcdavid/CellaRepertorium/blob/master/R/ContigCellDB-methods.R'><code>R/ContigCellDB-methods.R</code></a></small>
119 119
     <div class="hidden name"><code>cash-set-ContigCellDB-method.Rd</code></div>
120 120
     </div>
121 121
 
122 122
     <div class="ref-description">
123 123
     
124
-    <p>Create a function of ContigCellDB object to replace values of its slots</p>
124
+    <p>Access public members of ContigCellDB object</p>
125 125
     
126 126
     </div>
127 127
 
128 128
new file mode 100644
... ...
@@ -0,0 +1,198 @@
1
+<!-- Generated by pkgdown: do not edit by hand -->
2
+<!DOCTYPE html>
3
+<html lang="en">
4
+  <head>
5
+  <meta charset="utf-8">
6
+<meta http-equiv="X-UA-Compatible" content="IE=edge">
7
+<meta name="viewport" content="width=device-width, initial-scale=1.0">
8
+
9
+<title>Cluster contigs by germline properties — cluster_germline • CellaRepertorium</title>
10
+
11
+<!-- jquery -->
12
+<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/3.3.1/jquery.min.js" integrity="sha256-FgpCb/KJQlLNfOu91ta32o/NMZxltwRo8QtmkMRdAu8=" crossorigin="anonymous"></script>
13
+<!-- Bootstrap -->
14
+
15
+<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/3.3.7/css/bootstrap.min.css" integrity="sha256-916EbMg70RQy9LHiGkXzG8hSg9EdNy97GazNG/aiY1w=" crossorigin="anonymous" />
16
+<script src="https://cdnjs.cloudflare.com/ajax/libs/twitter-bootstrap/3.3.7/js/bootstrap.min.js" integrity="sha256-U5ZEeKfGNOja007MMD3YBI0A3OSZOQbeG6z2f2Y0hu8=" crossorigin="anonymous"></script>
17
+
18
+<!-- Font Awesome icons -->
19
+<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/font-awesome/4.7.0/css/font-awesome.min.css" integrity="sha256-eZrrJcwDc/3uDhsdt61sL2oOBY362qM3lon1gyExkL0=" crossorigin="anonymous" />
20
+
21
+<!-- clipboard.js -->
22
+<script src="https://cdnjs.cloudflare.com/ajax/libs/clipboard.js/2.0.4/clipboard.min.js" integrity="sha256-FiZwavyI2V6+EXO1U+xzLG3IKldpiTFf3153ea9zikQ=" crossorigin="anonymous"></script>
23
+
24
+<!-- sticky kit -->
25
+<script src="https://cdnjs.cloudflare.com/ajax/libs/sticky-kit/1.1.3/sticky-kit.min.js" integrity="sha256-c4Rlo1ZozqTPE2RLuvbusY3+SU1pQaJC0TjuhygMipw=" crossorigin="anonymous"></script>
26
+
27
+<!-- pkgdown -->
28
+<link href="../pkgdown.css" rel="stylesheet">
29
+<script src="../pkgdown.js"></script>
30
+
31
+
32
+
33
+<meta property="og:title" content="Cluster contigs by germline properties — cluster_germline" />
34
+
35
+<meta property="og:description" content="Cluster contigs by germline properties" />
36
+<meta name="twitter:card" content="summary" />
37
+
38
+
39
+
40
+<!-- mathjax -->
41
+<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/MathJax.js" integrity="sha256-nvJJv9wWKEm88qvoQl9ekL2J+k/RWIsaSScxxlsrv8k=" crossorigin="anonymous"></script>
42
+<script src="https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.5/config/TeX-AMS-MML_HTMLorMML.js" integrity="sha256-84DKXVJXs0/F8OTMzX4UR909+jtl4G7SPypPavF+GfA=" crossorigin="anonymous"></script>
43
+
44
+<!--[if lt IE 9]>
45
+<script src="https://oss.maxcdn.com/html5shiv/3.7.3/html5shiv.min.js"></script>
46
+<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
47
+<![endif]-->
48
+
49
+
50
+  </head>
51
+
52
+  <body>
53
+    <div class="container template-reference-topic">
54
+      <header>
55
+      <div class="navbar navbar-default navbar-fixed-top" role="navigation">
56
+  <div class="container">
57
+    <div class="navbar-header">
58
+      <button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#navbar" aria-expanded="false">
59
+        <span class="sr-only">Toggle navigation</span>
60
+        <span class="icon-bar"></span>
61
+        <span class="icon-bar"></span>
62
+        <span class="icon-bar"></span>
63
+      </button>
64
+      <span class="navbar-brand">
65
+        <a class="navbar-link" href="../index.html">CellaRepertorium</a>
66
+        <span class="version label label-default" data-toggle="tooltip" data-placement="bottom" title="Released version">0.3.1</span>
67
+      </span>
68
+    </div>
69
+
70
+    <div id="navbar" class="navbar-collapse collapse">
71
+      <ul class="nav navbar-nav">
72
+        <li>
73
+  <a href="../index.html">
74
+    <span class="fa fa-home fa-lg"></span>
75
+     
76
+  </a>
77
+</li>
78
+<li>
79
+  <a href="../reference/index.html">Reference</a>
80
+</li>
81
+<li class="dropdown">
82
+  <a href="#" class="dropdown-toggle" data-toggle="dropdown" role="button" aria-expanded="false">
83
+    Articles
84
+     
85
+    <span class="caret"></span>
86
+  </a>
87
+  <ul class="dropdown-menu" role="menu">
88
+    <li>
89
+      <a href="../articles/cdr3_clustering.html">Clustering repertoire via CDR3 sequences</a>
90
+    </li>
91
+    <li>
92
+      <a href="../articles/mouse_tcell_qc.html">Quality control and Exploration of UMI-based repertoire data</a>
93
+    </li>
94
+  </ul>
95
+</li>
96
+      </ul>
97
+      
98
+      <ul class="nav navbar-nav navbar-right">
99
+        <li>
100
+  <a href="https://github.com/amcdavid/CellaRepertorium">
101
+    <span class="fa fa-github fa-lg"></span>
102
+     
103
+  </a>
104
+</li>
105
+      </ul>
106
+      
107
+    </div><!--/.nav-collapse -->
108
+  </div><!--/.container -->
109
+</div><!--/.navbar -->
110
+
111
+      
112
+      </header>
113
+
114
+<div class="row">
115
+  <div class="col-md-9 contents">
116
+    <div class="page-header">
117
+    <h1>Cluster contigs by germline properties</h1>
118
+    <small class="dont-index">Source: <a href='https://github.com/amcdavid/CellaRepertorium/blob/master/R/clustering-methods.R'><code>R/clustering-methods.R</code></a></small>
119
+    <div class="hidden name"><code>cluster_germline.Rd</code></div>
120
+    </div>
121
+
122
+    <div class="ref-description">
123
+    
124
+    <p>Cluster contigs by germline properties</p>
125
+    
126
+    </div>
127
+
128
+    <pre class="usage"><span class='fu'>cluster_germline</span>(<span class='no'>ccdb</span>, <span class='kw'>segment_keys</span> <span class='kw'>=</span> <span class='fu'><a href='https://www.rdocumentation.org/packages/base/topics/c'>c</a></span>(<span class='st'>"v_gene"</span>, <span class='st'>"j_gene"</span>, <span class='st'>"chain"</span>),
129
+  <span class='kw'>cluster_name</span> <span class='kw'>=</span> <span class='st'>"cluster_idx"</span>)</pre>
130
+    
131
+    <h2 class="hasAnchor" id="arguments"><a class="anchor" href="#arguments"></a>Arguments</h2>
132
+    <table class="ref-arguments">
133
+    <colgroup><col class="name" /><col class="desc" /></colgroup>
134
+    <tr>
135
+      <th>ccdb</th>
136
+      <td><p>`ContigCellDB`</p></td>
137
+    </tr>
138
+    <tr>
139
+      <th>segment_keys</th>
140
+      <td><p>fields in `contig_tbl` that identify a cluster</p></td>
141
+    </tr>
142
+    <tr>
143
+      <th>cluster_name</th>
144
+      <td><p>name of cluster to be added to `cluster_tbl`</p></td>
145
+    </tr>
146
+    </table>
147
+    
148
+    <h2 class="hasAnchor" id="value"><a class="anchor" href="#value"></a>Value</h2>
149
+
150
+    <p>`ContigCellDB`</p>
151
+    
152
+
153
+    <h2 class="hasAnchor" id="examples"><a class="anchor" href="#examples"></a>Examples</h2>
154
+    <pre class="examples"><div class='input'><span class='no'>ccdb_ex</span> <span class='kw'>=</span> <span class='fu'>cluster_germline</span>(<span class='no'>ccdb_ex</span>)
155
+<span class='no'>ccdb_ex</span>$<span class='no'>cluster_tbl</span></div><div class='output co'>#&gt; <span style='color: #555555;'># A tibble: 707 x 4</span><span>
156
+#&gt;    cluster_idx v_gene j_gene chain
157
+#&gt;          </span><span style='color: #555555;font-style: italic;'>&lt;int&gt;</span><span> </span><span style='color: #555555;font-style: italic;'>&lt;chr&gt;</span><span>  </span><span style='color: #555555;font-style: italic;'>&lt;chr&gt;</span><span>  </span><span style='color: #555555;font-style: italic;'>&lt;chr&gt;</span><span>
158
+#&gt; </span><span style='color: #555555;'> 1</span><span>           1 TRAV1  TRAJ15 TRA  
159
+#&gt; </span><span style='color: #555555;'> 2</span><span>           2 TRAV1  TRAJ18 TRA  
160
+#&gt; </span><span style='color: #555555;'> 3</span><span>           3 TRAV1  TRAJ22 TRA  
161
+#&gt; </span><span style='color: #555555;'> 4</span><span>           4 TRAV1  TRAJ24 TRA  
162
+#&gt; </span><span style='color: #555555;'> 5</span><span>           5 TRAV1  TRAJ26 TRA  
163
+#&gt; </span><span style='color: #555555;'> 6</span><span>           6 TRAV1  TRAJ33 TRA