---
output: github_document
---

```{r, include = FALSE}
knitr::opts_chunk$set(
    collapse = TRUE,
    comment = "#>",
    cache = TRUE,
    out.width = "100%"
)
options(tibble.print_min = 5, tibble.print_max = 5)
```

# cBioPortalData

<!-- start badges here -->
[](https://bioconductor.org/checkResults/release/bioc-LATEST/cBioPortalData)
[](https://www.bioconductor.org/packages/release/bioc/html/cBioPortalData.html#archives)
[](https://bioconductor.org/packages/stats/bioc/cBioPortalData/)
<!-- end badges here -->

## cBioPortal data and MultiAssayExperiment

### Overview

The `cBioPortalData` R package aims to import cBioPortal datasets as
[MultiAssayExperiment][1] objects into Bioconductor. Some of the features of
the package include:

1. The use of the `MultiAssayExperiment` integrative container for coordinating
and representing the data.
2. The data container explicitly links all assays to the patient
clinical/pathological data.
3. With a [flexible API][2], `MultiAssayExperiment` provides harmonized
subsetting and reshaping into convenient wide and long formats.
4. The package provides datasets from both the API and the saved packaged data.
5. It also provides automatic local caching, thanks to [BiocFileCache][3].

[1]: http://bioconductor.org/packages/MultiAssayExperiment/
[2]: https://github.com/waldronlab/MultiAssayExperiment/wiki/MultiAssayExperiment-API
[3]: https://bioconductor.org/packages/BiocFileCache/

## MultiAssayExperiment Cheatsheet

<a href="https://github.com/waldronlab/cheatsheets/blob/master/MultiAssayExperiment_QuickRef.pdf">
<img src="https://raw.githubusercontent.com/waldronlab/cheatsheets/master/pngs/MultiAssayExperiment_QuickRef.png" width="989" height="1091"/>
</a>

## Quick Start

### Installation

To install from Bioconductor (recommended for most users; this installs the
release or development version corresponding to your version of Bioconductor):

```{r,eval=FALSE}
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("cBioPortalData")
```

To install from GitHub (the bleeding-edge version; this is not generally
necessary because changes here are also pushed to
[bioc-devel](https://www.bioconductor.org/developers/how-to/useDevel/), and
installing the development version generally requires running bioc-devel):

```{r,eval=FALSE}
if (!require("cBioPortalData", quietly = TRUE))
    BiocManager::install("waldronlab/cBioPortalData")
```

To load the package:

```{r,include=TRUE,results="hide",message=FALSE,warning=FALSE}
library(cBioPortalData)
```

```{r,results="hide",include=FALSE}
cbio <- cBioPortal()
studies <- getStudies(cbio, buildReport = TRUE)
st <- studies
packpct <- round(prop.table(table(st$pack_build))[[2]], 2) * 100
apipct <- round(prop.table(table(st$api_build))[[2]], 2) * 100
```

## Note

`cBioPortalData` is a work in progress due to changes in data curation and
the cBioPortal API specification. Users can view the `data(studiesTable)`
dataset to get an overview of the studies that are available and currently
building as `MultiAssayExperiment` representations.
About `r apipct`% of the studies available via the API (`api_build`) and
`r packpct`% of the packaged studies (`pack_build`) are building; these include
additional datasets that were not previously available. Feel free to file an
issue to request prioritization of fixing any of the remaining datasets.

```{r,eval=FALSE}
cbio <- cBioPortal()
studies <- getStudies(cbio, buildReport = TRUE)
```

```{r}
table(studies$api_build)
table(studies$pack_build)
```

### API Service

This service provides flexible and granular access to cBioPortal data from
`cbioportal.org/api`. This option is best used with a particular gene panel of
interest. It allows users to download sections of the data with molecular
profile and gene panel combinations within a study.

```{r,warning=FALSE,message=FALSE,include=TRUE,results="hide"}
gbm <- cBioPortalData(
    api = cbio, by = "hugoGeneSymbol", studyId = "gbm_tcga",
    genePanelId = "IMPACT341",
    molecularProfileIds = c("gbm_tcga_rppa", "gbm_tcga_mrna")
)
```

```{r}
gbm
```

### Packaged Data Service

This function downloads a dataset from the `cbioportal.org/datasets` website
as a packaged tarball and serves it to users as a `MultiAssayExperiment`
object. This option is good for users who are interested in obtaining all the
data for a particular study.

```{r,warning=FALSE,message=FALSE,include=TRUE,results="hide"}
acc <- cBioDataPack("acc_tcga")
```

```{r}
acc
```
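
As a minimal sketch of the harmonized subsetting mentioned in the overview,
the `acc_tcga` study can be inspected and subset with accessors from
`MultiAssayExperiment`. The object names `first_assay` and `first_patients`
are illustrative only, and the chunk assumes that the study contains at least
five patients; which assays are present depends on the study.

```{r,eval=FALSE}
library(cBioPortalData)
library(MultiAssayExperiment)

## repeated here so that the chunk runs on its own; the download is served
## from the local cache if the study was retrieved before
acc <- cBioDataPack("acc_tcga")

## names of the assays (experiments) held in the container
names(experiments(acc))

## patient-level clinical data that all assays are linked to
colData(acc)

## keep only the first assay; the clinical data stays attached
first_assay <- acc[, , 1]

## keep only the first five patients across all assays
## (assumes the study has at least five patients)
first_patients <- acc[, 1:5, ]
```

The bracket method follows the `[rows, patients, assays]` convention, so each
position can be left empty to keep everything along that dimension.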