# Rcpi
## Introduction
The Rcpi package offers an R/Bioconductor package emphasizing the comprehensive integration of bioinformatics and chemoinformatics into a molecular informatics platform for drug discovery. See `vignette('Rcpi')` for the comprehensive user guide.
## Installation
To install the Rcpi package in R, simply type
source('http://bioconductor.org/biocLite.R')
biocLite('Rcpi')
To make the Rcpi package fully functional (especially the Open Babel related functionalities), we recommend the users also install the _Enhances_ packages by using:
source('http://bioconductor.org/biocLite.R')
biocLite('Rcpi', dependencies = c('Imports', 'Enhances'))
Several dependencies of the Rcpi package may require some system-level libraries, check the corresponding manuals of these packages for detailed installation guides.
## Features
Rcpi implemented and integrated the state-of-the-art protein sequence descriptors and molecular descriptors/fingerprints with R. For protein sequences, the Rcpi package could
* Calculate six protein descriptor groups composed of fourteen types of commonly used structural and physicochemical descriptors that include 9920 descriptors.
* Calculate six types of generalized scales-based descriptors derived by various dimensionality reduction methods for proteochemometric (PCM) modeling.
* Parallellized pairwise similarity computation derived by protein sequence alignment and Gene Ontology (GO) semantic similarity measures within a list of proteins.
For small molecules, the Rcpi package could
* Calculate 307 molecular descriptors (2D/3D), including constitutional, topological, geometrical, and electronic descriptors, etc.
* Calculate more than ten types of molecular fingerprints, including FP4 keys, E-state fingerprints, MACCS keys, etc., and parallelized chemical similarity search.
* Parallelized pairwise similarity computation derived by fingerprints and maximum common substructure search within a list of small molecules.
By combining various types of descriptors for drugs and proteins in different methods, interaction descriptors representing protein-protein or compound-protein interactions could be conveniently generated with Rcpi, including:
* Two types of compound-protein interaction (CPI) descriptors
* Three types of protein-protein interaction (PPI) descriptors
Several useful auxiliary utilities are also shipped with Rcpi:
* Parallelized molecule and protein sequence retrieval from several online databases, like PubChem, ChEMBL, KEGG, DrugBank, UniProt, RCSB PDB, etc.
* Loading molecules stored in SMILES/SDF files and loading protein sequences from FASTA/PDB files
* Molecular file format conversion
The computed protein sequence descriptors, molecular descriptors/fingerprints, interaction descriptors and pairwise similarities are widely used in various research fields relevant to drug disvery, primarily bioinformatics, chemoinformatics, proteochemometrics and chemogenomics.
## Links
* Bioconductor Page: http://bioconductor.org/packages/release/bioc/html/Rcpi.html
* Track Devel: https://github.com/road2stat/Rcpi
* Report Bugs: https://github.com/road2stat/Rcpi/issues
## Contact
The Rcpi package is developed by Computational Biology and Drug Design Group, Central South University, China.
* Nan Xiao <road2stat@gmail.com>
* Dongsheng Cao <oriental-cds@163.com>
* Qingsong Xu <dasongxu@gmail.com>