:Citation: Cokelaer et al. GDSCTools for mining pharmacogenomic interactions in cancer. Bioinformatics, 2017. https://doi.org/10.1093/bioinformatics/btx744
:Note: Developed and tested for Python 2.7, 3.4, and 3.5.
:Note: The GDSCTools library works with Python 2.7 and 3.5, but the standalone pipeline intended to run on a cluster works with Python 3.5 only.
:Contributions: Please join the https://github.com/CancerRxGene/gdsctools project.
:Documentation: On ReadTheDocs
:GitHub: On github

GDSCTools is a free open-source Python library dedicated to the study of drug responses in the context of the GDSC (Genomics of Drug Sensitivity in Cancer) project. The main developer is Thomas Cokelaer (Institut Pasteur), and it is a joint effort of the groups of Mathew Garnett (Sanger Institute) and Julio Saez-Rodriguez (RWTH Aachen & EMBL-EBI).
It contains utilities to find significant associations between drugs and genomic features (e.g., gene mutation) based on an :ref:`ANOVA <anova_partone>` analysis. Other methods, such as multi-factorial linear models based on :ref:`Elastic Net <multivariate_regression>`, are also available. The library is also useful for manipulating dedicated data sets such as :ref:`IC50 <data>` (drug response) or :ref:`MoBEM <omnibem>` (genomic features) data structures. We hope that GDSCTools will serve as a basis for other scientists to develop further methods.
.. index:: installation
GDSCTools is written in Python. If you are a developer or already know the Python ecosystem and the ``pip`` command, type the following command in a :term:`Terminal` to install GDSCTools::

    pip install gdsctools

Add the ``--upgrade`` option to get the latest release. If you are not familiar with Python or the command above, please see the :ref:`Installation` section for further details. We also strongly recommend using Anaconda to install dependencies (e.g., numpy, matplotlib); GDSCTools is available on the bioconda channel::

    conda install gdsctools

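To confirm that the installation succeeded, one option is to query the installed version from Python. The sketch below assumes Python 3.8 or later (``importlib.metadata`` is newer than the Python versions listed above); the helper name ``installed_version`` is ours and is not part of GDSCTools.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package="gdsctools"):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        # The package is not installed in the current environment.
        return None


found = installed_version()
print("gdsctools version:", found if found else "not installed")
```

If the helper returns ``None``, the package is not visible from the Python interpreter you are running, which usually means it was installed into a different environment.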
In the following example, we provide a short Python snippet that uses the GDSCTools library. You can either copy and paste the code into a file and execute it, or type the commands in an :term:`IPython` shell. With this example we investigate the association between the :term:`IC50` of a given drug (across 52 breast cancer cell lines) and a genomic feature (here, TP53 mutation). Drugs are referred to by a unique identifier (here 1047)::

    from gdsctools import ANOVA, ic50_test
    gdsc = ANOVA(ic50_test)
    gdsc.set_cancer_type('breast')
    df = gdsc.anova_one_drug_one_feature(1047, 'TP53_mut', show=True)

.. plot::
    :width: 80%

    from gdsctools import ANOVA, ic50_test
    gdsc = ANOVA(ic50_test)
    gdsc.set_cancer_type('breast')
    df = gdsc.anova_one_drug_one_feature(1047, 'TP53_mut', show=True)

The :attr:`df` object returned in the last statement is a dataframe whose contents are explained in the :ref:`regression` section.
.. seealso:: For more examples and explanations, please visit the :ref:`anova_partone` section.
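To give an intuition for the kind of test behind such an association, the following sketch computes a plain one-way ANOVA F statistic comparing drug response values between mutant and wild-type cell lines. It is a simplification for illustration only and is not the GDSCTools implementation, which may account for additional factors (see the :ref:`anova_partone` section). The function name ``one_way_anova_f`` and the toy values are made up for this example.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups of values."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)

    # Between-group sum of squares: group size times squared mean offset.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: squared deviations from each group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Toy, made-up response values; not taken from any GDSC data set.
mutant = [1.0, 2.0, 3.0]
wild_type = [4.0, 5.0, 6.0]
f_stat = one_way_anova_f([mutant, wild_type])
print(f_stat)
```

A larger F value means that the difference between the group means is large relative to the spread within each group. Turning F into a p-value requires the F distribution with the two degrees of freedom computed above.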
.. index:: warnings
The previous example may be verbose, printing comments and warnings. You can set the ``verbose`` option to ``False`` and ignore warnings as follows::

    import warnings
    from gdsctools import ANOVA, ic50_test

    # Ignore all warnings; the category argument must be a class, not a string
    warnings.simplefilter("ignore")
    gdsc = ANOVA(ic50_test, verbose=False)

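If you prefer not to silence warnings for the whole session, the standard library offers a context manager that restores the previous filters on exit. The sketch below uses a hypothetical function ``noisy_call`` as a stand-in for any call that emits warnings; with GDSCTools you would place the analysis call inside the ``with`` block instead.

```python
import warnings


def noisy_call():
    """Stand-in (hypothetical) for a call that emits a warning."""
    warnings.warn("something looks odd", UserWarning)
    return 42


def call_quietly(func):
    """Run `func` with warnings ignored only for the duration of the call."""
    with warnings.catch_warnings():
        warnings.simplefilter("ignore")
        return func()


result = call_quietly(noisy_call)
print(result)
```

Outside the ``with`` block the warning filters are back to what they were, so warnings raised by later code are still reported.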
.. index:: standalone
Later sections give more examples of how to use GDSCTools for systematic studies. Note, however, that GDSCTools also provides a standalone application called ``gdsctools_anova``, which can be used in a standard :term:`Terminal` and produces the same output as the previous example::

    gdsctools_anova --input-ic50 <ic50 filename> --drug 1047 --feature TP53_mut

If you want to try it, please download this :download:`IC50 example <../gdsctools/data/test_IC50.csv>`, which is required as an input. By default, GDSCTools loads a set of 50 genomic features and 1001 cell lines, but in general you should provide your own genomic feature file (see :ref:`data`). The default data set contains only a small set of genomic features; it can be downloaded here, :download:`GenomicFeature example <../gdsctools/data/genomic_features.tsv.gz>`, and adapted to your needs.

.. seealso:: See the :ref:`standalone` section for more details about the standalone application and the :ref:`data` section to learn more about the expected input data formats.
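The standalone call shown above can also be assembled from Python, for instance when looping over several drugs or features. The following is a minimal sketch: the helper ``build_anova_command`` is hypothetical (not part of GDSCTools), and it only uses the three options that appear in the command line above.

```python
import shlex


def build_anova_command(ic50_path, drug_id, feature):
    """Assemble the gdsctools_anova argument list for one drug and one feature."""
    return [
        "gdsctools_anova",
        "--input-ic50", str(ic50_path),
        "--drug", str(drug_id),
        "--feature", feature,
    ]


cmd = build_anova_command("test_IC50.csv", 1047, "TP53_mut")
# Print a shell-quoted version for copy-pasting or logging.
print(" ".join(shlex.quote(part) for part in cmd))
# To actually run it (requires GDSCTools to be installed):
#   import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when file names contain spaces.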

.. toctree::
    :numbered:
    :maxdepth: 1

    installation.rst
    quickstart.rst
    data.rst
    anova_partone.rst
    anova_parttwo.rst
    html.rst
    data_packages.rst
    omnibem.rst
    regression.rst
    notebooks.rst
    standalone.rst
    auto_examples/index
    releases.rst
    references.rst
    developers.rst

Please file bug reports at https://github.com/CancerRxGene/gdsctools/issues
Please join https://github.com/CancerRxGene/gdsctools

.. toctree::
    :hidden:

    ChangeLog.rst
    faqs
    glossary