Significant Gene-Tumor Pairs Identified by MutPanning


Download Table

Description

This table displays the significant gene-tumor pairs (i.e., associations between tumor types and significantly mutated genes) identified by MutPanning. For each gene-tumor pair, the table lists the gene name (HUGO nomenclature), the cancer type in which the gene was found to be significantly mutated, its mutation frequency in the respective tumor type, and the false-discovery rate (q-value) returned by MutPanning. Further, we benchmarked our results against previous studies and datasets. For each gene-tumor pair, we checked whether it represented a canonical cancer gene in the COSMIC Cancer Gene Census, whether it was supported in the same tumor type in the literature, or whether it was identified as significantly mutated for the same tumor type in previous computational studies.

In the left sidebar, you can select the tumor types for which you would like to display the results. Some sequencing studies did not report synonymous mutations (6.1% of the samples). Cancer types that contain samples without synonymous mutations are marked by asterisks (*). Further, you can choose between different FDR thresholds that determine which results are included in the table. Additional information on the genes is available, including the q-values of an established recurrence-based approach and the p-values resulting from the different criteria used by MutPanning. This information can be displayed by selecting more columns in the left sidebar. Use the Ctrl and Shift keys to select multiple columns.


Symbols

Gene reflects a canonical cancer gene in the COSMIC Cancer Gene Census

Previous studies in the literature support this gene in the same tumor type

Gene reported as significantly mutated in the TCGA marker papers

Gene reported as significantly mutated in the Bailey study (Bailey et al. Cell 2018)

Gene reported as significantly mutated on tumorportal.org (Lawrence et al. Nature 2014)

Gene identified using the dNdSCV tool (Martincorena et al. Cell 2017)

Significantly Mutated Genes Identified by MutPanning


Download Table

Description

This table displays the genes identified as significantly mutated by MutPanning. The table lists the name of each significant gene (HUGO nomenclature), the number of cancer types in which the gene was found to be significantly mutated, its maximal mutation frequency across the significantly mutated tumor types, and the best (i.e., minimal) false-discovery rate (q-value) returned by MutPanning. Further, we benchmarked our results against previous studies and datasets. For each significantly mutated gene, we checked whether it represented a canonical cancer gene in the COSMIC Cancer Gene Census, whether it was implicated in cancer in the literature, or whether it was identified as significantly mutated for any tumor type in previous computational studies.

In the left sidebar, you can select the tumor types for which you would like to display the results. Some sequencing studies did not report synonymous mutations (6.1% of the samples). Cancer types that contain samples without synonymous mutations are marked by asterisks (*). Further, you can choose between different FDR thresholds that determine which results are included in the table. Additional information on the genes is available, including the names of the tumor entities in which each gene was significantly mutated, the best q-value of an established recurrence-based approach, and the best p-values resulting from the different criteria used by MutPanning. This information can be displayed by selecting more columns in the left sidebar. Use the Ctrl and Shift keys to select multiple columns.


Symbols

Gene reflects a canonical cancer gene in the COSMIC Cancer Gene Census

Previous studies in the literature support this gene in the same tumor type

Gene reported as significantly mutated in the TCGA marker papers

Gene reported as significantly mutated in the Bailey study (Bailey et al. Cell 2018)

Gene reported as significantly mutated on tumorportal.org (Lawrence et al. Nature 2014)

Gene identified using the dNdSCV tool (Martincorena et al. Cell 2017)

Gene identified using a recurrence-based approach (Lawrence et al. Nature 2013)

Background

MutPanning is designed to detect rare cancer driver genes from aggregated whole-exome sequencing data. Most approaches detect cancer genes based on their mutational excess, i.e. they search for genes with an increased number of nonsynonymous mutations above the background mutation rate. MutPanning further accounts for the nucleotide context around mutations and searches for genes with an excess of mutations in unusual sequence contexts that deviate from the characteristic sequence context around passenger mutations.

The name MutPanning is inspired by the words "mutation" and "panning". The goal of the MutPanning algorithm is to discover new tumor genes in aggregated sequencing data, i.e. to "pan" the few tumor-relevant driver mutations from the abundance of functionally neutral passenger mutations in the background. Previous approaches for cancer gene discovery were mostly based on mutational recurrence, i.e. they detected cancer genes based on their excess of nonsynonymous mutations above the local background mutation rate. Further, they searched for mutations that occur in functionally important genomic positions, as predicted by bioinformatic scores. These approaches are highly effective in tumor types for which the average background mutation rate (i.e., the total mutational burden) is low or moderate.

The ability to detect driver genes can be increased by considering the nucleotide context around mutations in the statistical model. MutPanning utilizes the observation that most passenger mutations are surrounded by characteristic nucleotide sequence contexts, reflecting the background mutational process active in a given tumor. In contrast, driver mutations are localized towards functionally important positions, which are not necessarily surrounded by the same nucleotide contexts as passenger mutations. Hence, in addition to mutational excess, MutPanning searches for genes with an excess of mutations in unusual sequence contexts that deviate from the characteristic sequence context around passenger mutations. That way, MutPanning actively suppresses mutations in its test statistics that are likely to be passenger mutations based on their surrounding nucleotide contexts. Considering the nucleotide context is particularly useful in tumor types with high background mutation rates and high nucleotide context specificity (e.g., melanoma, bladder, endometrial, or colorectal cancer).
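To make the notion of "nucleotide context" concrete, the following minimal sketch tallies the sequence context around a handful of single-base substitutions. The function name, the toy reference sequence, and the window of one flanking base on each side are all assumptions made for illustration; this is not MutPanning's actual context model, whose window size is not stated here.

```python
from collections import Counter

def context_counts(sequence, mutations, flank=1):
    """Count (context, alt_base) pairs for single-base substitutions.

    sequence  -- reference sequence as a string (toy input)
    mutations -- iterable of (position, alt_base) with 0-based positions
    flank     -- flanking bases on each side (assumed value, illustration only)
    """
    counts = Counter()
    for pos, alt in mutations:
        # Skip positions too close to the ends to have a full context.
        if pos - flank < 0 or pos + flank >= len(sequence):
            continue
        context = sequence[pos - flank : pos + flank + 1]
        counts[(context, alt)] += 1
    return counts

# Toy data: three mutations share the TCG context, one falls in AAC.
reference = "ATCGATCGTTCGAACC"
observed = [(2, "T"), (6, "T"), (10, "T"), (13, "G")]
counts = context_counts(reference, observed)
```

In this toy example, the three mutations in the shared context would resemble the background process, whereas the single mutation in the other context is the kind of event that a context-aware test would weight more heavily.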

Algorithm

Most passenger mutations occur in characteristic nucleotide contexts that reflect the mutational process active in a given tumor. MutPanning searches for mutations in “unusual” nucleotide contexts that deviate from this background mutational process. Passenger mutations are rare in these positions, so mutations observed there are a strong indicator of the shift of driver mutations towards functionally important positions.

The main steps of MutPanning are as follows (adapted from Dietlein et al.):

(i) Model the mutation probability of each genomic position in the human exome depending on its surrounding nucleotide context and the regional background mutation rate.

(ii) Given a gene with n nonsynonymous mutations, use a Monte Carlo simulation approach to simulate a large number of random “scenarios” in which n or more nonsynonymous mutations are randomly distributed along the same gene.

(iii) Compare the number and positions of mutations in each random scenario with the observed mutations in that gene. Based on these comparisons, derive a p-value for the gene.

(iv) Combine this p-value with additional statistical components that account for insertions and deletions, the abundance of deleterious mutations, and mutational clustering.
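Steps (i) to (iii) can be illustrated with a toy Monte Carlo sketch. It assumes that per-position background probabilities are already available (all strictly positive), places exactly n mutations per scenario rather than "n or more", and uses the summed negative log-probability of the hit positions as a stand-in test statistic. These choices, and all names in the code, are assumptions for illustration and not the statistic used by MutPanning.

```python
import math
import random

def monte_carlo_p_value(position_probs, observed_positions, n_sim=10000, seed=0):
    """Estimate how unusual the observed mutation positions are.

    position_probs     -- relative background mutation probability per position
                          (hypothetical stand-in for the model of step (i))
    observed_positions -- indices of the observed nonsynonymous mutations
    """
    rng = random.Random(seed)
    total = sum(position_probs)
    # "Surprise" of a single hit: -log of its normalized background probability.
    cost = [-math.log(p / total) for p in position_probs]
    positions = list(range(len(position_probs)))

    observed = sum(cost[i] for i in observed_positions)
    n = len(observed_positions)
    extreme = 0
    for _ in range(n_sim):
        # Step (ii): distribute n mutations according to the background model.
        scenario = rng.choices(positions, weights=position_probs, k=n)
        # Step (iii): compare the random scenario with the observation.
        if sum(cost[i] for i in scenario) >= observed:
            extreme += 1
    # Add-one correction so the estimate is never exactly zero.
    return (extreme + 1) / (n_sim + 1)

# Toy gene with three likely and two unlikely positions.
background = [0.30, 0.30, 0.30, 0.05, 0.05]
p_rare = monte_carlo_p_value(background, [3, 4, 3])    # hits on unlikely positions
p_common = monte_carlo_p_value(background, [0, 1, 2])  # hits on likely positions
```

Mutations that pile up on positions the background model considers unlikely yield a small estimated p-value, while mutations on typical positions do not.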

The following figure (adapted from Dietlein et al.) illustrates how MutPanning works.

Running MutPanning

You have two options to run MutPanning. First, you can run MutPanning as a module on the GenePattern platform (genepattern.org). Registration with GenePattern is free, and you can conveniently launch MutPanning from your web browser and add files via drag and drop.

Second, we provide an interactive desktop version of MutPanning, which allows you to run MutPanning on your own computer (at least 8 GB memory required) and add files through a dialog window.

Input Files

A detailed description of the input file formats of MutPanning can be found in the MutPanning Software User Manual, which can be downloaded from the Downloads section. In brief, MutPanning needs the following two input files.

1. A Mutation Annotation Format (MAF) file that lists the positions of all somatic mutations.

Each row in this file corresponds to an individual mutation that was detected in one of the tumor samples. This file should have the following columns: Hugo_Symbol, Chromosome, Start_Position, End_Position, Strand, Variant_Classification, Variant_Type, Reference_Allele, Tumor_Seq_Allele1, Tumor_Seq_Allele2, Tumor_Sample_Barcode.

2. A sample annotation file that contains the unique identifiers of all samples and annotates their cancer type.

Each row in this file corresponds to an individual sample. This file should have the following columns: ID, Sample, Cohort
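Before launching a run, it can help to check that both files carry the expected column headers. The sketch below is a hypothetical helper, not part of the MutPanning software. It assumes that both files are tab-delimited with the header in the first line, and it uses the standard MAF spelling Variant_Classification. The sample rows are toy data.

```python
import csv
import io

MAF_COLUMNS = [
    "Hugo_Symbol", "Chromosome", "Start_Position", "End_Position", "Strand",
    "Variant_Classification",  # standard MAF column name
    "Variant_Type", "Reference_Allele",
    "Tumor_Seq_Allele1", "Tumor_Seq_Allele2", "Tumor_Sample_Barcode",
]
SAMPLE_COLUMNS = ["ID", "Sample", "Cohort"]

def missing_columns(handle, required):
    """Return the required column names absent from a tab-delimited header."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader, [])  # empty file -> every column is reported missing
    return [name for name in required if name not in header]

# Toy sample annotation file with a complete header.
good_samples = io.StringIO("ID\tSample\tCohort\n0\tPatient_1\tSkin\n")
missing_in_good = missing_columns(good_samples, SAMPLE_COLUMNS)

# Toy MAF header that lacks the Strand column.
bad_maf = io.StringIO("\t".join(c for c in MAF_COLUMNS if c != "Strand") + "\n")
missing_in_bad = missing_columns(bad_maf, MAF_COLUMNS)
```

For real files, pass an open file handle (for example `open(path, newline="")`) instead of the in-memory strings used here.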

Output Files

The output of MutPanning is a zip folder that contains a tab-delimited significance file for each cancer type contained in the sample annotation file. Each of these files lists the significance of each gene in the respective tumor type. Further, the zip folder contains a significance file for the pan-cancer cohort that lists the significance of each gene across the entire cohort. Each row in these files annotates the significance of a gene. These files contain the following columns:

gene: gene name in HUGO nomenclature

target_n: nonsynonymous target size

target_s: synonymous target size

count_n: nonsynonymous mutation count

count_s: synonymous mutation count

p: p-value derived by the MutPanning method

q: q-value (false-discovery rate) derived by correcting the p-value for multiple hypothesis testing.
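A significance file with the columns above can be filtered with a few lines of Python. This is a minimal sketch that assumes the header names match the column names listed above exactly. The function name, the 0.25 default threshold, and the gene names and numbers in the toy table are placeholders for illustration, not results.

```python
import csv
import io

def significant_genes(handle, q_threshold=0.25):
    """Return (gene, q) pairs with q below the threshold, sorted by q."""
    reader = csv.DictReader(handle, delimiter="\t")
    hits = [
        (row["gene"], float(row["q"]))
        for row in reader
        if float(row["q"]) < q_threshold
    ]
    return sorted(hits, key=lambda pair: pair[1])

# Toy significance table (made-up values, placeholder gene names).
toy_table = io.StringIO(
    "gene\ttarget_n\ttarget_s\tcount_n\tcount_s\tp\tq\n"
    "GENE_A\t1200\t400\t35\t3\t1e-9\t2e-6\n"
    "GENE_B\t900\t300\t4\t2\t0.4\t0.8\n"
    "GENE_C\t1500\t500\t12\t1\t1e-4\t0.01\n"
)
hits = significant_genes(toy_table)
```

The threshold is a parameter because different FDR cutoffs may be appropriate depending on how conservative the gene list should be.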

MutPanning Software

MutPanning is distributed under the BSD-3-Clause open source license. By downloading the MutPanning software, you acknowledge and agree to the terms of this license.

Download MutPanning for MacOS

Download MutPanning for Windows

Download MutPanning as a JAR file

If you prefer not to download the software, you can use MutPanning as a publicly available module on the GenePattern server. After free registration, you can run MutPanning as an interactive module on genepattern.org from your browser.

Sequencing Datasets

The sequencing datasets from our study can be downloaded below. The sequencing data in the MAF file were compiled from different sequencing studies. Differences in tissue collection protocols, variant calling pipelines, and mutation reports (e.g., synonymous mutations were not reported in 6.1% of the samples) may represent a potential source of heterogeneity. Please keep this limitation in mind when using this dataset.

The full whole-exome sequencing dataset from 11,873 tumors that we used in the study can be downloaded below. MutPanning needs approx. 2 hrs 10 min to run on these data.

Download Full Dataset

If you would like to test MutPanning, you can use the subset of 582 melanoma samples below. MutPanning needs approx. 20 min to run on these data.

Download Melanoma Subset

Even if you do not intend to use these data for testing purposes, we encourage you to download them in order to familiarize yourself with the format and prepare your own data accordingly. Detailed instructions on how to prepare your data for MutPanning are provided in the MutPanning software dialog.

Using MutPanning?

If you use MutPanning or any of the tumor genes shown on this website, please cite our preprint on biorxiv.org while the MutPanning manuscript is under review. We will update this reference as soon as the MutPanning manuscript has been accepted for publication in a scientific journal.

Contact

If you have any questions, suggestions or feedback, we would love to hear them. Further, if you need any technical assistance with the MutPanning software, the sequencing dataset, or the findings reported on this website, we are here to help. Please contact us by email at help@cancer-genes.org.

Disclaimer

The information and software on this website are for general information purposes only. While we endeavour to keep the information up to date and correct, we make no representations or warranties of any kind, express or implied, about the completeness, accuracy, reliability, suitability or availability with respect to the website or the information, products, services, or related graphics contained on the website for any purpose. Any reliance you place on such information is therefore strictly at your own risk.

In no event will we be liable for any loss or damage including without limitation, indirect or consequential loss or damage, or any loss or damage whatsoever arising from loss of data or profits arising out of, or in connection with, the use of this website or software. The information and software provided on this website are not intended for medical or diagnostic purposes.

Through this website you may be able to link to other websites. We have no control over the nature, content and availability of those sites. The inclusion of any links does not necessarily imply a recommendation or endorse the views expressed within them.

Every effort is made to keep the website and software up and running smoothly. However, we take no responsibility for, and will not be liable for, the website being temporarily unavailable due to technical issues beyond our control.
