expression_plot

Keith Ching

[Instructions]

expression_plot : 2D scatter plot of molecular features.

expression_plot renders multiple molecular data types onto a single 2D chart. It is powerful, but it has many options, which are described below.

HUGO : The first gene symbol is plotted on the x-axis and the second gene symbol on the y-axis, e.g. BRAF KRAS
Any genes after the 2nd gene are median-centered and colored blue/red (low/high), and each is assigned a different shape.
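
As an illustration of the median centering described above, here is a minimal Python sketch. It is not the tool's own code: the function name is hypothetical, and treating values exactly at the median as "low" is an assumption.

```python
from statistics import median

def median_center_colors(values):
    """Map each sample to 'red' (above the median) or 'blue' (at or below it).

    values: dict of sample name -> expression value.
    Tie handling (median counts as low) is an assumption of this sketch.
    """
    mid = median(values.values())
    return {
        sample: ("red" if value - mid > 0 else "blue")
        for sample, value in values.items()
    }

# Toy expression values for a 3rd gene; the median of 2, 5, 7, 9 is 6.
expr = {"s1": 2.0, "s2": 5.0, "s3": 9.0, "s4": 7.0}
colors = median_center_colors(expr)
```

With these toy values, s1 and s2 fall below the median and are colored blue, while s3 and s4 are colored red.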

Modifier flags: An unmodified gene symbol refers to gene expression and uses the dataset specified by EXPRESSION.
-c : appending -c to the gene symbol uses the CNV dataset specified by CNV SOURCE, e.g. KRAS KRAS-c plots the gene expression on the x-axis and the CNV level on the y-axis.
-m : appending -m to the gene symbol plots mutation data from MUTATION. Mutation values are not quantitative, so they cannot serve as an axis: an x-axis and a y-axis datatype must be selected first, and mutation data must be in the 3rd or greater position, e.g. KRAS KRAS-c KRAS-m adds mutations to the samples.
-p : appending -p uses the RPPA protein level for an antibody targeting the gene, e.g. esr1 esr1-p. Multiple antibodies to the same gene are plotted according to their order. The color scheme differs from gene expression: darker brown indicates higher protein levels and lighter brown indicates lower levels.

Alternatively, to select a particular antibody, choose an RPPA source and resubmit the form. Another pulldown menu listing the antibodies will then appear.

-l : appending -l to the 3rd gene symbol fills in, with a solid color, the samples that are <= the Percentile cutoff
-h : appending -h to the 3rd gene symbol fills in, with a solid color, the samples that are >= 100 - Percentile cutoff
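
The flag syntax above can be sketched as a small parser. This is only an illustration in Python, not the tool's implementation: the datatype labels, the function names, and the case-sensitive matching of the suffix are all assumptions, and special names such as IC50-NN or microRNA ids are not handled here.

```python
# Suffix-to-datatype mapping taken from the flag list above; the labels on
# the right are hypothetical names chosen for this sketch.
MODIFIERS = {
    "c": "cnv",
    "m": "mutation",
    "p": "rppa",
    "l": "expression_low",
    "h": "expression_high",
}

def parse_token(token):
    """Split one query token into (symbol, datatype)."""
    symbol, sep, flag = token.rpartition("-")
    # Only a lowercase single-letter suffix counts as a flag (assumption).
    if sep and symbol and flag in MODIFIERS:
        return symbol, MODIFIERS[flag]
    return token, "expression"

def parse_query(query):
    """Parse a whitespace-separated query and check the mutation position rule."""
    parsed = [parse_token(token) for token in query.split()]
    for symbol, datatype in parsed[:2]:
        if datatype == "mutation":
            raise ValueError(
                f"{symbol}-m must be in the 3rd or greater position"
            )
    return parsed

layers = parse_query("KRAS KRAS-c KRAS-m")
```

Here `layers` holds KRAS expression for the x-axis, KRAS CNV for the y-axis, and KRAS mutation as an overlay, while a query such as "KRAS-m BRAF" is rejected.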

For -l or -h, a Fisher's exact test table is output comparing the number of samples meeting the criterion against membership in cluster1 vs. cluster2 (only valid when K clusters = 2). The test is useful for examining whether a gene's expression is enriched or depleted between the two groups defined by the x and y axes, e.g. esr1 erbb2 rb1-l in the TCGA-BRCA-RSEM dataset
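
To make the test concrete, the following Python sketch builds the 2x2 table (flagged vs. not flagged, cluster1 vs. cluster2) and computes a two-sided Fisher's exact p-value from the hypergeometric distribution. It is a sketch under assumptions: the function names and input layout are hypothetical, the page does not say whether the tool reports a one-sided or two-sided p-value, and how the percentile threshold itself is computed is left out.

```python
from math import comb

def contingency(flagged, cluster):
    """Build [[flagged in c1, flagged in c2], [other in c1, other in c2]].

    flagged: dict sample -> bool (meets the -l or -h criterion)
    cluster: dict sample -> 1 or 2 (cluster membership)
    """
    table = [[0, 0], [0, 0]]
    for sample, is_flagged in flagged.items():
        table[0 if is_flagged else 1][cluster[sample] - 1] += 1
    return table

def fisher_exact_two_sided(table):
    """Two-sided p-value: sum of all tables as likely or less likely than observed."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x flagged samples in cluster1.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )

flagged = {"a": True, "b": True, "c": True, "d": False, "e": False, "f": False}
cluster = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 2}
table = contingency(flagged, cluster)
p_value = fisher_exact_two_sided(table)
```

In this toy case all three flagged samples sit in cluster1, giving the table [[3, 0], [0, 3]] and a two-sided p-value of 0.1.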

Special names: Instead of a gene symbol, there are a few special names.
IC50-DRUG : plot the IC50 value of drug specified by Compound and Compound Source as an axis. (x-axis if first, y-axis if second)
IC50-NN : use the compound specified by compound ID number (listed in the Compound pulldown), e.g. IC50-1
META-NN : use the meta label specified by meta ID number (listed in the Metavalue pulldown). It must be in the 3rd or greater position unless the meta value is numerical (see the next note). Note: the Metavalue pulldown appears after a Metadata source is selected and the form resubmitted.
Note: numerical META can be plotted on the x or y-axis using META-NN in the first or second position respectively.

MUT-COUNT : plot as an axis (1st or 2nd position) the absolute number of mutations per sample annotated in the database. Certain samples, especially in the TCGA-UCEC dataset, have an extremely high mutational load, so any conclusions about a given mutation's importance should be drawn with this in mind.
CNV-GIN : plot as an axis a measure of genomic instability derived from the CNV data, calculated using the number of segments, size of segments, and absolute amplitude of segments. Normal HapMap samples have very low CNV-GIN.

hsa-xxx-xxxx : microRNA ids start with hsa, e.g. hsa-mir-1200, hsa-mir-103-1-as, hsa-let-7e, hsa-let-7a-1. To plot the microRNA values, choose a miRNA source and enter the identifier. Example plot
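
The special names above follow simple patterns, so a query token can be classified before it is looked up. The Python sketch below is illustrative only: the kind labels and the function name are hypothetical, and the exact matching rules the tool uses (for example case sensitivity) are not stated on this page.

```python
import re

def classify_name(token):
    """Return (kind, value) for a query token; kinds are hypothetical labels."""
    if token == "IC50-DRUG":
        return ("ic50_selected_compound", None)
    match = re.fullmatch(r"IC50-(\d+)", token)
    if match:
        return ("ic50_compound_id", int(match.group(1)))
    match = re.fullmatch(r"META-(\d+)", token)
    if match:
        return ("meta_id", int(match.group(1)))
    if token in ("MUT-COUNT", "CNV-GIN"):
        return ("per_sample_metric", token)
    if token.lower().startswith("hsa-"):
        return ("mirna", token)
    # Anything else is treated as a gene symbol, possibly with a modifier flag.
    return ("gene", token)

kinds = [classify_name(t) for t in ["IC50-1", "META-22696", "hsa-let-7e", "BRAF"]]
```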

Parameters:

Percentile cutoff : the threshold for plotting when using -l or -h (default 5)
K clusters : the plot automatically runs pamclust on the x and y axis data to divide the samples into K groups. (default 2)
CNV amp / del : when plotting CNV in 3rd or greater position, the cutoffs for plotting amplifications or deletions. Focal alterations (<= 10MB) are marked with an 'F'. (default 2 / -2)
IC50 low/high : when plotting a drug in 3rd or greater position, or when Compound / Compound Source selected, the cutoffs for plotting sensitivity / resistance. (default 100 / 500)
short meta label : When plotting meta labels (Metadata selected) only plot the first 3 characters instead of the full label.
color meta label : color each unique meta value with a different color
meta colors : if you want a certain label to be a particular color, specify the list of colors to use for meta labels
exp cutoff low/high : used for the summary table at the end for calculating number of samples exceeding thresholds.
plot sample label : plot the name of the sample on the scatter plot. Most useful when there are few points, or an extreme outlier you want to label.
color clusters : color the points on the plot by their cluster membership as defined by pamclust.
plot legend : plot the legend within the plot. Otherwise the legend is plotted on page 2. If the plot has an empty area, the legend can be rendered there without obscuring the data.
plot position : which corner of the plot to render the legend in.
plot normals : plot the TCGA normal samples in yellow. Normal samples are excluded by default.
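
The CNV amp / del and IC50 low/high cutoffs listed above can be read as simple threshold rules. The Python sketch below shows one plausible reading using the documented defaults; the function names, the returned labels, and the choice of inclusive comparisons are assumptions rather than the tool's confirmed behavior.

```python
def cnv_call(value, amp=2.0, dele=-2.0, segment_mb=None):
    """Classify a CNV value against the amp / del cutoffs (defaults 2 / -2).

    segment_mb is the segment size in megabases, if known; segments of
    10 MB or less are marked focal with a trailing ' F'.
    """
    if value >= amp:
        call = "amp"
    elif value <= dele:
        call = "del"
    else:
        return "neutral"
    if segment_mb is not None and segment_mb <= 10:
        call += " F"
    return call

def ic50_call(ic50, low=100.0, high=500.0):
    """Classify an IC50 value against the low / high cutoffs (defaults 100 / 500)."""
    if ic50 <= low:
        return "sensitive"
    if ic50 >= high:
        return "resistant"
    return "intermediate"

calls = [cnv_call(2.5, segment_mb=4), cnv_call(-3.0, segment_mb=40), cnv_call(0.3)]
responses = [ic50_call(50), ic50_call(250), ic50_call(900)]
```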

plot mutation/cnv barplot : generate an additional series of plots. For any CNV or mutation gene, plot barplots of expression divided by CNV or mutation status and calculate t-tests. Note: plotting multiple CNV or mutation genes at the same time is not recommended for this plot. Choose two expression genes for the x and y axis, then choose either a CNV or a mutation gene, followed by additional expression genes if desired. If a CNV gene is chosen, an additional plot is generated showing the difference in expression between baseline samples (CNV between -0.25 and 0.25) and samples >= various CNV cutoffs.
fusions only barplot : only plot fusion events for the plot mutation barplot.
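
The comparison behind the mutation/cnv barplot can be sketched in Python as follows. This is an illustration under assumptions: the page does not say which t-test variant is used (Welch's unequal-variance statistic is shown here), the CNV cutoffs 0.5, 1.0 and 2.0 are hypothetical stand-ins for the "various CNV cutoffs", and all function names are invented for the example.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(group_a, group_b):
    """Welch's t statistic for two groups of expression values."""
    var_a, var_b = variance(group_a), variance(group_b)
    return (mean(group_a) - mean(group_b)) / sqrt(
        var_a / len(group_a) + var_b / len(group_b)
    )

def cnv_gain_differences(expression, cnv, cutoffs=(0.5, 1.0, 2.0)):
    """Mean expression of samples >= each cutoff minus the baseline mean.

    Baseline samples are those with CNV between -0.25 and 0.25, as described
    above. The cutoff values themselves are hypothetical.
    """
    baseline = [expression[s] for s in expression if -0.25 <= cnv[s] <= 0.25]
    differences = {}
    for cutoff in cutoffs:
        gained = [expression[s] for s in expression if cnv[s] >= cutoff]
        if baseline and gained:
            differences[cutoff] = mean(gained) - mean(baseline)
    return differences

# Toy data: expression split by mutation status, and expression vs. CNV.
t_stat = welch_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
expression = {"a": 1.0, "b": 3.0, "c": 5.0, "d": 9.0}
cnv = {"a": 0.0, "b": 0.1, "c": 0.6, "d": 2.5}
diffs = cnv_gain_differences(expression, cnv)
```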

3D plot : (beta POC) choose three genes, expression and/or CNV, to render an interactive 3D plot using JavaScript. Only works with Firefox.
Turn 3D model: Hold left mouse button and move mouse
Zoom in/zoom out: Scroll wheel or hold right mouse button and move mouse
Move: Hold Ctrl key and a mouse button and move the mouse

Tissues : restrict results to selected tissue type
Cell Type : include tissue type and tumor type information in exported data table
Cell Lines : restrict results to listed samples
IC50 : specify custom IC50 values. see [matrix]

YouTube Demo: Dumping underlying expression values using expression_plot.

Examples:

Basic plot of ESR1 vs. ERBB2 expression in TCGA breast

Development:

Multiple metadata filtering. For example, for single cell melanoma the cell type class can be filtered so that only the melanoma cells are plotted. Choose the Metadata dataset, then choose the Metadata field. A Filter dropdown will appear, populated with the available metadata values. If you select one of these, e.g. Melanoma, then only the melanoma cells will be plotted. Filtering on multiple metadata fields, e.g. by cell_type AND tumor number, uses a more advanced syntax. Normally one would use something like META-22696 to plot the tumor number metadata onto the plot. Appending -filter-item1,item2,item3,etc restricts the plot to item1, item2 and item3, e.g. META-22696-filter-tumor79,tumor80
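
The -filter- syntax can be illustrated with a short Python sketch. It only shows how such a token could be split and applied; the function names and the sample-to-value dictionary are hypothetical and do not reflect the tool's internals.

```python
def parse_meta_filter(token):
    """Split 'META-NN-filter-a,b,c' into ('META-NN', ['a', 'b', 'c']).

    A token without '-filter-' returns an empty list, meaning no restriction.
    """
    name, sep, items = token.partition("-filter-")
    allowed = [item for item in items.split(",") if item] if sep else []
    return name, allowed

def apply_meta_filter(meta_values, allowed):
    """Keep only samples whose metadata value is in the allowed list."""
    if not allowed:
        return sorted(meta_values)
    return sorted(s for s, value in meta_values.items() if value in allowed)

name, allowed = parse_meta_filter("META-22696-filter-tumor79,tumor80")
# Hypothetical per-cell tumor labels for the metadata field.
tumor_of_cell = {"cell1": "tumor79", "cell2": "tumor81", "cell3": "tumor80"}
kept = apply_meta_filter(tumor_of_cell, allowed)
```

In this example only cell1 and cell3 remain, since cell2 belongs to a tumor that is not in the filter list.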


Related

Wiki: Instructions
Wiki: matrix
Wiki: target