The Results

The sample results below illustrate the effectiveness of BatchMatch with three visualization techniques and the analysis of variance. A publicly available toxicogenomics dataset (Fielden, Brennan and Gollub, Toxicol Sci 2007 Sep; 99(1): 90-100) is used: http://www.ncbi.nlm.nih.gov/projects/geo/query/acc.cgi?acc=GSE8251.

The main goal of this study is to evaluate, based on transcript profiles in the liver, whether non-genotoxic chemicals are likely to induce hepatic tumors. A batch effect can be observed between two “batches” hybridized at different time periods.

The 2001 dataset (11/08/01 to 12/10/01) consists of 17 samples treated with a non-hepatotumorigen (NHT) and 24 samples treated with a non-genotoxic hepatic tumorigen (NGHT).

The 2002 dataset (4/18/02 to 7/18/02) consists of 39 samples treated with an NHT and 32 samples treated with an NGHT.

Pearson Correlation Heat Map

Before batch effect removal, the heat map shows high within-year correlation and low cross-year correlation. The thick black lines mark the borders between the 2001 and 2002 datasets. After removal with the BatchMatch software, using the 2002 batch as the reference batch, cross-year correlation increases markedly and the heat map becomes more uniform.
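The within-year versus cross-year comparison can be illustrated with a small sketch. It is not BatchMatch code and does not use the GSE8251 data: the synthetic profiles, the function names `pearson` and `mean_within_and_cross`, and all parameter values are assumptions made for illustration only.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def mean_within_and_cross(samples, batches):
    """Average pairwise correlation for same-batch and different-batch pairs."""
    within, cross = [], []
    for i in range(len(samples)):
        for j in range(i + 1, len(samples)):
            r = pearson(samples[i], samples[j])
            (within if batches[i] == batches[j] else cross).append(r)
    return sum(within) / len(within), sum(cross) / len(cross)

# Synthetic data (assumption): each batch has its own per-gene profile,
# which stands in for the batch effect; every sample adds random noise.
random.seed(0)
n_genes = 200
batches = ["2001"] * 4 + ["2002"] * 4
profile = {b: [random.gauss(0, 1) for _ in range(n_genes)] for b in ("2001", "2002")}
samples = [[profile[b][g] + random.gauss(0, 0.5) for g in range(n_genes)]
           for b in batches]

within, cross = mean_within_and_cross(samples, batches)
print(f"mean within-batch r = {within:.2f}, mean cross-batch r = {cross:.2f}")
```

A heat map is a color rendering of the full matrix of these pairwise coefficients; the two averages summarize the blocks on and off the diagonal.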

 
Principal Component Analysis (PCA)

PCA score plots are used to visualize the relationship between the samples in the 2001 (green) and 2002 (red) datasets. Before removal, a strong batch effect is evident from the clear separation between the 2001 and 2002 samples. After removal with BatchMatch, the samples from the two years become well mixed.
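The sketch below shows how a batch offset appears as a separation in PCA scores, using NumPy and synthetic data. It is an illustration under stated assumptions, not the BatchMatch method: the per-batch mean-centering used for the "after" case is only a simple stand-in for a batch correction, and `pca_scores`, `pc1_gap`, and the data dimensions are hypothetical.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Project samples (rows of X) onto the leading principal components."""
    Xc = X - X.mean(axis=0)                      # center each gene (column)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]

def pc1_gap(scores, batch):
    """Distance between the two batch means along the first component."""
    return abs(scores[batch == 0, 0].mean() - scores[batch == 1, 0].mean())

# Synthetic data (assumption): batch 1 carries a fixed per-gene offset.
rng = np.random.default_rng(0)
n_per_batch, n_genes = 10, 100
batch = np.array([0] * n_per_batch + [1] * n_per_batch)
X = rng.normal(0.0, 0.5, size=(2 * n_per_batch, n_genes))
X[batch == 1] += rng.normal(0.0, 1.0, size=n_genes)

scores_before = pca_scores(X)

# Stand-in correction (NOT BatchMatch): subtract each batch's per-gene mean.
X_adj = X.copy()
for b in (0, 1):
    X_adj[batch == b] -= X_adj[batch == b].mean(axis=0)
scores_after = pca_scores(X_adj)

gap_before = pc1_gap(scores_before, batch)
gap_after = pc1_gap(scores_after, batch)
print(f"PC1 gap between batch means: before={gap_before:.2f}, after={gap_after:.2e}")
```

Plotting the first two columns of `scores_before` and `scores_after`, colored by batch, would give score plots of the kind described here.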

Hierarchical Clustering

Before batch effect removal, the hierarchical clustering dendrogram shows two well-separated clusters corresponding to the 2001 samples (with short labels) and the 2002 samples (with long labels). After removal, samples from different years are mixed together.
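To illustrate why an uncorrected batch offset produces two clean clusters, here is a minimal sketch of agglomerative clustering with average linkage on synthetic data. The linkage method and distance metric behind the dendrograms are not stated on this page, so average linkage with Euclidean distance is an assumption, as are the function name `average_linkage` and the data.

```python
import math
import random

def average_linkage(points, n_clusters):
    """Naive agglomerative clustering: repeatedly merge the two clusters
    with the smallest average pairwise Euclidean distance."""
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                total = sum(math.dist(points[a], points[b])
                            for a in clusters[i] for b in clusters[j])
                d = total / (len(clusters[i]) * len(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]   # merge j into i
        del clusters[j]                           # j > i, so index i stays valid
    return [sorted(c) for c in clusters]

# Synthetic data (assumption): samples 0-4 form one batch, samples 5-9
# form a second batch that carries a fixed per-gene offset.
random.seed(1)
n_genes = 50
offset = [random.gauss(0, 1) for _ in range(n_genes)]
points = []
for i in range(10):
    shift = offset if i >= 5 else [0.0] * n_genes
    points.append([random.gauss(0, 0.3) + s for s in shift])

clusters = average_linkage(points, 2)
print(sorted(clusters))
```

Cutting the tree at two clusters recovers the batch membership here because the offset makes cross-batch distances larger than within-batch distances.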

ANOVA (Analysis of Variance)

Two-way ANOVA is used to quantify how much of the variance is attributable to each effect before and after batch effect removal.

The pie charts show the percentage of variance due to each effect. The percentage due to batch effect (red slice) drops sharply after removal with BatchMatch (from 25.45% to 0.06%). The percentage due to treatment effect increases slightly (from 2.01% to 2.66%).
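A variance breakdown of this kind can be sketched as a sum-of-squares decomposition. The example below assumes a balanced two-factor design (equal replicates per batch and treatment cell) and a single synthetic measurement series. The actual study is unbalanced, and how BatchMatch aggregates across genes is not described here, so the numbers it prints are unrelated to the percentages reported on this page. The function name `variance_percentages` is hypothetical.

```python
import random

def variance_percentages(values, batch, treatment):
    """Two-way ANOVA sums of squares (balanced design) as percent of total."""
    grand = sum(values) / len(values)

    def group_means(keys):
        sums, counts = {}, {}
        for k, v in zip(keys, values):
            sums[k] = sums.get(k, 0.0) + v
            counts[k] = counts.get(k, 0) + 1
        return {k: sums[k] / counts[k] for k in sums}

    cells = list(zip(batch, treatment))
    b_mean, t_mean, c_mean = group_means(batch), group_means(treatment), group_means(cells)

    ss_total = sum((v - grand) ** 2 for v in values)
    parts = {
        "batch": sum((b_mean[b] - grand) ** 2 for b in batch),
        "treatment": sum((t_mean[t] - grand) ** 2 for t in treatment),
        "interaction": sum((c_mean[(b, t)] - b_mean[b] - t_mean[t] + grand) ** 2
                           for b, t in cells),
        "residual": sum((v - c_mean[c]) ** 2 for v, c in zip(values, cells)),
    }
    return {name: 100.0 * ss / ss_total for name, ss in parts.items()}

# Synthetic data (assumption): large batch shift, small treatment shift.
random.seed(2)
batch, treatment, values = [], [], []
for b in ("2001", "2002"):
    for t in ("NHT", "NGHT"):
        for _ in range(10):                      # 10 replicates per cell
            batch.append(b)
            treatment.append(t)
            values.append(2.0 * (b == "2002") + 0.3 * (t == "NGHT")
                          + random.gauss(0, 0.5))

pct = variance_percentages(values, batch, treatment)
for name, p in pct.items():
    print(f"{name}: {p:.2f}%")
```

In a balanced design the four percentages sum to 100, which is what allows them to be drawn as slices of a single pie.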

 

Copyright © 2006-2012 - Systems Analytics - Designed by Weboart