Published in final edited form as: Bioinformatics. 2006 Oct 23;22(24):3054–3060. doi: 10.1093/bioinformatics/btl542

High-resolution spatial normalization for microarrays containing embedded technical replicates

Daniel S Yuan 1,2,3,*, Rafael A Irizarry 3

Abstract

Motivation

Microarray data are susceptible to a wide range of artifacts, many of which occur on physical scales comparable to the spatial dimensions of the array. These artifacts introduce biases that are spatially correlated. The ability of current methodologies to detect and correct such biases is limited.

Results

We introduce a new approach for analyzing spatial artifacts, termed ‘conditional residual analysis for microarrays’ (CRAM). CRAM requires a microarray design that contains technical replicates of representative features and a limited number of negative controls, but is free of the assumptions that constrain existing analytical procedures. The key idea is to extract residuals from sets of matched replicates to generate residual images. The residual images reveal spatial artifacts with single-feature resolution. Surprisingly, spatial artifacts were found to coexist independently as additive and multiplicative errors. Efficient procedures for bias estimation were devised to correct the spatial artifacts on both intensity scales. In a survey of 484 published single-channel datasets, variance fell 4- to 12-fold in 5% of the datasets after bias correction. Thus, inclusion of technical replicates in a microarray design affords benefits far beyond what one might expect with a conventional ‘n = 5’ averaging, and should be considered when designing any microarray for which randomization is feasible.

1 INTRODUCTION

Microarrays are now well-established as a methodology for detecting many thousands of DNA or RNA sequences in parallel. As physical entities, microarrays are fundamentally a technology of surfaces. The typical microarray deploys thousands of miniaturized assays as features on a surface, and results of these assays are captured as images of this surface.

Because a high-quality microarray experiment requires pristine microarray surfaces, microarray data are susceptible to a wide range of artifacts. Dust particles, fingerprints, scratches and fluorescent residues are typical of artifacts introduced from the ambient environment. A separate class of artifacts arises from the physical processes that microarrays are subjected to, ranging from their manufacture and storage to their hybridization and scanning. The notion that all physical variables, such as temperature, reagent concentration and alignment, are held constant across the microarray surface during these processes is an idealization that can never be fully realized. Deviations will lead to spatial inhomogeneities. These two mechanisms alone indicate that spatial artifacts may be abundant in microarray data.

Techniques for detecting spatial artifacts are a relatively recent development in the microarray literature. One class of techniques estimates the expected value of a feature by local smoothing, i.e. by combining the values of its neighboring features as an average (or related estimator). Spatial artifacts are then detected as deviations from the expected value. For this approach to succeed, the values of the neighboring features must be comparable. Such an assumption is untenable for single-channel datasets, where signal intensities typically span a 10- to 1000-fold range, but may be reasonable for two-channel microarray experiments that compare two similar samples. In such cases artifact detection is accomplished by smoothing the microarray image. Several versions of this class of techniques have been described, involving a median kernel smoother (Wilson et al., 2003), a 2D loess (‘local regression’) smoother (Cleveland, 1979; Yang et al., 2002; Colantuoni et al., 2002; Futschik and Crompton, 2004), or a binning procedure combined with smoothing by neural networks (Tarca et al., 2005).

A second class of techniques leverages the fact that a scientific project may involve many datasets that use the same microarray platform. The idea is to average values across these datasets. Here the underlying assumption is that the errors being averaged out have mean 0 at every microarray feature. The prototype of this class of techniques is Robust Multichip Average (RMA), a widely used technique for identifying outliers in single-channel microarrays based on the Affymetrix GeneChip microarray platform (Irizarry et al., 2003a; Irizarry et al., 2003b; Gautier et al., 2004; highlighted in Allison et al., 2006). Extensions of this technique have been developed recently, including affyPLM (Bolstad et al., 2005), SmudgeMiner (Reimers and Weinstein, 2005) and Harshlight (Suarez-Farinas et al., 2005).

This paper introduces a third approach, termed ‘conditional residual analysis for microarrays’ (CRAM). The only prerequisite of CRAM is a microarray design containing a limited number of technical replicates and negative controls. For any given replicated feature, its expected value is the robust average of its replicates. Deviations from this expected value, obtained as residuals, then provide raw estimates of microarray errors. Since residuals cannot be defined for those microarray features that lack replicates, more refined estimates are subsequently derived as conditional expectations (thus ‘CRAM’) of the residuals using spatial smoothing techniques. The smoothing procedure extends error estimation to the entire array and yields bias estimates that can be used for bias correction.

It is important to observe that CRAM is free of the assumptions that constrain its predecessors. With CRAM, the expected value for a microarray feature depends only on identical features replicated in the same dataset. Data from neighboring features play no role, and no reference is made to other datasets. Thus, unlike its predecessors, CRAM can be applied to a free-standing, single-channel dataset containing arbitrary data. CRAM is also powerful. In this paper, CRAM is used to demonstrate that spatial artifacts coexist on linear and logarithmic intensity scales as superimposed additive and multiplicative errors. Existing techniques for microarray normalization do not easily accommodate such a complicated error model. In contrast, a cross-validated and fully documented computation with CRAM requires only a few minutes on a personal computer. Thus, CRAM may have wide applicability for detecting, estimating and correcting many of the spatial artifacts that potentially contaminate microarray data.

2 METHODS

2.1 A microarray design for CRAM

The principles of CRAM are depicted in Figure 1. Microarrays on which CRAM can be implemented have been described elsewhere (Yuan et al., 2005). Briefly, Hopkins TAG Arrays (TAG arrays) were designed to profile pools of cells derived from the Yeast Knockout (YKO) strain collection (Giaever et al., 2002), based on a methodology pioneered by Shoemaker et al. (1996). The methodology exploits unique 20-base oligonucleotide ‘TAG’ sequences that were incorporated into the genome of each YKO strain. Different TAGs flank each gene knockout cassette at each end; the upstream and downstream TAGs are termed UPTAGs and DOWNTAGs, respectively. The TAGs in genomic DNA derived from a complex pool of cells can be amplified and labeled using PCR. Hybridization of the PCR products to the microarray yields measurements of relative TAG abundances, which in principle will reflect the relative abundances of strains in the pool being studied.

Fig. 1.

Principles underlying the detection of spatial artifacts by embedded technical replicates. (A) Replicate features of the microarray (open and filled spots) are deployed as artifact detectors in close contact with the systematic features (⊗). Shadings highlight two of the replicate sets to emphasize that the replicates are deployed in a randomized order. (B) Certain replicate features detect a spatial artifact as a perturbation from the mean. (C) The systematic features are ignored. (D) Information pooled from the detectors delineates the artifact.

In addition to a systematic survey of the 6018 UPTAG-DOWNTAG pairs used in the YKO strain collection, TAG arrays contain the replicate features highlighted in this paper. Of the 21 939 customizable features in TAG arrays, 9590 features were replicate features. Eight thousand of these features provided 5-fold replicates for the UPTAGs and DOWNTAGs of 800 selected genes. The remaining 1590 features provided the same 5-fold replication for negative-control random sequences representing 159 additional ‘genes’. Of note, any given TAG was represented by (2 × 6018 + 9590)/(2 × 6018) = 1.8 microarray features, on average.

The systematic and replicate features were arranged to maximize the number of contacts made by any given systematic feature with neighboring replicate features. At the same time, correlation between the features of any given replicate set was minimized. These design goals were achieved by alternating the two types of features in a checkerboard pattern (to the extent possible), and by scrambling the order of the replicates within the array (see Fig. 1).

2.2 Overview of data analysis for CRAM

The hoptag R package (version 2.3-4) was written to analyze the replicate features in TAG arrays. It uses the object model developed in the marray R package, a core package of the open-source BioConductor Project (Dudoit and Yang, 2002; Gentleman et al., 2004). Version 1.0 of hoptag accompanied the initial description of TAG arrays (Yuan et al., 2005).

A panel is the unit of data analysis in hoptag. It represents the data for a particular TAG type (Up/Dn) and detection channel (Green/Red). The rationale for analyzing each channel/TAG-type combination separately, rather than as a two-channel ratio, is that a different PCR product gives rise to each combination; each combination is therefore best regarded as an independent entity. Because TAG arrays happen to be rectangular, data in panels are conveniently represented as matrices. This is not only helpful for constructing images but is also required for the 2D fast Fourier transform, a vital component of the default smoothing procedure used in hoptag.

Analysis of a panel involves rearranging the replicate features as a matrix. The 959 columns of the matrix represent the (800 + 159) replicate sets, while the five rows correspond to the five elements in each replicate set (Fig. 2). Residuals are calculated by subtracting a robust mean from each column. Subsets of columns are masked, i.e. converted to missing values, for a variety of special purposes: to focus attention on replicate sets with low or high average intensity, to remove replicate sets that contain extreme outliers, and to perform cross-validation. The residuals left unmasked are then mapped back to their locations in the panel to obtain a residual image. As already discussed, CRAM uses the residual image to derive a bias estimate, which in turn serves as the basis for bias correction.
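As a concrete sketch of this step, the following R fragment extracts residuals from a replicate matrix, masks replicate sets containing extreme outliers, and leaves the result ready to be mapped back to the panel. The function and parameter names are illustrative rather than the actual hoptag interface, and the median and MAD stand in for whatever robust estimators hoptag uses.

    ## Illustrative residual extraction (not the hoptag API).
    extract_residuals <- function(reps, outlier_cut = 8) {
      ## reps: numeric matrix, 5 rows (replicates) x 959 columns (replicate sets)
      center <- apply(reps, 2, median, na.rm = TRUE)  # robust average per set
      res <- sweep(reps, 2, center)                   # residuals per column
      s <- mad(res, na.rm = TRUE)                     # robust scale estimate
      ## mask whole replicate sets that contain extreme outliers
      bad <- apply(abs(res) > outlier_cut * s, 2, any, na.rm = TRUE)
      res[, bad] <- NA
      res
    }

    set.seed(1)
    reps <- matrix(2^rnorm(5 * 959, mean = 8), nrow = 5)  # toy intensities
    reps[3, 42] <- 1e6                                    # an injected artifact
    res <- extract_residuals(reps)
    ## res would then be mapped back to the (x, y) positions of the
    ## replicate features to form the residual image.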

Fig. 2.

A four-step iterative procedure for bias estimation and bias correction. (1) Replicate sets are extracted from the raw data. (2) Residuals are extracted from the replicate sets. (3) A bias estimate is obtained by smoothing the residual image. (4) The raw data are corrected using the bias estimate. “NA’s” signify missing values.

2.3 An additive error model

As a first approximation, signal intensities I are modeled as the sum

I = μ + δ,

where each variable is an image, i.e. a function of (x, y) spatial coordinates. μ represents the unobserved true values to be estimated. The discrepancy δ between the observed and unobserved values can in turn be modeled as a sum of components (ν + ε). ν embodies all of the spatially correlated structure in δ and is regarded as an unknown but fixed (‘nuisance’) image. Smoothness is not required, as ν potentially includes scratches and blotches as well as smudges and gradients. ε is devoid of spatial autocorrelation and is treated as a realization of a random field of unspecified distribution with mean 0. The variance of ε may change with spatial position, i.e. stationarity is not required.

The goal of spatial normalization is to develop an estimate μ̂ (where the circumflex denotes an estimate) of the true values μ, given the observed signal intensities I. Symbolically,

μ̂ = I − δ̂.

The challenge is to derive the error estimates δ̂ from I. The replicates are the key to meeting this challenge because they yield μ̂, at least for the replicate features: for each replicate set r, μ̂r is simply an average Īr of the intensities Ir. Thus,

δ̂r = Ir − Īr.

Since these terms are defined only for the replicate features, the definition needs to be extended to the entire microarray. This is accomplished by applying an image smoothing procedure S. Because smoothing procedures penalize sharp changes in the slope of the fitted curve (e.g. Loader, 2004), they effectively diminish the amplitude of high-frequency components in their inputs. The effect is to remove most of the ε component of δ. This yields an estimate of ν that is also taken as the desired expression for δ̂:

δ̂ = ν̂ = S(δ̂r).

2.4 An additive-multiplicative error model

For both physical and empirical reasons to be discussed later, it is necessary to generalize the additive model to include multiplicative terms. Subscripting the additive (linear-scale) terms by ‘1’ and the multiplicative (log2-scale) terms by ‘2’, the model becomes

I = μ · 2^(δ2) + δ1.

The desired estimator of μ then consists of I corrected for additive and multiplicative bias:

μ̂ = (I − δ̂1) · 2^(−δ̂2).

The image-valued unknowns δ1 and δ2 can be difficult to estimate by numerical optimization procedures. Not only do the two variables coexist nonlinearly in the same region of space, but they may also contain irregular spatial artifacts that would be challenging to parametrize in a statistical model. Fortunately, excellent approximations may be obtained by taking advantage of the large dynamic range of most microarray scanners. The idea is to estimate each bias term separately as an asymptote, i.e. using only subsets of replicates in which the other term is negligible. Roughly speaking, δ1 can be estimated with ‘low-intensity’ replicate sets. For these replicate sets, the multiplicative term δ2 will be negligible, by definition. Likewise, δ2 can be estimated from ‘high-intensity’ replicate sets. For those replicate sets, δ2 will be large compared to the additive term δ1, so that δ1 can be ignored. Each term is estimated using the scheme in Figure 2.

The low-intensity and high-intensity replicate sets are defined using the negative controls included in the microarray. The median value of these negative controls defines ‘zero’, and their median absolute deviation defines a scale. The low-intensity sets consist of all replicate sets whose μ̂r values are at most one scale unit from ‘zero’. Of the remaining sets, those with extreme values of μ̂r are removed, leaving the high-intensity replicate sets.
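A minimal sketch of this classification in R, using the definitions just given; the rule for discarding ‘extreme’ high-intensity sets (here, trimming the top 1% of means) is an assumption, since the exact cutoff is not specified above.

    ## Illustrative classification of replicate sets (not the hoptag API).
    classify_sets <- function(set_means, is_negative, hi_trim = 0.01) {
      zero <- median(set_means[is_negative], na.rm = TRUE)  # defines 'zero'
      unit <- mad(set_means[is_negative], na.rm = TRUE)     # defines the scale
      low  <- abs(set_means - zero) <= unit   # within one scale unit of 'zero'
      rest <- !low
      ## remove sets with extreme means (assumed: top 1%); the rest are 'high'
      cut  <- quantile(set_means[rest], 1 - hi_trim, na.rm = TRUE)
      list(low = low, high = rest & set_means <= cut)
    }

The low sets would then feed linear-scale bias estimation and the high sets log2-scale bias estimation, each via the scheme in Figure 2.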

It is worth emphasizing that these definitions rely only on the existence of negative control features. The low- and high-intensity replicates are identified dynamically for each dataset solely based on their signal intensities in that particular dataset. There is no need to argue about the validity of pre-declared positive controls, such as ‘housekeeping genes’. The definitions are therefore compact, flexible and universal.

2.5 Choice of a kernel smoother

The critical step of smoothing the residuals to estimate δ̂ is accomplished using a kernel smoother (e.g. Loader, 2004; Ruppert et al., 2003). Although many spatial smoothing techniques now exist, older kriging techniques, 2D local-regression smoothers, and 2D smoothing splines proved to be time-consuming, and wavelet-based smoothers were unworkable because of the missing values. The smoother used here was refined by Nychka et al. (Fields Development Team, 2004, http://www.cgd.ucar.edu/Software/Fields) for use with geospatial problems and was not only the fastest by far but also gave the best results. Briefly, missing values are first imputed. A Nadaraya-Watson smoothing kernel with exponential covariance is then convolved with the image using a fast Fourier transform. The effective radius of the smoothing kernel is determined by the desired average number of non-missing values in the kernel.
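The fields package cited above exposes this kind of smoother as image.smooth, which convolves a kernel with a regular image grid by FFT. The sketch below is illustrative: the median imputation of missing values and the bandwidth theta are assumptions, not the hoptag defaults.

    library(fields)  # provides image.smooth and the double.exp kernel

    ## Illustrative smoothing of a residual image.
    smooth_residual_image <- function(res_img, theta = 3) {
      filled <- res_img
      ## impute missing values before the FFT-based convolution;
      ## a crude median fill stands in for hoptag's imputation step
      filled[is.na(filled)] <- median(res_img, na.rm = TRUE)
      sm <- image.smooth(filled, kernel.function = double.exp, theta = theta)
      sm$z  # the bias estimate, extended over the whole panel
    }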

Because the covariance is modeled as a stationary function, non-stationary artifacts of the sort encountered in microarray data will not be smoothed well unless the scale of the artifact is large compared to the kernel radius. Another well-known weakness of kernel smoothers is that they may introduce bias at the edges of the image (e.g. Ruppert et al., 2003). Fortunately, the density of points in the imaged residuals appears to be adequate for most artifacts, and edge effects are usually insignificant. A third issue is the blurring introduced when smoothing strong point artifacts. Such artifacts are removed before smoothing by masking replicate sets containing large (>8 σ) residuals. The standard deviation σ is calculated robustly using an M-estimator (Venables and Ripley, 2001).

A kernel smoother using a robust estimator is sometimes needed for bias estimation when a spatial artifact is both strong and highly localized. This smoother uses a 25% trimmed mean in conjunction with a disk-shaped smoothing kernel. Outliers are not filtered out. To avoid problems with sparse data, kernels containing fewer than a preset minimum number of points are ignored. Any remaining missing values are subsequently patched using the available smoothed values.
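The following R fragment sketches such a smoother: a 25% trimmed mean over a disk-shaped neighborhood, with sparse kernels skipped. It is an illustrative implementation, not the hoptag code, and the radius and minimum-count defaults are assumptions.

    ## Illustrative robust smoother: trimmed mean over a disk-shaped kernel.
    disk_trimmed_smooth <- function(img, radius = 3, trim = 0.25, min_pts = 5) {
      nr <- nrow(img); nc <- ncol(img)
      out <- matrix(NA_real_, nr, nc)
      for (i in seq_len(nr)) for (j in seq_len(nc)) {
        ii <- max(1, i - radius):min(nr, i + radius)
        jj <- max(1, j - radius):min(nc, j + radius)
        block <- img[ii, jj]
        d2 <- outer((ii - i)^2, (jj - j)^2, "+")       # squared distances
        vals <- block[d2 <= radius^2 & !is.na(block)]  # points inside the disk
        if (length(vals) >= min_pts)  # ignore kernels with too few points
          out[i, j] <- mean(vals, trim = trim)
      }
      out  # remaining NAs would be patched from neighboring smoothed values
    }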

2.6 Bias correction

Bias correction involves subtraction of the estimated biases δ̂ from the raw signal intensities I. Although seemingly elementary, the subtraction operation requires attention to several technical issues.

2.6.1 Use of a generalized log transformation

Bias correction involves iterative subtraction on both logarithmic and linear intensity scales. Linear-scale bias subtraction may result in tiny or negative values that will not convert meaningfully to the log2-scale. To interconvert between these two scales, CRAM replaces the logarithm function conventionally used for transformation with the ‘generalized logarithm’ (glog) function (Munson, 2001, http://www.stat.berkeley.edu/~terry/zarray/Affy/GL_Workshop/Munson.ppt; Huber et al., 2002; Durbin et al., 2002). Under a suitable linear transformation, this function is asymptotic with log2 for large arguments but linear for small arguments.

Although the glog function happens to be the variance-stabilizing transformation for a random variable with additive and multiplicative errors (see references above), CRAM makes no use of this property. The transformation is used only as a device to ensure that bias subtraction can operate iteratively on both linear and log2 scales.

The glog transformation was further modified here for numerical stability so that it decreases linearly for all negative arguments, rather than as log2(|x|). The resulting transformation is easily inverted and is twice-differentiable. Finally, estimated linear-scale biases are truncated to zero if the existing linear-scale intensity values are already negative.
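For concreteness, here is one common form of the glog (base 2) together with a sketch of a variant that is linear for negative arguments. The constant c and the tangent-line construction are assumptions; note that this simple splice is only once-differentiable at the junction, whereas the transformation described above is twice-differentiable.

    ## Illustrative glog and a variant that is linear in the negatives.
    glog2 <- function(x, c = 64) log2((x + sqrt(x^2 + c^2)) / 2)

    glog2_lin <- function(x, c = 64) {
      g0 <- glog2(0, c)        # value at 0, i.e. log2(c/2)
      s0 <- 1 / (c * log(2))   # slope of glog2 at 0
      ifelse(x >= 0, glog2(x, c), g0 + s0 * x)  # tangent line for x < 0
    }

    ## exact inverse on each branch
    iglog2_lin <- function(y, c = 64) {
      g0 <- glog2(0, c); s0 <- 1 / (c * log(2))
      ifelse(y >= g0, 2^y - c^2 * 2^(-(y + 2)), (y - g0) / s0)
    }

For large x, glog2(x) approaches log2(x); near zero and below it remains finite, which is what allows bias subtraction to move freely between the two scales.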

2.6.2 Use of a bias augmentation procedure

Bias correction is necessarily an iterative process because the value of δ̂ calculated for the replicate features is only an approximation to δ. Indeed, the original bias reappears as a ghost image when residuals are recalculated from the corrected replicate sets and reimaged (see Supplementary Figures S1 and S2). Interestingly, iteration can be accelerated if δ̂ is multiplied by a constant κ before subtracting it from I. The ghost image is quantified by regressing it on δ̂; the resulting slope usually changes sign as κ increases from 0 to 2, typically at κ ~ 1.3. (If no sign change occurs, the estimate is presumed to be spurious and κ is set to 0.)

κ is reminiscent of ‘Bessel’s correction’, the N/(N − 1) multiplier applied to the sample variance in elementary statistics. In both cases, residuals are calculated from an estimated mean rather than the true mean, and the multiplier makes the estimator unbiased. κ is larger than one might expect for N = 5, probably because the replicate means are estimated with a robust estimator.
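A sketch of this search in R. The ghost_slope function is assumed to be supplied by the caller: given a candidate κ, it corrects the data with κ·δ̂, recalculates residuals from the corrected replicate sets, and returns the slope from regressing those residuals on δ̂.

    ## Illustrative selection of the bias augmentation constant kappa.
    choose_kappa <- function(ghost_slope, lower = 0, upper = 2) {
      s_lo <- ghost_slope(lower)
      s_hi <- ghost_slope(upper)
      if (sign(s_lo) == sign(s_hi))
        return(0)  # no sign change: the bias estimate is presumed spurious
      uniroot(ghost_slope, interval = c(lower, upper))$root  # typically ~1.3
    }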

2.6.3 Assessment of bias correction

Progress in bias correction is monitored by visual inspection of the residual images, by plots of autocorrelation in the residuals along the horizontal and vertical axes, and by the overall (robust) scale of the residuals after they are recalculated. Overfitting is detected by embedding the bias correction procedure in a cross-validation framework.

With an optimal bias augmentation constant κ, one iteration of bias estimation and bias correction is adequate as long as the artifacts being removed are reasonably smooth. A second iteration usually yields little further reduction in autocorrelation and overall variance, and the iteration often terminates because the corrections are so small that values of κ between 0 and 2 cannot be identified. Thus, in a crude sense, the bias correction procedure appears to converge to a self-consistent state (Tarpey and Flury, 1996; Kepler et al., 2002).
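One of these diagnostics is simple enough to sketch directly: the lag-1 autocorrelation of a residual image along each axis (an illustrative computation; the published plots may examine more lags).

    ## Lag-1 autocorrelation along the two axes of a residual image.
    lag1_autocorr <- function(img) {
      r <- function(a, b) cor(a, b, use = "pairwise.complete.obs")
      nr <- nrow(img); nc <- ncol(img)
      c(horizontal = r(as.vector(img[, -nc]), as.vector(img[, -1])),
        vertical   = r(as.vector(img[-nr, ]), as.vector(img[-1, ])))
    }
    ## Values near 0 on both axes indicate little remaining spatial structure.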

3 RESULTS

3.1 The additive-multiplicative error model

Contrast enhancement of residual images revealed that some were especially ‘noisy’, but only in certain regions of the array (Fig. 3a and b). Insight into this peculiar phenomenon came from reanalyzing the residuals based on whether they came from ‘low-intensity’ or ‘high-intensity’ replicate sets. The resulting partial residual images were much less ‘noisy’. Many datasets were found where the spatial artifacts were stronger at one intensity scale than the other, and some datasets contained spatial artifacts that were unmistakably different on the two intensity scales (Fig. 3c and d). These observations resolved the ‘noise’ into two coherent spatial artifacts coexisting on linear and logarithmic intensity scales.

Fig. 3.

Resolution of ‘noisy’ residuals into linear- and log2-scale components. (A) The original panel. (B) A closeup of the region boxed in (A), resolved into (C) linear-scale and (D) log2-scale components. The values shown reflect transformation by the function f(x) = 0.4 · (x/0.4)^(1/3). The reader is urged to peruse Supplementary Figure S3 for full images and additional examples. In particular, (A) is taken from page 2 of the figure file, while the closeups in (B) through (D) are derived from identical boxed regions of pages 2 through 4.

To accommodate the two intensity scales, the additive-multiplicative model of microarray errors was modified to incorporate spatially correlated biases on both intensity scales (see Section 2.4). The bias correction procedure was modified accordingly. Cross-validation was performed to detect overfitting. With these changes, spatial correlation between residuals was dramatically reduced. Moreover, no new spatial artifacts were uncovered on intermediate intensity scales (see Supplementary Figure S3). These observations provided evidence that the two intensity scales are sufficient as well as necessary. Thus, CRAM uncovered new experimental support for an additive-multiplicative model of microarray errors. In turn, the additive-multiplicative error model came to play a central role in the bias correction procedures of CRAM.

3.2 Benefits of linear-scale bias correction

Linear-scale bias correction has the same goal as ‘background subtraction’: the correction of additive error. By convention, ‘background’ is estimated for each microarray feature using the signals in the immediate periphery of the feature. The analogous definition of ‘background’ in CRAM is linear-scale bias, as estimated from the ‘foreground’ values of low-intensity replicate features. Of note, these two definitions are independent because they involve different subsets of the pixels that comprise a feature.

To compare the efficacy of linear-scale bias correction with that of conventional background subtraction, foreground and background residuals were compared. The residual images were sometimes but not always similar (see Supplementary Figure S4). Even when the residual images were similar, the slope of the foreground versus background relationship was almost 4, far from the expected value of 1 (Fig. 4). These observations challenge the view that background measurements are a useful proxy for foreground additive errors.

Fig. 4.

Comparison of linear-scale (foreground) bias correction with conventional background subtraction. Residuals of foreground signal intensities in low-intensity replicate sets are plotted against the residuals for the corresponding background measurements. The correlation coefficient (Kendall’s tau) was 0.55. A robust regression fit is shown; the slope was 3.8 (SE = 0.09).

3.3 Benefits of log2-scale bias correction

Microarray normalization procedures are commonly assessed in terms of signal detection, where the ‘signal’ is a set of genes known in advance to exhibit differential representation. For assessing spatial bias correction, however, such tests can be misleading. The ‘signals’ to be detected are typically sparse, but spatial artifacts can also be highly localized, leading to a situation where the benefits of spatial bias correction depend entirely on the fortuitous colocalization of the artifacts with the signals to be detected.

To overcome this problem, CRAM was performed on a large scale. The algorithm was efficient enough to analyze 484 panels of data with 3-fold cross-validation in 6 h (Power Mac G5, 1.8 GHz). The Supplementary material includes a list of the underlying 121 datasets (Supplementary Table S1; Pan et al., 2006) and the graphics and documentation from one such panel (SourceCode/example_of_outputs). After CRAM, the overall variance of log2-scale residuals fell at least 2-fold in 220 panels (45%). Thus, spatial artifacts were the major source of variance in almost half of all panels. Variances fell 4- to 12-fold in 23 panels (5%), with even larger ratios if the variances were calculated robustly.

The biases found in two-channel data from the same dataset usually neutralized each other, but exceptions to this rule were found (e.g. dataset 3.1/Up in Supplementary Table S2), indicating that two-channel experimental designs are not a foolproof way to correct all spatial artifacts.

To assess the potential impact of CRAM in more local terms, the most extreme corrections for any given panel were identified. The log2 ratios for 10 genes changed by at least 0.5 units in 186 panels (38%) and by at least 1.0 unit in 21 panels (4%). Since the effect size of a typical ‘discovery’ is 1.5 to 2.0 log2 units (Pan et al., 2006), CRAM may well alter false positive rates, particularly when one channel must serve as a shared control for several experimental datasets in the other channel (e.g. Lee et al., 2005).

After CRAM, variances still varied almost 100-fold from panel to panel (Fig. 5). The variability was pervasive and could not be attributed to isolated datasets. This high variability was unexpected as it is commonly assumed in ANOVA-based experimental designs that datasets from the same series of data are comparable (e.g. Kerr and Churchill, 2001).

Fig. 5.

Variability in residual standard deviations across datasets. The SDs shown are of residuals pooled from individual residual images, before (open circle) or after normalization (filled circle).

The high spatial resolution of CRAM was especially apparent in one panel that contained several series of outliers scattered along isolated columns of the residual image (Fig. 6), revealing an unsuspected flickering that was traced to the microarray scanner hardware. This degree of spatial resolution far exceeded that achievable by a local smoothing technique (Supplementary Figure S5), as one might expect given the stringent assumptions underlying local smoothing (see Introduction). The artifacts were too narrow for accurate bias estimation using the default kernel smoother, but they were neatly removed by subtracting column-wise robust means from each column as a preprocessing step (Fig. 6c).

Fig. 6.

Example of high-resolution detection and correction of spatial artifacts. (A) Detection of spatial artifacts one feature wide. (B1) Bias estimates obtained with the default kernel smoother. (B2) Residuals remaining after bias correction. (C1 and C2) Same as (B) except that a line smoother was included as a preliminary step. The values shown reflect transformation by the function f(x) = 0.4 · (x/0.4)^(1/3). Full images are in the two Images.pdf files in SourceCode/example_of_outputs/Dn/maGf_* (pp. 9, 10, 11, 12 and 15) in the Supplementary material.
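The column-wise preprocessing mentioned above is simple enough to sketch in one function of R; the median again stands in for the robust mean, and applying it to the residual image prior to smoothing is an assumption about where hoptag inserts the step.

    ## Remove artifacts one column wide by subtracting a robust
    ## center from each column of the image.
    remove_column_bias <- function(img) {
      sweep(img, 2, apply(img, 2, median, na.rm = TRUE))
    }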

To determine whether normalization procedures involving multi-dataset averaging were capable of performing as well as CRAM (see Introduction), the large-scale analysis described above was used to test directly whether the average spatial bias at every point was in fact indistinguishable from 0. This fundamental assumption was validated (Supplementary Figure S6). Thus, for the data in this paper, multi-dataset averaging techniques could have been used as a simpler alternative to CRAM. For arbitrary collections of data, however, there is no way to guarantee that this result is generalizable, because spatial biases are a form of systematic error and are therefore intrinsically non-random.

4 DISCUSSION

CRAM appears to be a fundamentally new methodology for estimating microarray errors. It requires little more than negative controls and a collection of technical replicates embedded in a single microarray. Technical replicates are certainly not a new concept in experimental design, and others have already pointed out their virtues in microarray data preprocessing (Fan et al., 2004; Smyth et al., 2005). Nevertheless, a consensus has developed that replicate microarrays ‘are almost never required’ when conducting biological experiments (Allison et al., 2006), implying that embedded technical replicates are even less worthwhile. CRAM’s ability to detect, estimate and correct bias (all major objectives of microarray data preprocessing; Allison et al., 2006) may therefore be counterintuitive. Moreover, new insights provided by CRAM challenge many of the assumptions implicit in existing statistical methods for microarray data analysis.

Bias estimation using CRAM is based on a model of microarray errors that contains superimposed additive and multiplicative terms. The idea of a two-component model for microarray errors is also not new. Alluding to the linear calibration of measurements with multiplicative errors, Rocke and Durbin used this model to explain the heteroscedasticity of microarray errors (Rocke and Durbin, 2001). The model underlies a variance-stabilizing transformation (see Section 2.6.1) and continues to be central to ongoing modeling efforts (e.g. Zhou and Rocke, 2005).

What is new about the model formulated here is the inclusion of a spatially correlated image-valued term, ν1, in the additive error. This extra complexity has a clear physical basis and is not simply a generalization for mathematical symmetry. On the one hand, microarray signal intensities are determined by factors that are multiplied together, such as template DNA concentrations, probe DNA abundances (diminished, for example, by abrasions) and laser beam intensity. Variations in these factors will translate into proportional changes in signal intensity, i.e. multiplicative errors. On the other hand, extra terms that originate in the act of measurement will be added ‘in series’ (in the physical sense) to the measured signal intensities. Examples include the dark current of the photomultiplier tubes used for signal detection and fluorescent contaminants adsorbed to the microarray surface. Importantly, both the additive and multiplicative terms potentially include spatial artifacts, because each involves physical entities impinging on a surface. Thus, both ν1 and ν2 should be image-valued. This prediction was strikingly confirmed by CRAM (Fig. 3).

This revised two-component model exposes problems with the way ‘background correction’ is implemented in existing procedures. One example is the longstanding convention of ‘background subtraction’. The examples provided in this paper (Fig. 4) indicate that background measurements are not a good proxy for the linear-scale biases associated with additive errors. A rationale for this observation may lie in the fact that ‘background’ pixels lack DNA and will therefore have radically different adsorptive properties. In contrast, the residuals used in CRAM are intrinsic to the features of interest and should therefore be more representative. Another example is the use of empirical Bayes estimators to estimate how measures of differential gene expression are distributed (Efron et al., 2001). It seems likely that linear- and log2-scale biases will account for at least some of the asymmetry and thick tails that characterize the prior distributions assumed by these models (Kooperberg et al., 2002; Newton et al., 2004). Because the stochastic terms in these models are assumed to be exchangeable and do not allow for spatial correlation, spatial normalization may improve the performance of these estimators.

CRAM’s focus on technical errors is a strength even if those errors are generally smaller in magnitude than ‘biological’ errors. The focus on technical factors yields sharper error estimates that are important not only for bias correction and quality control but also for comparing datasets. For example, it was surprising to find such a wide range of residual variances in the datasets analyzed for this paper (Fig. 5), given that all of the datasets were part of a single project. Such variability is deleterious to experimental designs that rely on principles of ANOVA (e.g. Kerr and Churchill, 2001), since constancy of variance is a fundamental tenet of ANOVA.

CRAM is currently limited in its ability to estimate missing values in a residual image. This is especially evident for artifacts that are strongly anisotropic or highly localized, where visual inspection may reveal more detail than kernel smoothers can estimate (Fig. 6). It is no surprise that kernel smoothers blur such artifacts, because they are low-pass filters that remove the high-frequency components of an image. As was shown, alternative smoothers will solve some of these problems, but no general solution is known.

In addition to bias correction, bias estimation also affords access to the linear- and log2-scale residual error terms ε1 and ε2 (see Section 2.4). The variances of ε1 and ε2 are already calculated in hoptag as parameters in the glog transformation that others have used for variance stabilization (e.g. Durbin et al., 2002). It turns out, however, that analysis of ε1 and ε2 can be used to improve on the generalized log transformation itself. This finding has several implications for analyzing data that contrast two samples, the principal application of TAG arrays (e.g. Lee et al., 2005; Pan et al., 2006). These implications will be developed elsewhere.

5 CONCLUSION

CRAM appears to be a new approach for estimating microarray errors based on technical replicates embedded in the microarray. CRAM can detect spatial artifacts with single-feature resolution and can estimate and correct much of the associated bias, accomplishing the goals of both background correction and spatial normalization. Unlike other normalization procedures in the literature, CRAM is applicable to isolated single-channel datasets and makes no assumptions about the distribution of the underlying data. The microarray design principles needed for CRAM are straightforward and should be considered when designing any microarray for which randomization is feasible.

Supplementary Material


Supplementary information: Supplementary Data are available at Bioinformatics online.

Acknowledgments

This work was supported by fellowships from the Burroughs-Wellcome Center for Computational Biology at Johns Hopkins and from NHGRI (D.S.Y.) and by NIH grant 5R01HG02432 (to J. Boeke). The authors thank J. Boeke, C. Crainiceanu, F. Pineda and S. Wheelan for reading the manuscript. Funding to pay the Open Access publication charges for this article was provided by NHGRI.

Footnotes

Availability: CRAM is implemented as version 2 of the hoptag software package for R, which is included in the Supplementary information.

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

Conflict of Interest: none declared.

References

  1. Allison DB, et al. Microarray data analysis: from disarray to consolidation and consensus. Nat Rev Genet. 2006;7:55–65. doi:10.1038/nrg1749.
  2. Bolstad BM, Collin F, Brettschneider J, et al. Quality assessment of Affymetrix GeneChip data. In: Gentleman R, Carey V, Huber W, Irizarry R, Dudoit S, editors. Bioinformatics and Computational Biology Solutions Using R and Bioconductor. Springer; NY: 2005. pp. 42–47.
  3. Cleveland WS. Robust locally weighted regression and smoothing scatterplots. J Am Stat Assoc. 1979;74:829–836.
  4. Colantuoni C, et al. Local mean normalization of microarray element signal intensities across an array surface: quality control and correction of spatially systematic artifacts. Biotechniques. 2002;32:1316–1320. doi:10.2144/02326mt02.
  5. Dudoit S, Yang YH. Bioconductor R packages for exploratory analysis and normalization of cDNA microarray data. In: Parmigiani G, Garrett ES, Irizarry RA, Zeger SL, editors. The Analysis of Gene Expression Data: Methods and Software. Springer; NY: 2002. pp. 73–101.
  6. Durbin BP, et al. A variance-stabilizing transformation for gene-expression microarray data. Bioinformatics. 2002;18:S105–S110. doi:10.1093/bioinformatics/18.suppl_1.s105.
  7. Efron B, et al. Empirical Bayes analysis of a microarray experiment. J Am Stat Assoc. 2001;96:1151–1160.
  8. Fan J, et al. Normalization and analysis of cDNA microarrays using within-array replications applied to neuroblastoma cell response to a cytokine. Proc Natl Acad Sci USA. 2004;101:1135–1140. doi:10.1073/pnas.0307557100.
  9. Fields Development Team. fields: Tools for Spatial Data. National Center for Atmospheric Research; Boulder, CO: 2004.
  10. Futschik M, Crompton T. Model selection and efficiency testing for normalization of cDNA microarray data. Genome Biol. 2004;5:R60. doi:10.1186/gb-2004-5-8-r60.
  11. Gautier L, et al. affy—analysis of Affymetrix GeneChip data at the probe level. Bioinformatics. 2004;20:307–315. doi:10.1093/bioinformatics/btg405.
  12. Gentleman RC, et al. Bioconductor: open software development for computational biology and bioinformatics. Genome Biol. 2004;5:R80. doi:10.1186/gb-2004-5-10-r80.
  13. Giaever G, et al. Functional profiling of the Saccharomyces cerevisiae genome. Nature. 2002;418:387–391. doi:10.1038/nature00935.
  14. Huber W, et al. Variance stabilization applied to microarray data calibration and to the quantification of differential expression. Bioinformatics. 2002;18(Suppl 1):S96–S104. doi:10.1093/bioinformatics/18.suppl_1.s96.
  15. Irizarry RA, et al. Summaries of Affymetrix GeneChip probe level data. Nucleic Acids Res. 2003a;31:e15. doi:10.1093/nar/gng015.
  16. Irizarry RA, et al. Exploration, normalization and summaries of high density oligonucleotide array probe level data. Biostatistics. 2003b;4:249–264. doi:10.1093/biostatistics/4.2.249.
  17. Kepler TB, et al. Normalization and analysis of DNA microarray data by self-consistency and local regression. Genome Biol. 2002;3:RESEARCH0037. doi:10.1186/gb-2002-3-7-research0037.
  18. Kerr MK, Churchill GA. Experimental design for gene expression microarrays. Biostatistics. 2001;2:183–201. doi:10.1093/biostatistics/2.2.183.
  19. Kooperberg C, et al. Improved background correction for spotted DNA microarrays. J Comput Biol. 2002;9:55–66. doi:10.1089/10665270252833190.
  20. Lee W, et al. Genome-wide requirements for resistance to functionally distinct DNA-damaging agents. PLoS Genet. 2005;1:e24. doi:10.1371/journal.pgen.0010024.
  21. Loader C. Smoothing: local regression principles. In: Gentle J, Härdle W, Mori Y, editors. Handbook of Computational Statistics. Springer-Verlag; NY: 2004. pp. 539–564.
  22. Munson P. A ‘consistency’ test for determining the significance of gene expression changes on replicate samples and two convenient variance-stabilizing transformations. GeneLogic Workshop on Low Level Analysis of Affymetrix GeneChip Data; Nov. 19, 2001; Bethesda, MD. 2001.
  23. Newton MA, et al. Detecting differential gene expression with a semiparametric hierarchical mixture method. Biostatistics. 2004;5:155–176. doi:10.1093/biostatistics/5.2.155.
  24. Pan X, et al. A DNA integrity network in the yeast Saccharomyces cerevisiae. Cell. 2006;124:1069–1081. doi:10.1016/j.cell.2005.12.036.
  25. Reimers M, Weinstein JN. Quality assessment of microarrays: visualization of spatial artifacts and quantitation of regional biases. BMC Bioinformatics. 2005;6:166. doi:10.1186/1471-2105-6-166.
  26. Rocke DM, Durbin B. A model for measurement error for gene expression arrays. J Comput Biol. 2001;8:557–569. doi:10.1089/106652701753307485.
  27. Ruppert D, Wand MP, Carroll RJ. Semiparametric Regression. Cambridge University Press; NY: 2003.
  28. Shoemaker DD, et al. Quantitative phenotypic analysis of yeast deletion mutants using a highly parallel molecular bar-coding strategy. Nat Genet. 1996;14:450–456. doi:10.1038/ng1296-450.
  29. Smyth GK, et al. Use of within-array replicate spots for assessing differential expression in microarray experiments. Bioinformatics. 2005;21:2067–2075. doi:10.1093/bioinformatics/bti270.
  30. Suarez-Farinas M, et al. ‘Harshlighting’ small blemishes on microarrays. BMC Bioinformatics. 2005;6:65. doi:10.1186/1471-2105-6-65.
  31. Tarca AL, et al. A robust neural networks approach for spatial and intensity-dependent normalization of cDNA microarray data. Bioinformatics. 2005;21:2674–2683. doi:10.1093/bioinformatics/bti397.
  32. Tarpey T, Flury B. Self-consistency: a fundamental concept in statistics. Stat Sci. 1996;11:229–243.
  33. Venables WN, Ripley BD. Modern Applied Statistics with S-PLUS. 3rd edn. Springer; NY: 2001.
  34. Wilson DL, et al. New normalization methods for cDNA microarray data. Bioinformatics. 2003;19:1325–1332. doi:10.1093/bioinformatics/btg146.
  35. Yang YH, et al. Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Res. 2002;30:e15. doi:10.1093/nar/30.4.e15.
  36. Yuan DS, et al. Improved microarray methods for profiling the Yeast Knockout strain collection. Nucleic Acids Res. 2005;33:e103. doi:10.1093/nar/gni105.
  37. Zhou L, Rocke DM. An expression index for Affymetrix GeneChips based on the generalized logarithm. Bioinformatics. 2005;21:3983–3989. doi:10.1093/bioinformatics/bti665.
