The limits of log-ratios
© Sharov et al; licensee BioMed Central Ltd. 2004
Received: 15 November 2003
Accepted: 08 March 2004
Published: 08 March 2004
DNA microarray assays typically compare two biological samples and present the results of those comparisons gene-by-gene as the logarithm base two of the ratio of the measured expression levels for the two samples.
Because of the fixed dynamic range of fluorescence and other detection systems, there is a limit to the range of comparisons that can be made using any array technology, and this must be taken into account when interpreting the results of any such analysis.
The dynamic range of microarray data collection systems results in limits in the comparative analyses that can be derived from such measurements and suggests that optimal results can be obtained by making measurements that avoid the boundaries of that dynamic range.
DNA microarray analysis has become one of the most widely used techniques in modern molecular genetics, and the laboratory protocols that have developed in recent years have led to increasingly robust assays. The application of microarray technologies affords great opportunities for exploring patterns of gene expression and allows users to begin investigating problems ranging from deducing biological pathways to classifying patient populations.
As with all assays, the starting point for developing a microarray study is planning the comparisons that will be made, and the simplest experimental designs are based on the comparative analysis of two classes of samples, either using a series of paired case-control comparisons, or comparisons to a common reference sample, although other approaches have been described. But the fundamental question addressed using arrays is generally a comparison between paired samples to find genes that are significantly different in their patterns of expression. For the sake of the analysis presented here, we will focus on direct pair-wise comparisons between samples using spotted DNA arrays conducted as dual-labeled co-hybridization assays. However, it must be noted that the results we present here will impact other analyses including inferred relative changes derived by comparisons to a reference sample, through more complex loop designs, or from comparisons between single-color assays such as those which are commonly performed using the Affymetrix GeneChip™ or filter array platforms.
Results and Discussion
Measuring log-ratios on microarrays
Microarray experiments generally measure relative expression levels between biological samples. However, there is a fundamental limit to the changes that can be measured on an array, and understanding that these limits exist is important for analyzing microarray experiments. This observation depends fundamentally on the manner in which most microarray scanners work. Following hybridization of spectrally distinguishable labeled targets to the arrayed probes on a microarray, the surface of the slide is generally interrogated using one or more lasers, each tuned to excite a particular fluorescent label. The fluorescent light emitted from the surface is collected through an optical system, generally spectrally separated, and focused on a photon detector, usually a photomultiplier tube (PMT). PMTs have a glass photocathode window coated with one or more alkali metals that has a high probability of converting an incoming photon to an electron. The electron emitted from the window is attracted to an alkali-metal-coated electrode which is held at a positive potential. When the initial electron strikes the electrode, it normally releases a number of additional electrons. These are attracted to a series of coated electrodes, each maintained at a slightly higher voltage than the previous one, in effect multiplying the number of electrons released at each subsequent electrode. After a series of these amplification steps, the electrons are collected by a final electrode and the output current is measured. This output current depends on the intensity of the light (i.e. the number of photons) and the total voltage maintained across the PMT: a higher voltage accelerates electrons more in each step, producing a greater final current. This process is also stochastic, so the number of electrons produced by each photon can be modeled as a Gaussian distribution with mean μ and standard deviation σ.
As the light intensity increases, the number of photons increases, and this affects the distribution: N photons produce approximately Nμ final electrons with a standard deviation of σ√N. The relative uncertainty, σ/(μ√N), therefore shrinks as the signal grows, which explains, in part, why signal intensities, and consequently derived measurements such as log-ratios, are more uncertain for genes expressed at lower levels. Finally, the signal from the PMT is converted to a digital signal using an analog-to-digital converter (ADC). Typical array scanners use 16-bit ADCs, giving the instruments an output range of 0 to 65535 (2^16 − 1) relative fluorescence units (RFUs) for each pixel. The intensity value reported for each spot on the array varies between research groups and with the software used for image processing. Common measures of expression include background-subtracted mean or median pixel values for each arrayed gene. For the purposes of the analysis presented here, we will use the background-subtracted mean pixel values reported by the TIGR Spotfinder image analysis software.
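The photon statistics and 16-bit digitization described above can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: photons are treated as independent, each yielding a Gaussian number of electrons, and the conversion from output current to RFUs is not modeled. The function names (`pmt_electrons`, `digitize`, `relative_spread`) and the example values of μ and σ are hypothetical and are not taken from any scanner specification.

```python
import random
import statistics

ADC_MAX = 2**16 - 1  # 65535, the top of a 16-bit ADC output range


def pmt_electrons(n_photons, mu, sigma, rng):
    """Total electrons from n_photons, each drawn from Gaussian(mu, sigma).

    Assumption: photons contribute independently, so the total has
    mean n_photons * mu and standard deviation sigma * sqrt(n_photons).
    """
    return sum(rng.gauss(mu, sigma) for _ in range(n_photons))


def digitize(signal, adc_max=ADC_MAX):
    """Round a signal to an integer and clip it to the ADC output range."""
    return max(0, min(adc_max, round(signal)))


def relative_spread(n_photons, mu, sigma, trials=1000, seed=0):
    """Estimate standard deviation / mean of the simulated signal."""
    rng = random.Random(seed)
    samples = [pmt_electrons(n_photons, mu, sigma, rng) for _ in range(trials)]
    return statistics.stdev(samples) / statistics.mean(samples)


if __name__ == "__main__":
    # Hypothetical per-photon yield: mu = 100 electrons, sigma = 30.
    for n in (10, 250):
        print(n, relative_spread(n, mu=100.0, sigma=30.0))
    # Values above the ADC ceiling are reported as 65535 (saturation).
    print(digitize(70000.0))
```

Under these assumptions the estimated relative spread should come out near σ/(μ√N), so the weaker signal is relatively noisier, while any value beyond the ADC ceiling is reported at the ceiling.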
The finite output range of the detector limits the dynamic range of "fold-change" (equivalent to the log-ratio) measurements on arrays, particularly as the measured intensities approach either the minimum or maximum detectable levels accessible on a particular array scanner. These limits are not unique to dual-color detection techniques. Comparisons made using single-color microarrays are also limited by the dynamic range of the individual measurements, and fold-change estimates derived from such comparisons show exactly the same type of artifact.
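The way a bounded detector compresses log-ratios can be sketched as follows. This is an illustrative model rather than the analysis performed in this work: it assumes each channel is simply clipped to a fixed range, with a hypothetical lower limit `floor` of 1 RFU (in practice the usable minimum is set by background and noise) and the 16-bit ceiling of 65535 RFU. The function names are assumptions made for the example.

```python
import math

ADC_MAX = 2**16 - 1  # 65535 RFU, ceiling of a 16-bit scanner


def log_ratio_bounds(reference, floor=1.0, ceiling=ADC_MAX):
    """Smallest and largest log2 ratio measurable against a reference intensity.

    The other channel can read no lower than `floor` and no higher than
    `ceiling`, so the log-ratio lies in
    [log2(floor / reference), log2(ceiling / reference)].
    """
    if not floor <= reference <= ceiling:
        raise ValueError("reference must lie within the detectable range")
    return math.log2(floor / reference), math.log2(ceiling / reference)


def measured_log_ratio(true_a, true_b, floor=1.0, ceiling=ADC_MAX):
    """log2 ratio after each channel is clipped to the detectable range."""
    def clip(x):
        return max(floor, min(ceiling, x))
    return math.log2(clip(true_a) / clip(true_b))


if __name__ == "__main__":
    # Mid-range spot: an 8-fold change is recovered as a log-ratio of 3.
    print(measured_log_ratio(800.0, 100.0))
    # Bright spot: the same 8-fold change saturates one channel
    # and the reported log-ratio is compressed.
    print(measured_log_ratio(8 * 40000.0, 40000.0))
    print(log_ratio_bounds(256.0))
```

In this model the measurable fold-change window narrows on one side as the reference intensity moves toward either boundary, which is why spots near the floor or ceiling report compressed log-ratios.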
This work was supported by grants from the National Heart, Lung, and Blood Institute (NIH 1 U01 HL66580-01 and NIH-1 R33 HL3712-01), the US National Cancer Institute (NIH-U01-CA8552-01A1), and the National Science Foundation (NSF-DBI9975920 and NSF-DBI-0177281).
- Saeed AI, Sharov V, White J, Li J, Liang W, Bhagabati N, Braisted J, Klapa M, Currier T, Thiagarajan M, Sturn A, Snuffin M, Rezantsev A, Popov D, Ryltsov A, Kostukovich E, Borisovsky I, Liu Z, Vinsavich A, Trush V, Quackenbush J: TM4: A Free, Open Source System for Microarray Data Management and Analysis. Biotechniques. 2003, 374-378.
- Yang IV, Chen E, Hasseman JP, Liang W, Frank BC, Wang S, Sharov V, Saeed AI, White J, Li J, Lee NH, Yeatman TJ, Quackenbush J: Within the fold: assessing differential expression measures and reproducibility in microarray assays. Genome Biol. 2002, 3: research0062.
- Yang YH, Dudoit S, Luu P, Lin DM, Peng V, Ngai J, Speed TP: Normalization for cDNA microarray data: a robust composite method addressing single and multiple slide systematic variation. Nucleic Acids Res. 2002, 30: e15-10.1093/nar/30.4.e15.
This article is published under license to BioMed Central Ltd. This is an Open Access article: verbatim copying and redistribution of this article are permitted in all media for any purpose, provided this notice is preserved along with the article's original URL.