Detecting imbalanced expression of SNP alleles by minisequencing on microarrays

Background Each of the human genes or transcriptional units is likely to contain single nucleotide polymorphisms that may give rise to sequence variation between individuals and tissues on the level of RNA. Based on recent studies, differential expression of the two alleles of heterozygous coding single nucleotide polymorphisms (SNPs) may be frequent for human genes. Methods with high accuracy to be used in a high throughput setting are needed for systematic surveys of expressed sequence variation. In this study we evaluated two formats of multiplexed, microarray based minisequencing for quantitative detection of imbalanced expression of SNP alleles. We used a panel of ten SNPs located in five genes known to be expressed in two endothelial cell lines as our model system. Results The accuracy and sensitivity of quantitative detection of allelic imbalance were assessed for each SNP by constructing regression lines using a dilution series of mixed samples from individuals of different genotype. Accurate quantification of SNP alleles by both assay formats was evidenced by R² values > 0.95 for the majority of the regression lines. According to a two-sample t-test, we were able to distinguish 1–9% of a minority SNP allele from a homozygous genotype, with larger variation between SNPs than between assay formats. Six of the SNPs, heterozygous in either of the two cell lines, were genotyped in RNA extracted from the endothelial cells. The coefficient of variation between the fluorescent signals from five parallel reactions was similar for cDNA and genomic DNA. The fluorescence signal intensity ratios measured in the cDNA samples were compared to those in genomic DNA to determine the relative expression levels of the two alleles of each SNP. Four of the six SNPs tested displayed a higher than 1.4-fold difference in allelic ratios between cDNA and genomic DNA.
The results were verified by allele-specific oligonucleotide hybridisation and minisequencing in a microtiter plate format. Conclusions We conclude that microarray based minisequencing is an accurate and accessible tool for multiplexed screening for imbalanced allelic expression in multiple samples and tissues in parallel.


Background
Single nucleotide polymorphisms (SNPs) are highly abundant in the human genome, appearing on average at 0.1% of the nucleotide positions [1]. Thus, each gene or transcriptional unit will contain multiple SNPs that potentially give rise to sequence variation between individuals and tissues on the level of RNA. Recent studies indicate that differences in the expression levels of the alleles of heterozygous SNPs may occur frequently for human genes [2][3][4][5][6]. Imbalanced allelic expression was detected in foetal liver or kidney tissues for more than half of 602 genes analysed, and one third of the genes displayed more than four-fold differences in allelic expression [3]. Another study detected lower levels of allelic imbalance for one fifth of 129 genes analysed in lymphoblastoid cell lines [4].
Non-synonymous SNPs in coding regions of genes may be functional by altering an amino acid, which in turn may affect the structure and function of the encoded protein, while synonymous SNPs may have functional consequences by affecting the stability or folding of mRNA transcripts. Intronic SNPs may give rise to alternatively spliced mRNAs, while SNPs in 5'- or 3'-untranslated mRNA regions may affect the stability or processing of the RNA. Moreover, SNPs in non-protein coding regions of genes that affect binding of regulatory factors may cause imbalanced expression of SNP alleles. This form of genetic variation has been suggested as a common cause of both normal and disease-related inter-individual variation in complex phenotypes [7]. Clearly, methods with high accuracy that can be used in a high throughput setting are needed for systematic surveys of expressed sequence variation and its molecular causes.
Owing to the high sequence specificity of nucleotide incorporation by DNA-polymerases, single nucleotide primer extension has proven to allow quantitative determination of SNPs in genomic DNA in several studies and assay formats (for a review, see Syvänen 2001 [8]). A frequently used quantitative application of the method is to determine SNP allele frequencies in pooled DNA samples [9][10][11][12][13]. The rationale for detecting imbalanced expression of the two alleles of a heterozygous SNP by minisequencing is to measure the ratio between the amounts of labelled nucleotides incorporated in the minisequencing reactions for the two SNP alleles in RNA (cDNA) samples from the tissue of interest. These ratios are then compared to the corresponding ratio measured in genomic DNA, where the two alleles are present in an equimolar ratio [2,4][14][15][16]. Imbalanced expression of the alleles of a SNP is revealed by a difference in the ratios measured in the RNA and DNA samples.
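The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the calculation used in the study: the function names and signal values are hypothetical, and averaging the per-reaction ratios before dividing is an assumption.

```python
from statistics import mean


def allele_ratio(signal_allele_1, signal_allele_2):
    """Signal intensity ratio between the two alleles in one reaction."""
    return signal_allele_1 / signal_allele_2


def normalised_expression_ratio(cdna_reactions, gdna_reactions):
    """Mean allele ratio in cDNA divided by the mean allele ratio in gDNA.

    Each argument is a list of (signal_allele_1, signal_allele_2) tuples,
    one tuple per parallel reaction. A result of 1.0 means balanced
    expression; averaging the ratios first is an assumption of this sketch.
    """
    cdna = mean(allele_ratio(a, b) for a, b in cdna_reactions)
    gdna = mean(allele_ratio(a, b) for a, b in gdna_reactions)
    return cdna / gdna


# Hypothetical fluorescence signals (arbitrary units), three reactions each.
cdna_signals = [(3000.0, 1000.0), (3300.0, 1100.0), (2700.0, 900.0)]
gdna_signals = [(1500.0, 1000.0), (1650.0, 1100.0), (1350.0, 900.0)]

fold_difference = normalised_expression_ratio(cdna_signals, gdna_signals)
print(f"cDNA/gDNA ratio: {fold_difference:.2f}")
```

Dividing by the gDNA ratio removes the assay-specific bias between the two labelled nucleotides, because the heterozygous gDNA sample carries the two alleles in equimolar amounts.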
We are currently using microarray based minisequencing for multiplex genotyping of SNPs. Our custom-made microarrays permit the genotyping of up to 100 SNPs in 80 samples per standard microscope slide, either using immobilised minisequencing primers [17,18] or using a "tag-array" format [19,20] of the method [13]. The purpose of this study was to evaluate the performance of these two microarray formats in quantitative determination of SNP alleles on the RNA level as alternatives with higher multiplexing capacity than previously used primer extension methods in which the SNPs are analysed in individual reactions. Using these systems, we were able to detect significant differences in the amounts of the two alleles of heterozygous SNPs on the RNA level.

Results
We used a panel of ten coding SNPs in five genes to choose the optimal microarray based minisequencing strategy for multiplex, quantitative genotyping of SNPs in DNA and RNA samples. The selected SNPs were located in genes shown by reverse transcriptase PCR analysis to be expressed in one or both of two endothelial cell lines, HUVEC (human umbilical vein endothelial cells) and HAEC (human aortic endothelial cells), that served as our model cell lines in this study (data not shown). We evaluated two formats of microarray based minisequencing by performing five parallel assays with each method for each sample in the evaluation. The SNPs were analysed in both DNA polarities and the evaluation of the methods was based on the DNA polarity yielding the highest signal-to-noise ratio.
In Method I, immobilised minisequencing primers are extended with fluorescently labelled ddNTPs in reactions performed on the microarray surface after annealing of the multiplex PCR products to the primers [18,21]. In Method II, cyclic primer extension reactions are performed in solution in the presence of 5'-tagged minisequencing primers, PCR products and fluorescent ddNTPs [22,23]. After the cyclic reactions the extended primers are captured on a microarray surface carrying immobilised oligonucleotides complementary to the 5'-tag sequences on the minisequencing primers. Both these systems are performed in an "array-of-arrays" format developed previously in our laboratory [24].
We analysed a dilution series with mixtures of DNA from two individuals with different genotypes for the panel of ten SNPs in both DNA polarities. The genotyping results from these mixtures of known amounts of the two SNP alleles are expressed as the signal ratio between the fluorescence signals corresponding to the two alleles of each SNP. The quantitative analysis of these ten SNPs is illustrated in Figure 1 by regression lines, in which the mean signal intensity ratios are plotted as a function of the known allelic ratios in the mixed samples. The coefficient of determination (R²), which describes how well the regression line fits the data points, was used to assess the accuracy of quantification of the SNP alleles by Methods I and II. As can be seen in Table 2, the R² values were > 0.95 for the majority of the regression lines.

Figure 1. Regression lines displaying accuracy and sensitivity of quantitative genotyping of SNPs. The regression lines were obtained by analysing ten SNPs in a series of mixed samples with varying amounts of DNA from two individuals of different genotype. The signal intensity ratios from minisequencing using immobilised primers (Method I, black diamonds) and "tag-array" minisequencing (Method II, grey squares) are plotted as a function of the known allelic ratios in the mixed samples. The SNP names are given in the panels. The signal intensity ratios are mean values of five replicate reactions. The signal intensity ratios obtained in homozygous samples (allele ratios zero) are indicated as black diamonds and grey squares on the left vertical axis of each panel.

Table 2 footnotes. b Minisequencing using (I) immobilised primers; (II) cyclic primer extension and "tag-arrays". c Coefficient of determination describing the fit between the regression lines in Figure 1 and the data points. d Level at which the minority SNP allele can be detected. The percentages correspond to the mixture with a signal intensity ratio significantly different (p < 0.05) from the signal intensity ratio of the corresponding homozygous or heterozygous (*) sample. ND: Not possible to determine due to scatter of the data points. e p-value for difference between signal ratios at the detection level and at 0% of the minority allele.

We also determined the sensitivity of the methods for detection of a minority allele. The detection limit was defined as the percentage of the minority allele in the mixed sample, for which the signal ratio differed from the signal ratio in the corresponding homozygous sample with a p-value < 0.05 in a two-sample t-test. Depending on the genotype of the DNA samples used for the dilution series, determination of the lower limit of detection was possible for seven of the ten SNPs with allele ranges 0-50% or 0-100% in the mixed samples (Table 1). For the remaining three SNPs with the allele range 50-100%, the smallest percentage of an allele that could be distinguished from a heterozygous genotype was identified by the same approach. Using Method I, we were able to detect less than 5% of the minority allele for two SNPs (rs1042713 ADRB2 and rs5925 LDLR) and less than 9% for rs4331 ACE, rs1042719 ADRB2, rs5351 EDNRB and rs5930 LDLR (Table 2). Method II allowed more sensitive detection of minority alleles than Method I. Less than 2% was detectable for the SNPs rs1042713 ADRB2, rs1042719 ADRB2 and rs5351 EDNRB, and less than 9% was detectable for the SNPs rs4331 ACE, rs5925 LDLR, rs5930 LDLR and rs1433099 LDLR (Table 2). For the SNPs rs1042714 ADRB2, rs1042718 ADRB2 and rs1799983 NOS3, we were able to measure 4-14% deviations from the heterozygous genotype (Table 2). These results show that the amount of SNP alleles can be accurately determined on the DNA level by Methods I and II using reference samples with the two SNP alleles present in known ratios.
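As an illustration of how a coefficient of determination of this kind can be obtained, the following Python sketch fits an ordinary least-squares line to signal ratios measured at known allelic ratios. The function name and the data points are hypothetical, and ordinary least squares is assumed because the text does not specify the fitting procedure beyond "linear regression".

```python
def linear_fit_r2(x, y):
    """Least-squares line y = intercept + slope * x and its R² value."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R² = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical standard curve: known allelic ratios in the mixed samples
# and the mean signal intensity ratios measured for them.
known_ratios = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
signal_ratios = [0.04, 0.27, 0.43, 0.66, 0.85, 1.03]

slope, intercept, r_squared = linear_fit_r2(known_ratios, signal_ratios)
print(f"slope={slope:.3f} intercept={intercept:.3f} R2={r_squared:.4f}")
```

An R² close to 1 indicates that the signal ratio responds linearly to the allelic ratio over the tested range, which is the criterion applied to the regression lines in Figure 1.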
Next, the performance of the two methods in quantitative analysis on the RNA level was assessed. The ten SNPs were first genotyped in genomic DNA (gDNA) from the HUVEC and HAEC cells to identify those SNPs that were heterozygous in either or both cell lines. Three SNPs in the low density lipoprotein receptor gene (LDLR; rs5925, rs5930 and rs1433099) were heterozygous in the HAEC cell line, and one SNP in each of the genes encoding angiotensin I converting enzyme (ACE rs4331), β2-adrenergic receptor (ADRB2 rs1042719) and endothelin receptor type B (EDNRB rs5351) was heterozygous in the HUVEC cell line. These SNPs were genotyped in cDNA produced from total RNA extracted from the cells with the corresponding gDNA as reference samples using both methods. Table 3 presents the mean fluorescence signals with coefficients of variation (CV) obtained in five parallel reactions for the six SNPs in cDNA and gDNA from the HUVEC and HAEC cells. For the heterozygous SNPs the largest difference in the variability between parallel reactions was observed between SNPs, with the lowest CV values (3.6-8.6%) for the rs1042719 ADRB2 SNP, and the highest CV values (13-41%) for the rs1433099 LDLR SNP. No systematic differences in the variability of parallel reactions were observed between Method I and Method II, or between cDNA and gDNA. Table 4 shows the differences in mean signal intensity ratios between the cDNA and gDNA assays for the six SNPs that were heterozygous in HUVEC or HAEC cells, respectively, together with the corresponding normalized cDNA/gDNA ratios. The SNPs in the ACE, ADRB2 and EDNRB genes displayed significantly imbalanced expression in the HUVEC cells using both methods. For the SNP rs4331 ACE, the signal intensity ratio based on the raw data obtained by Methods I and II differed from each other, but despite this large difference, both methods yielded similar levels of allelic imbalance for this SNP after normalisation against the signal ratio in gDNA (Table 4).
For only one of the three LDLR SNPs (rs5930) did the difference in fluorescence intensity ratios between cDNA and gDNA from HAEC cells reach statistical significance by both methods. Allelic imbalance of the LDLR gene was detected for the LDLR SNP rs5925 using Method II only.
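The coefficient of variation between parallel reactions reported above is the standard deviation expressed as a percentage of the mean. A minimal Python sketch follows; the signal values are hypothetical, and the use of the sample (n-1) standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean."""
    return 100.0 * stdev(values) / mean(values)


# Hypothetical fluorescence signals from five parallel reactions.
parallel_signals = [90.0, 100.0, 110.0, 95.0, 105.0]

cv_percent = coefficient_of_variation(parallel_signals)
print(f"CV = {cv_percent:.1f}%")
```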
To test whether the imbalanced allelic expression detected by the multiplexed microarray based methods represents the true biological situation in the cells, we analysed the heterozygous SNPs in five replicate RNA samples prepared from HUVEC or HAEC harvested at different time points from different cell culture flasks. We also analysed the three LDLR SNPs in five replicate reverse transcription reactions from the same RNA sample prepared from HAEC cells. For this analysis we used our first generation solid-phase minisequencing assay for individual SNPs in a microtiter plate format. The concordant cDNA/gDNA ratios from these control experiments from independent cell and RNA samples presented in Table 5 show that the detected allelic imbalance was not caused by the procedures for RNA extraction or cDNA synthesis. Finally, we verified the results obtained by microarray-based minisequencing for three of the SNPs by real-time PCR with allele specific hybridization probes (TaqMan). Table 4 shows these results together with the corresponding results by solid-phase minisequencing in a microtiter plate format. Allelic imbalance was detected with statistical significance for the SNP rs1042719 ADRB2 and the SNP rs1433099 LDLR by both methods. Particularly for the SNP rs1042719 ADRB2, the cDNA/gDNA ratios obtained by the two reference methods were highly similar to the results from the microarray-based methods presented in Table 4, as well as to each other. As for the microarray-based Method II, the difference in signal ratios between cDNA and gDNA measured by the TaqMan assay for the SNP rs5925 LDLR did not reach statistical significance due to large variation between parallel assays. Analysis of the SNP rs1433099 LDLR by the reference methods confirms the imbalanced expression of the LDLR alleles.

Discussion
The purpose of our study was to evaluate microarray based minisequencing for multiplexed detection and quantification of imbalanced expression of SNP alleles, as a prelude to further large scale screening for allelic imbalance. We found no significant differences in the performance of our two "in house" methods, minisequencing with primers directly immobilised on the microarrays (Method I) [18] and the "tag-array" format, based on cyclic minisequencing followed by capture on microarrays using immobilised complementary "tag" probes (Method II) [23]. Both methods showed a linear relationship between SNP allele ratios and the signal intensity measured in the four-colour fluorescence minisequencing assay for all SNPs. With respect to accuracy assessed by coefficients of variation (CV) between five parallel assays both methods performed equally well, and the CV values between parallel assays were indistinguishable between genomic DNA and reverse transcribed cDNA samples. The sensitivity of detecting a SNP allele present as a minority in a sample was defined as the percentage for which the signal ratio differed from the signal ratio in the corresponding homozygous sample with a p-value < 0.05 in a two-sample t-test. The sensitivity differed between SNPs, and ranged from 1% to 9%, with a trend to be slightly better using the "tag-array" system (Method II). In several cases the p-values were lower than 0.05 (Table 2), which indicates that in practice the detection limit would be lower than the stringent limit set here. The sensitivity of our multiplex microarray based minisequencing methods compares well with the sensitivity of other single nucleotide primer extension assays performed for individual SNPs in recent studies [4][25][26][27].

Table footnotes. a rs4331 ACE, rs1042719 ADRB2 and rs5351 EDNRB were analysed in HUVEC and rs5925 LDLR, rs5930 LDLR and rs1433099 LDLR were analysed in HAEC. b Minisequencing using (I) immobilised primers; (II) cyclic primer extension and "tag-arrays", TaqMan (TM) and solid-phase minisequencing (MS). TM assays were not possible to design using the Assay-by-Design and Assay-on-Demand service at ABI for the remaining three SNPs; MS was performed for the SNPs with working TM assays. c Mean and range are given for five parallel reactions.
It is notable that the largest differences in accuracy and sensitivity were observed between SNPs. Some of the SNP-to-SNP differences are likely due to differences in the accuracy and efficiency of incorporation of the four different fluorescently labelled nucleotide analogues by the DNA polymerase [13,26] as well as to other sequence context dependent factors. The large variation between parallel assays for the SNP rs1433099 LDLR prevented detection of the allelic imbalance for the LDLR gene, while imbalance was detected by the SNP rs5930 LDLR using both methods. This result demonstrates that it is preferable to analyse more than a single SNP in each gene in systematic screening for allelic imbalance in gene expression. As more data from primer extension assays accumulate, it may be possible to improve the accuracy of the system by improving the SNP selection and assay design further with the aid of algorithms developed based on these data [28,29].
Comparison of the relative amounts of the alleles of six SNPs on the RNA (cDNA) level to those of the heterozygous SNPs in genomic DNA revealed four SNPs with imbalanced expression of the two alleles. A three-fold increase in the expression of the T-allele for the SNP rs4331 ACE was the most pronounced difference observed. In our study, 1.4-1.5-fold differences in allelic expression levels were detectable. The sensitivity of detecting a minority allele in our system would allow the distinction between 10-fold reduction in the expression of an allele and monoallelic expression, for example as a result of imprinting. Owing to its potential for high throughput screening of large numbers of samples, we have also performed a preliminary evaluation of the commercial SNPstream genotyping system (GenomeLab, Beckman Coulter) that also utilises the "tag-array" primer extension strategy in a semi-automated 384-well microtiter plate format for detection of imbalanced allelic expression [30]. The same trend of imbalanced allelic expression was observed for each of the SNPs, which is encouraging for future studies of imbalanced allelic expression in a high throughput semi-automated way. Other studies that have used fluorescent single base primer extension assays report that 1.2-fold to 1.5-fold differences in allelic expression are detectable [2,4,5]. Primer extension methods based on direct measurement of fluorescent signals, including the microarray-based methods evaluated here, are likely to provide better accuracy and sensitivity for allele quantification than homogeneous primer extension based on fluorescence polarisation [31,32], in which the allele quantification relies on measurement of small differences between large polarization signals.
It is also reassuring for future large scale detection of imbalanced allelic expression that the accuracy of our methods seemed to be similar for cDNA and genomic DNA. Analysis of replicate RNA samples from different batches of both cell lines using a microtiter plate format of the minisequencing method supported the biological authenticity of the allelic imbalance detected using minisequencing in the microarray format. The data obtained from independent cell samples also indicate an acceptable reproducibility of RNA extraction, RNA storage and cDNA synthesis. Another important factor besides sample to sample variation that may affect the accuracy of the relative allele quantification is the amount of mRNA subjected to the analysis. At a low copy number of mRNA, the stochastic distribution of the RNA templates may be a major source of variation [33]. The large variation between parallel assays for the LDLR gene observed with all four methods used in our study may reflect a low expression level of the LDLR gene in the HAEC cells. Moreover, the amount of gene specific transcript in each RNA sample may vary, which makes it difficult to perform balanced multiplex RT-PCRs to screen for allelic imbalances in several genes in one reaction.
A minisequencing strategy similar to the one used for determination of imbalanced expression between SNP alleles can also be used for determination of the relative expression levels of highly homologous genes [15] and for determination of alternatively spliced transcripts [34], a resolution that is beyond the capacity of traditional microarray based RNA expression profiling.

Conclusions
Here we demonstrated the applicability of two formats of microarray based minisequencing for detecting imbalanced expression of SNP alleles. The accuracy and sensitivity of both systems allow detection of 1.4- to 10-fold differences in the expression levels of the two alleles of heterozygous SNPs. The microarray-based minisequencing systems utilise widely available reagents and equipment, and can thus easily be established "in-house". Moreover, the system is flexible with respect to number of SNPs and samples to be analyzed. Systematic quantitative screening of genetic diversity on the RNA level in multiple individuals and tissues will be a future approach in the elucidation of the molecular mechanisms that regulate gene expression.

DNA and RNA samples
DNA samples from 30 volunteer donors were genotyped by Methods I and II to identify individuals of different genotypes for the panel of ten SNPs analysed. The SNPs are described in the section "SNPs and primers" below. DNA (10 ng/µl) from one individual was serially diluted 2:1 into DNA (10 ng/µl) from a second individual, to yield a series of DNA samples with different ratios between the SNP alleles. These mixed DNA samples were used for construction of quantification standard curves.
Depending on the genotype of each SNP in the two individuals whose DNA was mixed, dilution series of samples with different allelic ranges were obtained for the ten SNPs, as specified in Table 1.
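The dependence of the allelic range on the two genotypes can be made concrete with a short Python sketch. It assumes that both DNA samples have the same concentration (10 ng/µl, as stated above), so that the volume fraction equals the genome fraction. The function name and the mixing fractions are hypothetical and do not reproduce the actual dilution steps.

```python
def allele_fraction_in_mix(copies_first, copies_second, fraction_second):
    """Expected fraction of one SNP allele in a mixture of two DNA samples.

    copies_first / copies_second: copies of the allele per diploid genome
    (0, 1 or 2) in the first and second individual.
    fraction_second: share of the total DNA that comes from the second
    individual (0.0 to 1.0), assuming equal DNA concentrations.
    """
    share_first = copies_first / 2
    share_second = copies_second / 2
    return (1 - fraction_second) * share_first + fraction_second * share_second


# Hypothetical mixing fractions for the second individual's DNA.
fractions = [0.0, 0.25, 0.5, 0.75, 1.0]

# Homozygote mixed with heterozygote: allele range 0-50%.
hom_het = [allele_fraction_in_mix(0, 1, f) for f in fractions]
# Two opposite homozygotes: allele range 0-100%.
hom_hom = [allele_fraction_in_mix(0, 2, f) for f in fractions]
# Heterozygote mixed with homozygote: allele range 50-100%.
het_hom = [allele_fraction_in_mix(1, 2, f) for f in fractions]

print(hom_het, hom_hom, het_hom)
```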

Preparation of microarrays
The minisequencing primers or the complementary tag oligonucleotides were covalently immobilised on CodeLink™ Activated Slides (Amersham Biosciences, Uppsala, Sweden) via an NH2-group at their 5'- or 3'-end, respectively. The oligonucleotides were applied in duplicate to the slides at a concentration of 25 µM in 150 mM sodium phosphate pH 8.5 using a ProSys 5510A instrument (Cartesian Technologies Inc, Irvine, CA, USA) equipped with one Stealth Micro Spotting pin (SMP3B, TeleChem International Inc., Sunnyvale, CA, USA) to minimise the variation between spots in different "subarrays". The oligonucleotides were spotted in an "array-of-arrays" configuration that facilitates analysis of 80 individual samples in parallel on each microscope slide [24]. In each "subarray" a fluorophore-labelled oligonucleotide was included as a control for the immobilisation process. A reference oligonucleotide, complementary to a synthetic template included in the minisequencing reaction mixtures to monitor the difference in incorporation efficiency of the four nucleotides by the DNA polymerase, was also included in each "subarray". Finally, an oligonucleotide designed not to hybridise to any of the oligonucleotides present in the reaction mixture was included in each "subarray" to be used for background corrections. After printing, the slides were incubated in a humid chamber for at least 24 hours, followed by treatment with ethanolamine according to the manufacturer's instruction. The slides were then stored desiccated in the dark until use.

Minisequencing using immobilised primers (Method I)
Aliquots of 7.5 µl of the concentrated PCR products were analysed in five parallel "subarrays" for each sample, essentially as described previously [18]. The PCR products were allowed to anneal to the immobilised oligonucleotides. After washing, the extension reactions were performed with 0.75 U of Thermo Sequenase™ DNA polymerase (Amersham Biosciences, Uppsala, Sweden) and 0.35 µM Texas Red-ddATP, Tamra-ddCTP, R110-ddGTP and Cy5-ddUTP (Perkin Elmer Life Sciences, Boston, MA, USA) in Thermo Sequenase™ reaction buffer in a total volume of 15 µl, followed by washing of the slide.

Minisequencing using "tag-arrays" (Method II)
Five parallel reactions with a 4.5 µl aliquot of the concentrated PCR products were analysed for each sample, as described in detail in [23]. Excess PCR primers and dNTPs were removed by treatment with 5 U of exonuclease I and 1 U of shrimp alkaline phosphatase (USB Corporation, Cleveland, OH, USA). The cyclic minisequencing reactions were performed in the presence of the 20 tagged primers at 10 nM concentration, 0.1 µM Texas Red-ddATP, Tamra-ddCTP and R110-ddGTP, 0.2 µM Cy5-ddUTP (Perkin Elmer Life Sciences, Boston, MA, USA) and 1 U of Thermo Sequenase™ DNA polymerase (Amersham Biosciences, Uppsala, Sweden) for 55 cycles of 95°C and 55°C for 20 s each. The extension products were allowed to anneal to the immobilised complementary tag oligonucleotides at 42°C for 2.5 hours followed by washing of the slide.

Solid-phase minisequencing in a microtiter plate format
PCR was run with one of the primers biotinylated. The biotinylated PCR products were immobilised in a microtiter plate coated with streptavidin (Combiplate 8, Labsystems, Helsinki, Finland) and the unbiotinylated strand was removed with alkali treatment [9,15]. The minisequencing mixture, containing the appropriate tritium labelled dNTP (Amersham Biosciences, Uppsala, Sweden), AmpliTaq® DNA polymerase (Applied Biosystems, Foster City, CA, USA) and the minisequencing primer, was added. The extension reaction was allowed to proceed for 10 min at 50°C. The extended primers were released with alkali and the amount of incorporated tritium labelled nucleotide was measured.

Hybridisation with allele-specific TaqMan probes
Primers and probes for the TaqMan assays were designed by Applied Biosystems as Assay-by-Design (rs1042719 ADRB2 and rs5925 LDLR) or Assay-on-Demand (rs1433099 LDLR) service. The probes for the two alleles were labelled with the reporter dyes FAM and VIC respectively. The sequences of the primers and probes for the SNPs rs5925 LDLR and rs1042719 ADRB2 are found in [Additional file 1]. The primer and probe sequences for the SNP rs1433099 LDLR were not made available to us by ABI since this SNP is included in their Assay-on-Demand program.
Real time quantitative PCR was run in 25 µl TaqMan Universal PCR Master Mix (Applied Biosystems) with 200 nM of both labelled TaqMan probes, 900 nM PCR-primers and 10 ng genomic DNA or one tenth of the cDNA products. The PCR conditions were initial activation of the enzyme at 95°C for 10 min followed by 60 cycles of 95°C for 15 sec and 60°C for 1 min in an ABI7000 instrument (Applied Biosystems, Foster City, CA, USA).
The signal intensity ratios were calculated based on normalised ∆Rn fluorescence values obtained from the assay during the exponential phase of PCR. The ∆Rn values were retrieved from cycle 38 for the SNP rs1042719 ADRB2, cycle 42 for the SNP rs5925 LDLR and cycle 43 for the SNP rs1433099 LDLR. Imbalanced expression of the SNP alleles was determined by a t-test as described below.

Signal detection and data analysis
In Methods I and II fluorescence was measured using a ScanArray® Express instrument (Perkin Elmer Life Sciences, Boston, MA, USA) with the excitation lasers Blue Argon 488 nm, Green HeNe 543.8 nm, Yellow HeNe 594 nm and Red HeNe 632.8 nm with the laser power set to 80% and the photomultiplier tube gain adjusted to obtain equal signal intensities from reaction control spots for all four spectra. The fluorescence signals were extracted using the QuantArray® analysis 3.1 software (Perkin Elmer Life Sciences, Boston, MA, USA). The mean of the fluorescence signals for the duplicate spots was corrected for the average background in each "subarray" separately. The data were handled and interpreted using the Microsoft® Excel program.
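The background correction can be illustrated with a small Python sketch. It assumes that the correction is a simple subtraction of the mean background signal of the same subarray, applied separately to each fluorescence channel. The text does not spell out the arithmetic, and the function name and intensity values are hypothetical.

```python
from statistics import mean


def corrected_signal(duplicate_spots, background_spots):
    """Mean signal of the duplicate spots minus the mean background signal.

    Both lists must come from the same subarray and the same fluorescence
    channel; subtraction is an assumption of this sketch.
    """
    return mean(duplicate_spots) - mean(background_spots)


# Hypothetical raw intensities for one SNP in one subarray:
# one channel per allele, each with duplicate spots and background spots.
allele_1 = corrected_signal([5200.0, 4800.0], [150.0, 250.0])
allele_2 = corrected_signal([2500.0, 2700.0], [180.0, 220.0])

# Signal ratio between the two alleles after background correction.
signal_ratio = allele_1 / allele_2
print(allele_1, allele_2, signal_ratio)
```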
The genotype for each individual SNP was assigned by calculating a ratio between the fluorescence signals for the two alleles. Coefficients of determination (R²) were assigned by linear regression analysis of the relationship between the signal intensity ratios determined from the minisequencing assay and the known allelic ratios in the mixed samples for the quantification standard curves. Two-sample t-tests with two-tailed significance levels assuming unequal variance were performed to determine the lowest level of detection of a specific allele for the quantification standard curves and to evaluate the imbalanced expression of the two alleles of the SNPs in the cell lines.
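A two-tailed, two-sample t-test assuming unequal variance is commonly known as Welch's t-test. The sketch below shows one way to compute it with the Python standard library only. The study itself used Microsoft Excel, so this is an illustration rather than the original procedure. The p-value is obtained by numerical integration of Student's t density, and the signal ratios are hypothetical.

```python
import math
from statistics import mean, variance


def t_density(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    return math.exp(log_coef) / math.sqrt(df * math.pi) * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=10000):
    """Two-tailed p-value; integrates the density over [0, |t|] with Simpson's rule."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_density(0.0, df) + t_density(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    central = total * h / 3  # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * central)


def welch_t_test(a, b):
    """Return (t, degrees of freedom, two-tailed p) without assuming equal variances."""
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df, two_tailed_p(t, df)


# Hypothetical signal ratios from five parallel reactions each.
homozygous = [0.020, 0.022, 0.018, 0.021, 0.019]
mixed = [0.030, 0.032, 0.028, 0.031, 0.029]

t, df, p = welch_t_test(mixed, homozygous)
print(f"t={t:.2f} df={df:.1f} p={p:.2g} detectable={p < 0.05}")
```

Applied along a dilution series, the lowest percentage of the minority allele for which `p < 0.05` against the homozygous sample corresponds to the detection limit as defined in this study.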

Authors' contributions
UL participated in the design of the study and in RNA and DNA extraction, and performed all the laboratory work involving "in-house" minisequencing methods, performed the statistical calculations and drafted the manuscript. MF cultured the cells, performed RNA and DNA extraction, performed the assays with the reference method, and provided input to the manuscript. AD performed the assays using the SNPstream system. A-CS conceived the study, participated in its design, coordination and in preparation of the manuscript. All authors read and approved the final manuscript.