Quantifying 35 transcripts in a single tube: model-based calibration of the GeXP multiplex RT-PCR assay

Background Quantitative analysis of differential gene expression is of central importance in molecular life sciences. The Gene eXpression Profiling technology (GeXP) relies on multiplex RT-PCR and subsequent capillary electrophoretic separation of the amplification products and allows to quantify the transcripts of at least 35 genes with a single reaction and one dye. Results We provide a kinetic model of primer binding and PCR product formation as the rational basis for taking and evaluating calibration curves. The calibration procedure and the model predictions were validated with the help of a purposefully designed data processing workflow supported by easy-to-use Perl scripts for calibration, data evaluation, and quality control. We further demonstrate the robustness and linearity of quantification of individual transcripts at variable relative abundance of other co-amplified transcripts in a complex mixture of RNAs isolated from differentiating Physarum polycephalum plasmodial cells. Conclusions We conclude that GeXP analysis is a robust, sensitive, and useful method when the transcripts of tens to few hundred genes are to be precisely quantified in a high number of samples. Supplementary Information The online version contains supplementary material available at 10.1186/s12896-021-00689-4.


Background
The quantitative analysis of differential gene expression is of central importance in cell, developmental, and systems biology. Established experimental approaches have specific strengths and weaknesses and the method of choice depends on the required accuracy for quantitative detection, the available amount of RNA per sample, the number of different transcripts that are to be quantified in one sample, and the available budget [1,2]. Transcript quantification by the next generation sequencing RNA-Seq method, for example, allows to quantify the entire transcriptome with an accuracy depending on the number of reads analysed. However, the quantification of one sample is relatively expensive in terms of sequencing costs and requires sophisticated bioinformatic data processing [3,4]. Real-time quantitative RT-PCR on the other hand is relatively inexpensive and easy to perform but only gives the transcript concentration for up to few genes per sample. A different, appealing, and useful approach is the GeXP multiplex RT-RCR developed by Beckman-Coulter. 1 This method relies on the simultaneous amplification of the cDNAs of a moderate number of genes in the same sample and the subsequent separation of the fluorescence-tagged amplification products by capillary electrophoresis. The number of transcripts that can be quantified in a single pot is limited by the number of fragments that can be thoroughly separated from each other. With the CEQ 8800™ Genome Analysis System, an 8-capillary sequencer, at least 35 different transcripts can be readily quantified using a single fluorescent dye in a single tube [5], about ten times more than by one multiplex Real-time PCR assay [2]. Thus, GeXP can considerably save time, money, and RNA if transcripts of tens to hundreds of genes are to be quantified in a high number of samples.
The method is depicted in Fig. 1. Each transcript of interest is reverse transcribed with a gene-specific RT-primer (reverse primer) and the second strand of the cDNA is synthesized with a gene-specific forward primer. Both, the forward and the reverse primers carry additional sequence stretches at their 5′-ends that are identical for all gene-specific primer pairs used in the multiplex RT-PCR reaction. During PCR these sequences serve as specific binding sites for the two universal primers. The purpose of the universal primers is to amplify all cDNAs in the reaction mix independent of their length or sequence with equal efficiency during each cycle of the PCR. With the help of a spike-in RNA, an internal standard which is reverse transcribed together with the RNAs to be analysed, the amount of cDNA fragment amplified from each gene of interest can be normalized relative to the amount of fragment amplified from the spike-in RNA so that the relative transcript abundance for each gene becomes comparable between samples. By producing fragments of different length that are separated electrophoretically (Fig. 1), the relative abundances of the 35 chosen transcripts can be determined provided the system is calibrated appropriately. An appropriate calibration procedure and its validation is described in this paper. The normalized peak area for each amplification product which is determined during the electrophoretic separation can be fitted to the total RNA concentration by linear regression of the data points on a double logarithmic plot [6,7]. Here, we demonstrate that this empirical calibration procedure complies with a kinetic model of cDNA synthesis and amplification. The kinetic model predicts for a set of different transcripts a series of curves of equal slope shifted in parallel to each other. We experimentally confirm the model predictions and the validity of the modelbased calibration procedure. We also provide so far missing necessary controls showing that the amplification efficiency of a given transcript is independent of the relative abundance of the transcripts of other genes that are co-amplified in a sample. Finally, we apply these methods to estimate the relative concentration of different transcripts of differentiation marker genes expressed in Physarum polycephalum plasmodial cells that are committed to sporulation [6] and show how the reproducibility of a high number of measurements can be computationally analysed. We also provide a set of Perl scripts as components of an integrated workflow for efficient data processing and evaluation, which is demonstrated in the context of the case study.

Kinetic model
As the concentration of the transcripts of individual genes within a given sample of total RNA may vary by orders of magnitude relative to each other, the conditions of a GeXP experiment have to be adjusted such that the amplified fragments can be reliably detected side by side when separated on the capillary sequencer. This is achieved by empirically adjusting the concentration of the gene specific reverse primers so that at intermediate physiological concentration of each transcript the amount of cDNA amplified is in the same range for all transcripts that are to be detected in one reaction tube ( Fig. 2) [5][6][7]. Hence the concentration of the primer for the reverse transcription of certain transcripts may become rate limiting for the amount of cDNA synthesized. In the following we describe a kinetic model of the RT-PCR reaction RNA → RT cDNA 1st → PCR cDNA amp as rational basis for evaluating the double logarithmically plotted calibration curves for the transcripts to be analysed. The mRNAs (RNA) are reverse transcribed to give a first-strand cDNA (cDNA 1st ) which is then amplified with a universal primer set to obtain a PCR product (cDNA amp ) for each transcript which is electrophoretically separated on a capillary sequencer, as shown in Fig. 1).
Assuming that at maximum 2% of the total RNA isolated from a eukaryotic cell is mRNA, a 10 μl RT reaction volume with the amount of total RNA as used in the experiments contains not more than 0.4 ng of mRNA. With approximately 17,000 genes expressed in a starving P. polycephalum plasmodium and an average length of a mature mRNA of approximately 1000 nucleotides [8], this is equivalent to an average concentration of 6.9 × 10 − 15 M for the mRNA of one gene corresponding to approximately 41,500 molecules per reaction volume. The smallest RT primer concentration used in our RT reaction (Table S1) is 2.5 × 10 − 11 M, as compared to 2.7 × 10 − 8 M for the average and 5 × 10 − 8 M for the maximal concentration, respectively. For a specific mRNA of 6.9 × 10 − 12 M which is 1000 times more abundant than the average, even the smallest RTprimer concentration of 2.5 × 10 − 11 M would be in excess to the concentration of the template in a RT reaction. Accordingly, the primer concentration for most if not all of the amplified fragments is expected to be in very large excess as compared to the template concentration and under these conditions the concentration of the reverse primer will not change significantly in the course of the RT reaction. Fig. 1 Schematic representation of the GeXP multiplex RT-PCR assay for quantification of relative abundances of transcripts. a The reverse transcription of a RNA is primed with a gene-specific reverse primer (GSrev) that carries a universal priming site at its 5′-end to obtain a first strand cDNA fused with the universal priming site. The synthesis of the second strand cDNA is primed with a gene-specific forward primer (GSfwd) that also carries a yet different sequence as universal priming site. Because of the two universal priming sites incorporated, the cDNA can be amplified with universal forward and reverse PCR primers (Ufwd, Urev, respectively) independently of the gene-specific sequence. As the universal forward primer is labelled with a fluorescence tag at its 5′-end, the PCR product can be detected after capillary electrophoretic separation. b Gene-specific primers are designed to yield PCR products of different lengths that can be separated by capillary electrophoresis from each other and quantified through the area of individual peaks in the elution profile (c). c The gene expression pattern of different samples can be compared through the cDNA amplified from a spike-in RNA that serves as an internal standard. A DNA ladder (size marker) added to the cDNA sample prior to capillary electrophoresis allows to determine the size of the PCR products It has been shown that binding of a primer to its template during the reassociation (renaturation) of DNA follows the rate law of a second order reaction [9,10]. This suggests that the rate limiting step for the reassociation is the nucleation reaction during which few bases of one strand bind to their fitting counterparts on the other strand. The subsequent zipping reaction must occur fast as compared to the nucleation event of base pairing because otherwise a first order instead of second order kinetics would be observed. This conclusion is plausible as the local concentration of fitting bases greatly increases if two strands are brought in spatial proximity and appropriate orientation as a consequence of the zipping reaction.
The concentration of the RT-primer for the transcript of each gene of interest has to be adjusted to attenuate the relative amount of DNA amplified from its RNA during the GeXP RT-PCR reaction while the concentration of the RTprimer (denoted as Primer in the following) is in large excess as compared to the reverse transcribed mRNA (see above). We accordingly assume that formation of the RNAprimer complex (RNA · Primer) is the rate-limiting step of cDNA synthesis by reverse transcription (RT). Once priming of the RT reaction has occurred, the polymerization reaction of cDNA synthesis is equally fast for all 1st strand cDNA species (cDNA 1st ) concurrently synthesized in the reaction mix.
Note that only a tiny fraction of the low molecular weight reactants is consumed during the RT reaction such that their concentrations do hardly change during the experiment. Based on the assumption of a ratelimiting bimolecular reaction of RT-primer binding to Fig. 2 Obtaining comparable amounts of PCR products from transcripts of highly different abundances. a When samples of total RNA containing transcripts of genes A, B, and C that yield cDNAs of highly uneven concentrations, the amounts of some transcript-specific PCR products may be so low that differences between samples (e.g. induced vs. control) cannot be precisely quantified by capillary electrophoresis (as shown for transcript C). In order to allow reliable detection of the transcripts within the dynamic range of their differential regulation, the two samples (#1 and #2) are mixed (Step 1) and the concentrations of the gene-specific reverse primers (see Fig. 1) adjusted so that approximately equal amounts of cDNA fragments are obtained by RT-PCR from each of the transcripts (Step 2). Finally, the two samples, #1 and #2 are re-analysed with the adjusted primer concentrations (Step 3). Now, the x-fold change in the abundance of the transcripts of all genes can be determined due to the sufficient amount of their corresponding cDNAs (b) the RNA template, we expect an empirical rate law for the time-dependent decay d[RNA i ]/dt of the concentration [RNA i ] of the transcript of each gene i according to where k i,j is the primer-specific rate constant for hybridization of the primer to its complementary RNA sequence, [Primer i,j ] is the concentration of primer j of given sequence and length, complementary to the corresponding sequence stretch of the mRNA of gene i, and n is the empirical order of the reaction. As the concentration of the primers in the RT reaction is in a sufficiently high excess over that of each gene-specific RNA (see estimation above), the primer concentrations can be approximated as remaining constant during the reaction. By defining the pseudo first order rate constant k where the value of k 0 i; j depends on the primer concentration that had been appropriately adjusted by the experimenter. Separation of the variables and integration over the reaction time t and the change in RNA i concentration yields In the simplest case of a first order reaction where n = 1 , [RNA i ] decays exponentially during the reverse transcription reaction of length t starting from its initial concentration [RNA i ] 0 according to Alternatively, if the reaction order is different from one (n ≠ 1), the integration yields and hence [RNA i ] decays according to In both cases, for n = 1 and for n ≠ 1, the concentration [cDNA i 1st ] of gene-specific first strand cDNA, synthesized during the entire RT reaction is The concentration of the gene i-specific cDNA fragment, ½cDNA amp i which has been amplified by PCR from the first strand cDNA (cDNA 1st i ) with the help of the universal primer pair (see Introduction) is expected to be and according to eq. (8) dependent on the initial concentration of the gene i-specific transcript Overall amplification α of each 1st strand cDNA molecule after c cycles of PCR is expected to be α ≤ 2 c−1 because the first PCR cycle is required to synthesize the second strand on the cDNA, leaving c − 1 cycles for DNA amplification while a new second strand product can be synthesized from the first strand cDNAs during each cycle of PCR. Each newly synthesized second strand cDNA molecule then starts a new exponential cascade of amplification during the remaining cycles of PCR. With eq. (4) and (7), the concentration ½cDNA amp i of the gene i-specific cDNA fragment for n = 1 is and for n ≠ 1 Calculating the concentration of cDNA ½cDNA amp i amplified from the gene i-specific mRNA as a function of the relative initial concentration [RNA i ] 0 in the RT reaction according to eq. (11) and (12) yields apparently linear relationships in a double logarithmic plot for empirical reaction orders of n ≤ 1 (Fig. S1). The concentration of any gene-specific mRNA present in the RTreaction is proportional to the concentration of total RNA [RNA tot ] 0 which is subjected to reverse transcription. Accordingly, a double logarithmic plot of the concentration of specific cDNAs amplified from the transcripts of different genes will yield a series of parallel shifted lines of equal slope and correspondingly different intercepts with the ordinate. The kinetic model with eq. (11) and (12) further predicts how the concentration of cDNA fragments depends on the initial concentration of the gene-specific reverse primer used for the RT reaction (Fig. S2).

Experimental validation of the calibration procedure
The kinetic model of the GeXP RT-PCR predicts that for a set of transcripts one obtains a series of parallel shifted lines when the concentration of amplified gene-specific cDNA fragment is double logarithmically plotted versus the total RNA concentration used in the assay. For experimental validation of this prediction, we used Physarum polycephalum total RNA isolated from plasmodial cells as a source of typical eukaryotic mRNAs [8].
Starving plasmodial cells can be induced to sporulation by a brief pulse of far-red light which is sensed by a phytochrome photoreceptor [8,11,12]. The cellular commitment to sporulation is associated to the differential expression of developmentally regulated genes [13]. Based on transcriptome data [13,14] we have previously designed gene specific primers for the quantification of the transcripts of a set of 35 genes with a single GeXP RT-PCR reaction. The concentrations of the genespecific reverse primers were attenuated (adjusted) so that amplification of each of the 35 cDNAs yields a sufficient amount of each fragment to be resolved side-byside during electrophoretic separation ( Fig. 2; Table S1) [6]. The set comprised genes that are up-or downregulated at different strengths as well as genes with constant or almost constant expression level that can serve as a reference.

Fragment quantification by capillary electrophoresis
For the inital validation of the performance of the electrophoretic quantification of cDNA fragments, we used two samples of RNA, one prepared from uninduced plasmodial cells (controls) and one from far-red light induced plasmodial cells harvested at 6 h after the far-red pulse. The induced cells were committed to sporulation but did not yet show any morphogenetic alterations. In the cells of the two groups, a number of genes are upor down-regulated [6]. A known amount of spike-in RNA was added to the two samples of total RNA isolated from the cells of the two groups and the cDNAs of the 35 transcripts were amplified by GeXP RT-PCR. Subsequently, a DNA size standard was added to each of the two samples and the fluorescence tag-labelled fragments were separated electrophoretically on the CEQ 8800™ 8-capillary sequencer (see Fig. 1c). The fragments were quantified through their peak area and the amount of each gene-specific fragment was normalized relative to the amount of fragment amplified from the spike-in RNA with the help of the Perl scripts (Supplementary Information) to give the normalized peak area (NPA).
In order to test the performance of the electrophoretic separation process and to validate the fidelity of the peak quantification software, we prepared a series of mixtures, each containing three pairs of fluorescently labelled DNA fragments of different relative concentration, mimicking different concentrations of cDNAs as obtained by RT-PCR. Each mixture was separated on the CEQ 8800™ capillary sequencer and the normalized peak area (NPA) determined for three chosen pairs of fragments that were amplified from an up-regulated and one downregulated transcript, respectively. For the three pairs of DNA fragments, the ratio of the normalized peak areas for the corresponding two fragments was plotted against the relative concentration of the two samples in the mixture. In the three pairs of fragments the normalized peak area for each fragment was proportional to its relative amount in the mixture (Fig. S3), confirming that the peak detection algorithm of the software delivered with CEQ 8800™ capillary sequencer worked correctly.

Calibration curves for the 35 transcripts
In order to obtain calibration curves for each of the 35 transcripts, we mixed two RNA samples (from lightinduced and control cells) in a 1:1 ratio in order to obtain a sample that contained even the differentially regulated transcripts at intermediate concentration (Fig. 2). This RNA mixture was serially diluted and, after addition of spike-in RNA to each sample of the dilution series, the cDNA fragments as amplified by the GeXP multiplex RT-PCR were separated electrophoretically. The normalized peak area determined for each cDNA fragment was plotted versus the concentration of total RNA. The data points obtained for all fragments were fitted to straight lines in the double logarithmic plot. The slopes as obtained by the 35 fits were averaged and the best-fitting intercept for each of the 35 curves was calculated. Accordingly, this procedure gave a series of fitted curves that only differed in being shifted with respect to the abscissa (Fig. 3). As predicted by the kinetic model, for none of the genes any systematic deviation of the data points from the fitted straight line was seen (Fig. 3). Hence, all fragments were amplified with the same efficiency by the universal primers. The parallel shift relative to the abscissa, according to the kinetic model, is due to the different absolute concentrations of the mRNAs and to the different amounts of cDNA produced during the RT reaction at the particular concentration of the RT-primer. Noteworthy, the slopes of the curves clearly were smaller than 1 as observed in most experiments performed in our lab. A slope of 1 would be expected if the cDNA synthesis occurred with first order kinetics as assumed in eq. (11). The kinetic model therefore suggests that the RT reaction occurs with a reaction order of n ≤ 1 (eq. (12)). The value of n observed in each particular experiment might slightly vary depending on the experimental conditions.

Efficiency of concurrent amplification of different transcripts in one sample
Since cDNA fragments must be of different lengths in order to get separated during capillary electrophoresis and the efficiency of amplification might depend on the fragment length, we wished to confirm that the amount of cDNA amplified for any given transcript is independent of other fragments that are concurrently amplified in the same sample. Therefore, we mixed the RNA sample from the control cells with the sample from the lightinduced cells taking different relative amounts and estimated for each gene how the normalized peak area changed with the relative amount of the RNA from light-induced cells or from the dark controls (Fig. 4). For each fragment, the normalized peak area and hence the amount of amplified cDNA was a linear function of the mixing ratio with a slope depending on whether genes were up-regulated or down-regulated. Curves for genes that are not regulated in response to the photo-stimulus fitted to a straight line that was parallel to the abscissa (Fig. 4). Hence, the amount of amplified DNA for each gene was found to be proportional to the concentration of the corresponding mRNA in the sample. Likewise, amplification of the cDNAs of those genes that are not or rarely differentially regulated (dspA, spiA, tspA, etc.) confirms that the amplification efficiency indeed does not depend on the concentration of other (differentially regulated) mRNAs that are concurrently amplified in the sample.
An experimental result consistent with the computational predictions of the kinetic model of the GeXP RT-PCR (eq. (12)) was obtained by changing the concentration of one primer while the concentration of all other primers was kept constant. Until saturation, the normalized peak area increased with the RT-primer concentration, indicating that the reverse primer was rate-limiting even though its concentration was high relative to the concentrations of the other RT-primers in the mix (Fig. S2).
Taken together, these experimental results confirm that all fragments are amplified with the same efficiency during PCR with the universal primer set regardless of their sequence or the length of the amplification products. Amplification of individual fragments is not influenced by other fragments that are concurrently amplified in the sample. This confirms (at least for the primer set chosen and for the experimental conditions applied) that the relative changes in the mRNA concentration of any fragment robustly translate into relative changes in the amount of detected amplified cDNA during a GeXP multiplex RT-PCR experiment. Certainly, equal amplification efficiency for all cDNAs quantified within one essay should be verified for any new set of primers and any new set of transcripts to be analysed. The Perl scripts provided as supplementary material (see below) facilitate this task.

Quantification of differentiation marker genes in plasmodial cells before and after induction of sporulation
By quantifying the light-induced expression of differentiation marker genes in Physarum polycephalum plasmodial cells, we present a workflow including the application of Perl scripts for calibration and data evaluation.

Stimulation of cells, sample preparation, and biological replicates
Plasmodial cells were starved for six to seven days to gain sporulation competence. Sporulation-competent plasmodial cells, each spreading the surface of a Petri dish were stimulated with a short pulse of far-red light for the induction of sporulation. Before and at different time points after application of the stimulus, samples of each plasmodial cell were harvested and frozen in liquid nitrogen for the subsequent isolation of RNA. The remainder of the plasmodium was further incubated overnight to verify whether the developmental decision to sporulation had occurred [15]. This approach gave a collection of time series where each time series consisted of samples taken from the same individual plasmodial cell. In a control experiment, light stimulation was omitted while samples of individual plasmodial cells were taken correspondingly. From each frozen sample, total RNA was isolated and the expression pattern of the set of 35 genes (Table S1) were quantified as described above.

Data processing workflow and Perl scripts
The electrophoretic separation of the GeXP multiplex RT-PCR samples was performed in the CEQ 8800™ Fig. 3 Calibration curves for the transcripts of 35 genes as amplified by GeXP RT-PCR in the same tube. A sample of a total RNA pool from lightinduced and control plasmodial cells (see Fig. 2) was serially diluted and the 35 cDNA fragments quantified after capillary electrophoretic separation. Each RNA concentration was analysed twice and each of the obtained data points, normalized peak area (NPA) versus concentration of total RNA, was displayed in a double logarithmic plot (logarithm to the base 10). All data points were fitted to a straight line of the same slope as described in Materials and Methods. The R 2 value displayed in each panel gives the correlation coefficient of the first linear regression fit (see Methods for details). The red line in each panel represents the straight line of equal slope finally fitted to the data for each of the 35 transcripts (second linear regression fit; see Methods for details). For names of genes see Table S1. The figure was prepared with R, version 3.5.2 Genome Analysis System run via the so-called Control Center, a software package for operating the system and for storing the data. All further data processing steps were performed by Perl scripts as described in the Supplementary Information (see also Figs. S4 and S5). As a standard quality control for the electrophoretic separation, the electropherogram and the detected fragments were graphically displayed and visually inspected calling the respective scripts (Fig. S4). All further data processing steps were executed using the .ceqfrag files that list peak height, peak area and length (in nucleotides) of each fragment. The Perl scripts gep_makestd and gep_calcamount perform a batch evaluation of these files to generate the standard curve and to calculate the relative transcript abundances from the measured data, respectively. This is achieved by taking the fitted mean slope and the corresponding intercept of the standard curve for each transcript into account. The final output is a table in the form of a file with tab-separated values that contains for each quantified fragment the fragment ID number, the length (nt's), peak area, normalized peak area (NPA), the relative amount of cDNA, and the detection limit of each fragment during the respective Fig. 5 Comparative display of single cell gene expression time series in response to a sporulation-inducing far-red light stimulus. Individual plasmodial cells (P1 to P26) starved for 6 or 7 days to gain sporulation competence were exposed to a far-red light pulse for 5 to 35 min. Before and at four time points after the light pulse, samples were taken from each plasmodium and the gene expression pattern analysed to give a time series for each individual cell. The relative changes of expression values were normalized to the maximal value within the time series and displayed side by side for each plasmodium and gene. The gene expression pattern in each of the RNA samples was analysed twice with independent RT-PCR reactions and based on separate calibration curves. The first measurement and the technical replicate are displayed in the left and in the right panels, respectively. The figure was prepared with Microsoft Excel for Mac, version 14.7.7 separation run. Two values, relative amount of cDNA and detection limit, are calculated based on the calibration curves. We routinely made a new standard curve for each kit in order to account for possible variations in the performance of the kit components. Details on data processing and on the determination of the transcriptspecific detection limits are given in the legend to Figs. S4 and S5. Samples of all file types used are available together with the Perl scripts (for details see Supplementary Information). With the help of the two Perl scripts, the data processing for the generation of the calibration curves and for calculating the amounts of cDNA for each fragment and sample becomes fully automatic due to the batch processing of the files. We finally store all values for calibration curves as well as for all other measurements in an SQL database for documentation and further data processing and evaluation.

Light-induced gene expression and technical replicates
For visual comparison of the light-induced differential regulation of the differentiation marker genes in individual plasmodial cells, the expression values of the 35 genes were comparatively displayed for each sample ( Fig. 5; for the complete data set see Fig. S6). Each data point of a time series originates from material taken from the same plasmodial cell, allowing to compare time series between individual cells. Strongly up-or downregulated genes are clearly evident while the expression of ribB was almost constant. Accordingly, ribB might be used as a reference gene to normalize the expression values in each sample in order to correct for pipetting errors or inaccuracies in the quantification of the total RNA samples analysed. Nevertheless, we displayed all expression values as they were obtained, i.e. without normalization. However, the five expression values for each gene in a time series taken from a given cell were then normalized to the maximal value to reveal the percental changes as a function of time in order to highlight the relative changes that occurred as a function of time.
In order to estimate how reproducible the measurements are, the RNA samples were measured again in an independent GeXP RT-PCR experiment which was evaluated based on a fresh calibration curve. The results of the technical replicate were displayed side by side with the first measurement ( Fig. 5; Fig. S6) and revealed that the time series gave similar results. For a quantitative evaluation, the reproducibility of the measurements was analysed with the help of a frequency distribution. For each single expression value obtained in the first and in the second measurement (the technical replicate) for any given sample and gene, the ratio between the two values was calculated and the absolute frequency of the log2 (ratio) displayed in the form of a histogram. For the experiment performed with light-induced plasmodia ( Fig. 6a) the variation was considerably higher than for a second experiment performed with plasmodia that had not been exposed to any light (dark controls; Fig. 6b). For the light-induced plasmodial cells, 77% of the values varied by a factor of two or less and 9% of the values by a factor of four or more. One reason for the high variation observed in the light-induced plasmodia might be that genes are strongly up-or down-regulated. Two measurements of a transcript, the concentration of which is close to the detection limit, may be inaccurate and therefore give a relatively high value for the ratio between first and second measurement. In contrast, the values measured in the dark controls were highly reproducible (Fig. 6b).

Discussion
We have designed and experimentally validated a calibration procedure for Gene eXpression Profiling (GeXP), a quantitative multiplex RT-PCR method for the simultaneous quantification of tens of transcripts. Our calibration procedure is rationally based on a kinetic model of primer binding, reverse transcription, and amplification. We have also tested an essential underlying assumption of GeXP that amplification of cDNAs synthesized from an individual transcript occurs independently of the relative abundances of the transcripts that are co-amplified in the sample. Finally, we provide an easy-to-use pipeline of Perl scripts for calibration, data evaluation, and quality control of GeXP. Our method has been successfully applied in several recent studies [6,[15][16][17][18][19][20].

Growth and preparation of sporulation-competent plasmodia
Plasmodia of the apogamic strain WT31 [21] which is wild-type with respect to the photocontrol of sporulation were grown for four days at 24°C in 1.5 L of growth medium [22] in a 5 L fermenter (Minifors, Infors HT, Bottmingen, Switzerland) inoculated with 2% of a 3.5 days old shaken culture, supplied with 0.1 L of air per minute, and stirred at 250 rpm with a marine propeller. Microplasmodia were harvested and applied to starvation agar plates (9 cm diameter) as described [21]. Plates were incubated at 22°C in complete darkness for 6 days if not stated otherwise. During this time period, one multinucleate macroplasmodium develops on each plate and, by starvation, becomes competent for sporulation. The incubation temperature is critical in order to avoid unwanted spontaneous sporulation. For the induction of sporulation, competent plasmodia were exposed to a farred light pulse (λ ≥700 nm, 13 W/m 2 ). The far-red light was generated by Concentra Weißlicht lamps (Osram, Munich, Germany) which was passed through an Orange 478 combined to a Blue 627 plexiglass filter (Röhm, Darmstadt, Germany). After light exposure, plasmodia were returned to the dark and incubated at 22°C until the samples were taken. At five subsequent time points, a portion of each plasmodium was harvested (approximately one sixth of its original biomass for each sample), shock-frozen in liquid nitrogen and stored at − 80°C until RNA isolation. After taking the samples, the remainder of each plasmodium was Fig. 6 Frequency distributions of X-fold discrepancies between the gene expression values obtained by two repeated measurements. For each RNA sample and for each of the 35 genes analysed the Log2 of the ratio of the two independently measured expression values was calculated and the number of occurrences for each interval displayed in the histograms. The distributions were shifted on the abscissa so that the most abundant class was centered at a Log2 value of zero. Expression values that were below the detection limit were not considered. Panel (a) shows the distribution of 4447 values as displayed in Fig. 5 and panel (b) shows the result of the technical replicate of a set of time series (1643 values) measured from plasmodia that had not been exposed to any light stimulus (dark controls). The figure was prepared with Kaleidagraph Version 4.1.3 (Synergy Software) maintained overnight in the dark to reveal the developmental decision (i.e. whether or not sporulation had occurred). Control plasmodia were treated identically except that the light pulse was omitted (dark controls). All manipulations were done under sterile conditions and under dim green safe light as previously described [21].

Preparation of total RNA
For extraction of total RNA, approximately 15 mg of fresh weight material which had been taken from an individual plasmodium was used. Samples were dissolved in 1.5 ml PeqGOLD TriFastTM solution (Nr 30-2020; PeqLab, Erlangen) by disrupting the cell material in the presence of glass beads ø = 0.5 mm using a Precellys 24-Dual homogenizer, (Peqlab, Erlangen, Germany) for 10 s at room temperature. The homogenate was transferred to 2 ml PeqGOLD PhaseTrap tubes (PeqLab; Nr 30-0150A-01). After addition of 300 μl CHCl 3 , vigorous shaking and centrifugation of the PhaseTrap tubes, the aqueous phase containing the RNA was transferred to a fresh Eppendorf cap following the manufacturer's instructions for the entire procedure. After addition of 0.9 volumes of a 4:1 mixture of ethanol and 1 M acetic acid, the RNA was precipitated at − 20°C for at least 24 h. The precipitate was spun down for 15 min at 18,000 x g and 4°C, the pellet washed once with 400 μl ice-cold 3 M sodium acetate and twice with ice-cold 70% ethanol followed by centrifugation for 10 min at 18,000 x g, 4°C after each washing step. Finally, the RNA was dissolved in RNAse-free water. For removal of contaminating genomic DNA, 90 μl of each RNA sample were mixed with 10 μl rDNase buffer and 1 μl rDNase (rDNase Set Macherey-Nagel: 740963) and incubated for 12 min at 37°C. The total RNA was finally purified using the RNA Clean-up Kit (Macherey-Nagel: 740948.250). Approximately 300 ng aliquots checked for integrity of the 18 S and 28 S rRNA by agarose gel electrophoresis. For subsequent steps, samples were standardized to 20 ng/μl of total RNA.

Multiplex gene expression profiling (GeXP)
Multiplex RT-PCR reactions were performed to simultaneously amplify 35 transcripts and the amplification products were quantified through separation on a Beckman Coulter 8-capillary sequencer (CEQ 8800). The GenomeLab GeXP Start Kit (Beckman Coulter; A85017) along with the ThermoStart Taq Polymerase Kit (Beckman Coulter; A25395) was used according to the GeXP Chemistry Protocol (The Beckman Coulter Capillary Electrophoresis product line including GeXP chemistry is now distributed and supported by AB SCIEX, Landwehrstr. 54, 64293 Darmstadt, Germany (www.SCIEX.com)). cDNA synthesis and polymerase chain reaction were performed in half of the recommended volume leaving the recommended concentrations unchanged. Twenty ng of total RNA was used for each reverse transcription (see below). Controls without template were included to check the kit components for contaminating nucleic acids and controls without reverse transcriptase were used to check for contaminating DNA in the RNA samples.

Primers used
Forward and reverse gene-specific primers for a set of 35 genes were originally desinged by [6] to generate amplified DNA fragments with similar GC contents and melting temperature of different lengths ranging between 114 and 357 bp (Table S1). Each primer contained a universal priming sequence at the 5′-end and a genespecific sequence at the 3′-end. The concentration of each gene-specific reverse primer in the pre-mixed reverse primer plex was experimentally adjusted to give a fluorescence signal for each fragment that was in the linear sensitivity range of the measurement system. The attenuated concentration of each reverse primer was adjusted to a value between 0.5 to 0.00025 μM (Table S1).

cDNA synthesis
The RT reaction was performed in 96-well plates in a Biometra (T Professional) thermal cycler with 20 ng of total RNA (except for the standard curves), 1 x RT Buffer, 1 μL of reverse primer mix, 1 μL KAN r RNA and 0.5 μL reverse transcriptase (20 U/μl) in a total volume of 10 μL with the following temperature program: 1 min at 48°C, 5 min at 37°C, 60 min at 42°C, 5 min at 95°C, hold at 4°C. The plate was briefly centrifuged to recollect the sample volume at the bottom of the wells.
Polymerase chain reaction 4.7 μL of the 10 μL RT reaction were used as template for the PCR. The PCR was performed with 0.35 μL thermo-start DNA polymerase, 5 mM MgCl 2 , 1 x PCR buffer and pre-mixed forward primer plex (20 nM final concentration) in a volume of 10 μL under the conditions: 10 min at 95°C, followed by 35 cycles of 30 s at 94°C, 30 s at 55°C, and 1 min at 68°C; hold at 4°C

Fragment analysis
Amplification products were separated and quantified according to the manufacturer's instructions. Briefly, the PCR products were pre-diluted 20-fold with 10 mM Tris-HCl buffer pH 8.0. One μL of the diluted sample was separated with 39 μL Sample Loading Solution (Beckman Coulter) and 0.4 μL DNA size standard-400 (Beckman Coulter) on the GenomeLab Separation Capillary Array of the CEQ8800 (Beckman Coulter) using the Frag-3 separation program (denaturation for 120 s at 90°C; sample injection for 30 s at 2.0 kV; separation of fragments for 40 min at 6.0 kV; capillary temperature at 50°C). The raw data were analysed with the Fragment analysis module of the CEQ8800 software (Beckman Coulter) to estimate the size of the obtained fragments. The peak area of each fragment was exported for subsequent data processing.

Calibration curves
A calibration curve for each transcript i was established to calculate its relative concentration in the analysed samples from the corresponding normalized fluorescence signal values integrated over the peak areas of the respective amplified DNA fragments as separated on the capillary sequencer. To obtain the calibration curve, total RNA samples from dark-and light-treated plasmodia were mixed as described in the Results section to give an appropriate analytical standard. The mix was repeatedly diluted, two-fold in each step, to obtain a set of concentrations of total RNA ranging from 100 ng to 0.78 ng per reaction volume. Each sample was measured twice with two independent RT-PCR-reactions. The relative fluorescence signal F i (which is identical to the normalized peak area NPA) as measured for each DNA fragment amplified from the mRNA of gene i was obtained by normalizing its peak area relative to the peak area of the KAN r control RNA which served as internal standard. The logarithm of the normalized mean values was plotted against the logarithm of the relative total RNA concentration as obtained in the dilution series [7]. By taking the logarithm for the base 10, a straight line in the form of log F i = a i ⋅ log[RNA] tot + b i was computed by linear regression for the n data points of each fragment amplified from the mRNA of gene i (1st linear regression) according to The slopes a i obtained for all gene-specific fragments in the 1st linear regression were averaged and the data points for each specific fragment amplified from the mRNA of gene i were again fitted to a straight line by a 2nd linear regression that computes the intercept with the ordinate b ia which is obtained for the average slope Fitting parameters for the 1st and for the 2nd linear regression were stored for documentation and quality control purposes. Data processing was done with the Perl scripts provided in conjunction with this paper (for details see legends to Figs. S4 and S5). The data points of all 35 fragments fitted the calibration curves without any systematic deviation from the straight lines. The relative concentration [RNA] i rel for each transcript of gene i was calculated on the basis of its specific calibration curve as [RNA] i rel = 10 A with A ¼ log F i −b i a :