Kinetic model
As the concentration of the transcripts of individual genes within a given sample of total RNA may vary by orders of magnitude relative to each other, the conditions of a GeXP experiment have to be adjusted such that the amplified fragments can be reliably detected side by side when separated on the capillary sequencer. This is achieved by empirically adjusting the concentration of the gene specific reverse primers so that at intermediate physiological concentration of each transcript the amount of cDNA amplified is in the same range for all transcripts that are to be detected in one reaction tube (Fig. 2) [5,6,7]. Hence the concentration of the primer for the reverse transcription of certain transcripts may become rate limiting for the amount of cDNA synthesized. In the following we describe a kinetic model of the RT-PCR reaction
$$ \mathrm{RNA}\overset{RT}{\to }{\mathrm{cDNA}}^{1 st}\overset{PCR}{\to }{\mathrm{cDNA}}^{amp} $$
as rational basis for evaluating the double logarithmically plotted calibration curves for the transcripts to be analysed. The mRNAs (RNA) are reverse transcribed to give a first-strand cDNA (cDNA1st) which is then amplified with a universal primer set to obtain a PCR product (cDNAamp) for each transcript which is electrophoretically separated on a capillary sequencer, as shown in Fig. 1).
Assuming that at maximum 2% of the total RNA isolated from a eukaryotic cell is mRNA, a 10 μl RT reaction volume with the amount of total RNA as used in the experiments contains not more than 0.4 ng of mRNA. With approximately 17,000 genes expressed in a starving P. polycephalum plasmodium and an average length of a mature mRNA of approximately 1000 nucleotides [8], this is equivalent to an average concentration of 6.9 × 10− 15 M for the mRNA of one gene corresponding to approximately 41,500 molecules per reaction volume. The smallest RT primer concentration used in our RT reaction (Table S1) is 2.5 × 10− 11 M, as compared to 2.7 × 10− 8 M for the average and 5 × 10− 8 M for the maximal concentration, respectively. For a specific mRNA of 6.9 × 10− 12 M which is 1000 times more abundant than the average, even the smallest RT-primer concentration of 2.5 × 10− 11 M would be in excess to the concentration of the template in a RT reaction. Accordingly, the primer concentration for most if not all of the amplified fragments is expected to be in very large excess as compared to the template concentration and under these conditions the concentration of the reverse primer will not change significantly in the course of the RT reaction.
It has been shown that binding of a primer to its template during the reassociation (renaturation) of DNA follows the rate law of a second order reaction [9, 10]. This suggests that the rate limiting step for the reassociation is the nucleation reaction during which few bases of one strand bind to their fitting counterparts on the other strand. The subsequent zipping reaction must occur fast as compared to the nucleation event of base pairing because otherwise a first order instead of second order kinetics would be observed. This conclusion is plausible as the local concentration of fitting bases greatly increases if two strands are brought in spatial proximity and appropriate orientation as a consequence of the zipping reaction.
The concentration of the RT-primer for the transcript of each gene of interest has to be adjusted to attenuate the relative amount of DNA amplified from its RNA during the GeXP RT-PCR reaction while the concentration of the RT-primer (denoted as Primer in the following) is in large excess as compared to the reverse transcribed mRNA (see above). We accordingly assume that formation of the RNA-primer complex (RNA · Primer)
$$ \mathrm{RNA}+\mathrm{Primer}\to \mathrm{RNA}\cdotp \mathrm{Primer}\ \overset{\mathrm{RT}}{\to }\ {\mathrm{cDNA}}^{1 st} $$
is the rate-limiting step of cDNA synthesis by reverse transcription (RT). Once priming of the RT reaction has occurred, the polymerization reaction of cDNA synthesis is equally fast for all 1st strand cDNA species (cDNA1st) concurrently synthesized in the reaction mix.
Note that only a tiny fraction of the low molecular weight reactants is consumed during the RT reaction such that their concentrations do hardly change during the experiment. Based on the assumption of a rate-limiting bimolecular reaction of RT-primer binding to the RNA template, we expect an empirical rate law for the time-dependent decay d[RNAi]/dt of the concentration [RNAi] of the transcript of each gene i according to
$$ -\frac{d\left[{\mathrm{RNA}}_i\right]}{dt}={k}_{i,j}\cdot \left[{\mathrm{Primer}}_{i,j}\right]\cdot {\left[{\mathrm{RNA}}_i\right]}^n $$
(1)
where ki,j is the primer-specific rate constant for hybridization of the primer to its complementary RNA sequence, [Primeri,j] is the concentration of primer j of given sequence and length, complementary to the corresponding sequence stretch of the mRNA of gene i, and n is the empirical order of the reaction. As the concentration of the primers in the RT reaction is in a sufficiently high excess over that of each gene-specific RNA (see estimation above), the primer concentrations can be approximated as remaining constant during the reaction. By defining the pseudo first order rate constant \( {k}_{i,j}^{\prime }={k}_{i,j}\cdot \left[{\mathrm{Primer}}_{i,j}\right] \) we obtain the empirical first order rate law
$$ -\frac{d\left[{\mathrm{RNA}}_i\right]}{dt}={k}_{i,j}^{\prime}\cdot {\left[{\mathrm{RNA}}_i\right]}^n $$
(2)
where the value of \( {k}_{i,j}^{\prime } \) depends on the primer concentration that had been appropriately adjusted by the experimenter. Separation of the variables and integration over the reaction time t and the change in RNAi concentration yields
$$ \underset{{\left[{\mathrm{RNA}}_i\right]}_0}{\overset{\left[{\mathrm{RNA}}_i\right]}{\int }}\frac{1}{{\left[{\mathrm{RNA}}_i\right]}^n}\ d\left[{\mathrm{RNA}}_i\right]=-{k}_{i,j}^{\prime}\underset{t=0}{\overset{t}{\int }} dt. $$
(3)
In the simplest case of a first order reaction where n = 1, [RNAi] decays exponentially during the reverse transcription reaction of length t starting from its initial concentration [RNAi]0 according to
$$ \left[{\mathrm{RNA}}_i\right]={\left[{\mathrm{RNA}}_i\right]}_0\cdot \exp \left(-{k}_{i,j}^{\prime}\cdot t\right). $$
(4)
Alternatively, if the reaction order is different from one (n ≠ 1), the integration yields
$$ \frac{{\left[{\mathrm{RNA}}_i\right]}^{1-n}-{\left[{\mathrm{RNA}}_i\right]}_0^{1-n}}{1-n}=-{k}_{i,j}^{\prime}\cdot t $$
(5)
or
$$ {\left[{\mathrm{RNA}}_i\right]}^{1-n}={k}_{i,j}^{\prime}\cdot t\left(n-1\right)+{\left[{\mathrm{RNA}}_i\right]}_0^{1-n} $$
(6)
and hence [RNAi] decays according to
$$ \left[{\mathrm{RNA}}_i\right]=\sqrt[1-n]{k_{i,j}^{\prime}\cdot t\left(n-1\right)+{\left[{\mathrm{RNA}}_i\right]}_0^{1-n}} $$
(7)
In both cases, for n = 1 and for n ≠ 1, the concentration [cDNAi1st] of gene-specific first strand cDNA, synthesized during the entire RT reaction is
$$ \left[{\mathrm{cDNA}}_i^{1\mathrm{st}}\right]={\left[{\mathrm{RNA}}_i\right]}_0-\left[{\mathrm{RNA}}_i\right] $$
(8)
The concentration of the gene i-specific cDNA fragment, \( \left[{\mathrm{cDNA}}_i^{amp}\right] \) which has been amplified by PCR from the first strand cDNA (\( {\mathrm{cDNA}}_i^{1\mathrm{st}} \))
$$ {\mathrm{cDNA}}_i^{1\mathrm{st}}\overset{\mathrm{PCR}}{\to }{\mathrm{cDNA}}_i^{amp} $$
with the help of the universal primer pair (see Introduction) is expected to be
$$ \left[{\mathrm{cDNA}}_i^{amp}\right]=\alpha \cdot \left[{\mathrm{cDNA}}_i^{1\mathrm{st}}\right] $$
(9)
and according to eq. (8) dependent on the initial concentration of the gene i-specific transcript
$$ \left[{\mathrm{cDNA}}_i^{amp}\right]=\alpha \cdot \left({\left[{\mathrm{RNA}}_i\right]}_0-\left[{\mathrm{RNA}}_i\right]\right) $$
(10)
Overall amplification α of each 1st strand cDNA molecule after c cycles of PCR is expected to be \( \alpha \le {2}^{c-1} \) because the first PCR cycle is required to synthesize the second strand on the cDNA, leaving c − 1 cycles for DNA amplification while a new second strand product can be synthesized from the first strand cDNAs during each cycle of PCR. Each newly synthesized second strand cDNA molecule then starts a new exponential cascade of amplification during the remaining cycles of PCR. With eq. (4) and (7), the concentration \( \left[{\mathrm{cDNA}}_i^{amp}\right] \) of the gene i-specific cDNA fragment for n = 1 is
$$ \left[{\mathrm{cDNA}}_i^{amp}\right]=\alpha \cdot {\left[{\mathrm{RNA}}_i\right]}_0\cdot \left(1-\exp \left(-{k}_{i,j}^{\prime}\cdot t\right)\right) $$
(11)
and for n ≠ 1
$$ \left[{\mathrm{cDNA}}_i^{amp}\right]=\alpha \cdot \left({\left[{\mathrm{RNA}}_i\right]}_0-\sqrt[1-n]{k_{i,j}^{\prime}\cdot t\left(n-1\right)+{\left[{\mathrm{RNA}}_i\right]}_0^{1-n}}\right) $$
(12)
Calculating the concentration of cDNA \( \left[{\mathrm{cDNA}}_i^{amp}\right] \) amplified from the gene i-specific mRNA as a function of the relative initial concentration [RNAi]0 in the RT reaction according to eq. (11) and (12) yields apparently linear relationships in a double logarithmic plot for empirical reaction orders of n ≤ 1 (Fig. S1). The concentration of any gene-specific mRNA present in the RT-reaction is proportional to the concentration of total RNA [RNAtot]0 which is subjected to reverse transcription. Accordingly, a double logarithmic plot of the concentration of specific cDNAs amplified from the transcripts of different genes will yield a series of parallel shifted lines of equal slope and correspondingly different intercepts with the ordinate. The kinetic model with eq. (11) and (12) further predicts how the concentration of cDNA fragments depends on the initial concentration of the gene-specific reverse primer used for the RT reaction (Fig. S2).
Experimental validation of the calibration procedure
The kinetic model of the GeXP RT-PCR predicts that for a set of transcripts one obtains a series of parallel shifted lines when the concentration of amplified gene-specific cDNA fragment is double logarithmically plotted versus the total RNA concentration used in the assay. For experimental validation of this prediction, we used Physarum polycephalum total RNA isolated from plasmodial cells as a source of typical eukaryotic mRNAs [8].
Starving plasmodial cells can be induced to sporulation by a brief pulse of far-red light which is sensed by a phytochrome photoreceptor [8, 11, 12]. The cellular commitment to sporulation is associated to the differential expression of developmentally regulated genes [13]. Based on transcriptome data [13, 14] we have previously designed gene specific primers for the quantification of the transcripts of a set of 35 genes with a single GeXP RT-PCR reaction. The concentrations of the gene-specific reverse primers were attenuated (adjusted) so that amplification of each of the 35 cDNAs yields a sufficient amount of each fragment to be resolved side-by-side during electrophoretic separation (Fig. 2; Table S1) [6]. The set comprised genes that are up- or down-regulated at different strengths as well as genes with constant or almost constant expression level that can serve as a reference.
Fragment quantification by capillary electrophoresis
For the inital validation of the performance of the electrophoretic quantification of cDNA fragments, we used two samples of RNA, one prepared from uninduced plasmodial cells (controls) and one from far-red light induced plasmodial cells harvested at 6 h after the far-red pulse. The induced cells were committed to sporulation but did not yet show any morphogenetic alterations. In the cells of the two groups, a number of genes are up- or down-regulated [6]. A known amount of spike-in RNA was added to the two samples of total RNA isolated from the cells of the two groups and the cDNAs of the 35 transcripts were amplified by GeXP RT-PCR. Subsequently, a DNA size standard was added to each of the two samples and the fluorescence tag-labelled fragments were separated electrophoretically on the CEQ 8800™ 8-capillary sequencer (see Fig. 1c). The fragments were quantified through their peak area and the amount of each gene-specific fragment was normalized relative to the amount of fragment amplified from the spike-in RNA with the help of the Perl scripts (Supplementary Information) to give the normalized peak area (NPA).
In order to test the performance of the electrophoretic separation process and to validate the fidelity of the peak quantification software, we prepared a series of mixtures, each containing three pairs of fluorescently labelled DNA fragments of different relative concentration, mimicking different concentrations of cDNAs as obtained by RT-PCR. Each mixture was separated on the CEQ 8800™ capillary sequencer and the normalized peak area (NPA) determined for three chosen pairs of fragments that were amplified from an up-regulated and one down-regulated transcript, respectively. For the three pairs of DNA fragments, the ratio of the normalized peak areas for the corresponding two fragments was plotted against the relative concentration of the two samples in the mixture. In the three pairs of fragments the normalized peak area for each fragment was proportional to its relative amount in the mixture (Fig. S3), confirming that the peak detection algorithm of the software delivered with CEQ 8800™ capillary sequencer worked correctly.
Calibration curves for the 35 transcripts
In order to obtain calibration curves for each of the 35 transcripts, we mixed two RNA samples (from light-induced and control cells) in a 1:1 ratio in order to obtain a sample that contained even the differentially regulated transcripts at intermediate concentration (Fig. 2). This RNA mixture was serially diluted and, after addition of spike-in RNA to each sample of the dilution series, the cDNA fragments as amplified by the GeXP multiplex RT-PCR were separated electrophoretically. The normalized peak area determined for each cDNA fragment was plotted versus the concentration of total RNA. The data points obtained for all fragments were fitted to straight lines in the double logarithmic plot. The slopes as obtained by the 35 fits were averaged and the best-fitting intercept for each of the 35 curves was calculated. Accordingly, this procedure gave a series of fitted curves that only differed in being shifted with respect to the abscissa (Fig. 3). As predicted by the kinetic model, for none of the genes any systematic deviation of the data points from the fitted straight line was seen (Fig. 3). Hence, all fragments were amplified with the same efficiency by the universal primers. The parallel shift relative to the abscissa, according to the kinetic model, is due to the different absolute concentrations of the mRNAs and to the different amounts of cDNA produced during the RT reaction at the particular concentration of the RT-primer. Noteworthy, the slopes of the curves clearly were smaller than 1 as observed in most experiments performed in our lab. A slope of 1 would be expected if the cDNA synthesis occurred with first order kinetics as assumed in eq. (11). The kinetic model therefore suggests that the RT reaction occurs with a reaction order of n ≤ 1 (eq. (12)). The value of n observed in each particular experiment might slightly vary depending on the experimental conditions.
Efficiency of concurrent amplification of different transcripts in one sample
Since cDNA fragments must be of different lengths in order to get separated during capillary electrophoresis and the efficiency of amplification might depend on the fragment length, we wished to confirm that the amount of cDNA amplified for any given transcript is independent of other fragments that are concurrently amplified in the same sample. Therefore, we mixed the RNA sample from the control cells with the sample from the light-induced cells taking different relative amounts and estimated for each gene how the normalized peak area changed with the relative amount of the RNA from light-induced cells or from the dark controls (Fig. 4). For each fragment, the normalized peak area and hence the amount of amplified cDNA was a linear function of the mixing ratio with a slope depending on whether genes were up-regulated or down-regulated. Curves for genes that are not regulated in response to the photo-stimulus fitted to a straight line that was parallel to the abscissa (Fig. 4). Hence, the amount of amplified DNA for each gene was found to be proportional to the concentration of the corresponding mRNA in the sample. Likewise, amplification of the cDNAs of those genes that are not or rarely differentially regulated (dspA, spiA, tspA, etc.) confirms that the amplification efficiency indeed does not depend on the concentration of other (differentially regulated) mRNAs that are concurrently amplified in the sample.
An experimental result consistent with the computational predictions of the kinetic model of the GeXP RT-PCR (eq. (12)) was obtained by changing the concentration of one primer while the concentration of all other primers was kept constant. Until saturation, the normalized peak area increased with the RT-primer concentration, indicating that the reverse primer was rate-limiting even though its concentration was high relative to the concentrations of the other RT-primers in the mix (Fig. S2).
Taken together, these experimental results confirm that all fragments are amplified with the same efficiency during PCR with the universal primer set regardless of their sequence or the length of the amplification products. Amplification of individual fragments is not influenced by other fragments that are concurrently amplified in the sample. This confirms (at least for the primer set chosen and for the experimental conditions applied) that the relative changes in the mRNA concentration of any fragment robustly translate into relative changes in the amount of detected amplified cDNA during a GeXP multiplex RT-PCR experiment. Certainly, equal amplification efficiency for all cDNAs quantified within one essay should be verified for any new set of primers and any new set of transcripts to be analysed. The Perl scripts provided as supplementary material (see below) facilitate this task.
Quantification of differentiation marker genes in plasmodial cells before and after induction of sporulation
By quantifying the light-induced expression of differentiation marker genes in Physarum polycephalum plasmodial cells, we present a workflow including the application of Perl scripts for calibration and data evaluation.
Stimulation of cells, sample preparation, and biological replicates
Plasmodial cells were starved for six to seven days to gain sporulation competence. Sporulation-competent plasmodial cells, each spreading the surface of a Petri dish were stimulated with a short pulse of far-red light for the induction of sporulation. Before and at different time points after application of the stimulus, samples of each plasmodial cell were harvested and frozen in liquid nitrogen for the subsequent isolation of RNA. The remainder of the plasmodium was further incubated overnight to verify whether the developmental decision to sporulation had occurred [15]. This approach gave a collection of time series where each time series consisted of samples taken from the same individual plasmodial cell. In a control experiment, light stimulation was omitted while samples of individual plasmodial cells were taken correspondingly. From each frozen sample, total RNA was isolated and the expression pattern of the set of 35 genes (Table S1) were quantified as described above.
Data processing workflow and Perl scripts
The electrophoretic separation of the GeXP multiplex RT-PCR samples was performed in the CEQ 8800™ Genome Analysis System run via the so-called Control Center, a software package for operating the system and for storing the data. All further data processing steps were performed by Perl scripts as described in the Supplementary Information (see also Figs. S4 and S5).
As a standard quality control for the electrophoretic separation, the electropherogram and the detected fragments were graphically displayed and visually inspected calling the respective scripts (Fig. S4). All further data processing steps were executed using the .ceqfrag files that list peak height, peak area and length (in nucleotides) of each fragment. The Perl scripts gep_makestd and gep_calcamount perform a batch evaluation of these files to generate the standard curve and to calculate the relative transcript abundances from the measured data, respectively. This is achieved by taking the fitted mean slope and the corresponding intercept of the standard curve for each transcript into account. The final output is a table in the form of a file with tab-separated values that contains for each quantified fragment the fragment ID number, the length (nt’s), peak area, normalized peak area (NPA), the relative amount of cDNA, and the detection limit of each fragment during the respective separation run. Two values, relative amount of cDNA and detection limit, are calculated based on the calibration curves. We routinely made a new standard curve for each kit in order to account for possible variations in the performance of the kit components. Details on data processing and on the determination of the transcript-specific detection limits are given in the legend to Figs. S4 and S5. Samples of all file types used are available together with the Perl scripts (for details see Supplementary Information). With the help of the two Perl scripts, the data processing for the generation of the calibration curves and for calculating the amounts of cDNA for each fragment and sample becomes fully automatic due to the batch processing of the files. We finally store all values for calibration curves as well as for all other measurements in an SQL database for documentation and further data processing and evaluation.
Light-induced gene expression and technical replicates
For visual comparison of the light-induced differential regulation of the differentiation marker genes in individual plasmodial cells, the expression values of the 35 genes were comparatively displayed for each sample (Fig. 5; for the complete data set see Fig. S6). Each data point of a time series originates from material taken from the same plasmodial cell, allowing to compare time series between individual cells. Strongly up- or down-regulated genes are clearly evident while the expression of ribB was almost constant. Accordingly, ribB might be used as a reference gene to normalize the expression values in each sample in order to correct for pipetting errors or inaccuracies in the quantification of the total RNA samples analysed. Nevertheless, we displayed all expression values as they were obtained, i.e. without normalization. However, the five expression values for each gene in a time series taken from a given cell were then normalized to the maximal value to reveal the percental changes as a function of time in order to highlight the relative changes that occurred as a function of time.
In order to estimate how reproducible the measurements are, the RNA samples were measured again in an independent GeXP RT-PCR experiment which was evaluated based on a fresh calibration curve. The results of the technical replicate were displayed side by side with the first measurement (Fig. 5; Fig. S6) and revealed that the time series gave similar results. For a quantitative evaluation, the reproducibility of the measurements was analysed with the help of a frequency distribution. For each single expression value obtained in the first and in the second measurement (the technical replicate) for any given sample and gene, the ratio between the two values was calculated and the absolute frequency of the log2 (ratio) displayed in the form of a histogram. For the experiment performed with light-induced plasmodia (Fig. 6a) the variation was considerably higher than for a second experiment performed with plasmodia that had not been exposed to any light (dark controls; Fig. 6b). For the light-induced plasmodial cells, 77% of the values varied by a factor of two or less and 9% of the values by a factor of four or more. One reason for the high variation observed in the light-induced plasmodia might be that genes are strongly up- or down-regulated. Two measurements of a transcript, the concentration of which is close to the detection limit, may be inaccurate and therefore give a relatively high value for the ratio between first and second measurement. In contrast, the values measured in the dark controls were highly reproducible (Fig. 6b).