Quantitative promoter analysis in Physcomitrella patens: a set of plant vectors activating gene expression within three orders of magnitude

Background: In addition to studies of plant gene function and developmental analyses, plant biotechnological use is largely dependent upon transgenic technologies. The moss Physcomitrella patens has become an exciting model system for studying plant molecular processes due to an exceptionally high rate of nuclear gene targeting by homologous recombination compared with other plants. However, its use in transgenic approaches requires expression vectors that incorporate sufficiently strong promoters. To satisfy this requirement, a set of plant expression vectors was constructed and equipped with either heterologous or endogenous promoters.
Results: Promoter activity was quantified using the dual-luciferase reporter assay system. The eight different heterologous promoter constructs tested exhibited expression levels spanning three orders of magnitude. Of these, the complete rice actin1 gene promoter showed the highest activity in Physcomitrella, followed by a truncated version of this promoter and three different versions of the cauliflower mosaic virus 35S promoter. In contrast, the Agrobacterium tumefaciens nopaline synthase promoter induced transcription rather weakly. Constructs including promoters commonly used in mammalian expression systems also proved to be functional in Physcomitrella. In addition, the 5'-regions of two Physcomitrella glycosyltransferases (i.e. α1,3-fucosyltransferase and β1,2-xylosyltransferase) were identified and functionally characterised in comparison to the heterologous promoters. Furthermore, motifs responsible for enhancement of translation efficiency (such as the TMV omega element and a modified sequence directly upstream of the start codon) were tested in this model.
Conclusion: We developed a vector set that enables gene expression studies, both in lower and higher land plants, thus providing valuable tools applicable in both basic and applied molecular research.


Background
The moss Physcomitrella patens (Hedw.) B.S.G. is characterised by its simple development and well-defined differentiation steps, including a dominant haploid gametophyte. Physcomitrella has proven to be an appropriate model system for studying gene function in molecular and cellular development as well as for analysis of differentiation processes in plants. The suitability of this model system is attributable to it being capable of undergoing highly efficient homologous recombination events. This characteristic allows for gene function analysis by targeted knockout [1][2][3]. Furthermore, mosses are acknowledged as comparable to higher plants in terms of gene content, expression, and regulation [4], although they are estimated to have diverged from other groups of land plants approximately 450 million years ago [5]. To enable functional genomic approaches, EST databases covering more than 95% of the Physcomitrella transcriptome have been created [6,7] and two different approaches of targeted mutagenesis via transposon tagging have been performed [8,9]. In previous studies, the influence of phytohormones in Physcomitrella was examined by physiological analyses [10,11] and in a recent survey the auxin distribution in Physcomitrella was characterised with the help of transgenic plants [12].
In addition to the extensive basic research employing Physcomitrella, this system represents a promising model for applied approaches. For instance, a saturated mutant collection has been created to uncover targets for crop plant improvement [8,13]. Additionally, Physcomitrella has been used as a bioreactor for the production of complex biopharmaceuticals [14] (http://www.greenovation.de). Physcomitrella is suitable for low-cost, high-volume production of recombinant proteins. This system has the added capability of extensive posttranslational processing of proteins, including the formation of disulfide bridges and complex glycosylation [15].
For basic as well as applied research, potent and flexible expression systems are necessary. Transgenic technologies are dependent upon genetic tools among which sufficiently strong promoters for the construction of expression vectors are very important. As hardly any promoter sequences of moss genes are currently available and only a few heterologous promoters have been reported to function reliably in Physcomitrella, moss researchers must use the latter for expression of interesting transgenes [e.g. [16]]. Specifically, systems involving various gene cassettes require different promoter sequences to minimise molecular interaction by recombination processes between the different constructs. There is also preliminary evidence that two transgenes driven by the same promoter may be silenced by a mechanism that remains to be determined [17].
Although these studies address the individual strengths of the various promoters, there is a fundamental lack of a quantitative comparison between them. To identify sufficiently strong promoters that can drive expression of transgenes in Physcomitrella, Holtorf et al. [16], for example, studied promoter activity in relation to the number of independent transformants surviving the selection process.
In this study we compared the suitability of different heterologous and endogenous promoters for expressing foreign genes in Physcomitrella patens. Additionally, the effects of two translational enhancement motifs (the TMV omega element and a modified sequence directly upstream of the start codon) were analysed.

Quantitative analysis of heterologous promoter activities in transiently transfected Physcomitrella protoplasts
We quantified the ability of heterologous and endogenous promoters in Physcomitrella to drive foreign gene expression in plants. For this purpose, a set of diverse reporter constructs was created. The final basic constructs used in this study are shown in Fig. 1A.
Firefly (Photinus pyralis) luciferase was used as a highly sensitive reporter enzyme to quantify gene expression in a transient expression assay. For normalisation of transfection efficiency, Renilla reniformis luciferase was used. The dual-luciferase assay, which is commonly used in mammalian systems [e.g. [22]], was improved for use in a plant system by constructing a Renilla control plasmid in which the reporter gene is driven by the CaMV 35S promoter (Fig. 1B). To our knowledge, this dual-luciferase assay has been adapted for use in a plant system for the first time in our current study.
We chose a set of promoters that should represent a wide range of expression levels in Physcomitrella. These included promoters broadly used in plant expression systems as well as promoters commonly used in mammalian transfection experiments. The strengths of these various promoters were compared by calculating their activity in relation to the activity of the CaMV 35S(1x) promoter, which was set as 100% (Fig. 2). The CaMV 35S(1x) promoter was chosen as a reference since it has been shown to function efficiently in a wide range of plants, including Physcomitrella [reviewed in [16]].
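The normalisation described above can be sketched in a few lines: each firefly reading is divided by the co-transfected Renilla reading, and the resulting ratios are expressed relative to the CaMV 35S(1x) promoter set to 100%. This is only an illustrative sketch. The function names and the luminometer readings are hypothetical; the readings were chosen so that the output mirrors the relative activities quoted in the text.

```python
from statistics import mean


def relative_luciferase_activity(firefly, renilla):
    """Firefly signal normalised to the co-transfected Renilla control."""
    if renilla <= 0:
        raise ValueError("Renilla control signal must be positive")
    return firefly / renilla


def percent_of_reference(ratios_by_promoter, reference="CaMV 35S(1x)"):
    """Express mean ratios as a percentage of the reference promoter (100%)."""
    ref = mean(ratios_by_promoter[reference])
    return {name: 100.0 * mean(values) / ref
            for name, values in ratios_by_promoter.items()}


# Hypothetical (firefly, Renilla) readings in relative light units.
readings = {
    "CaMV 35S(1x)": [(1000.0, 500.0), (1200.0, 600.0)],
    "rice Act1": [(10000.0, 500.0), (9000.0, 450.0)],
    "nos": [(24.0, 500.0), (30.0, 625.0)],
}

ratios = {name: [relative_luciferase_activity(f, r) for f, r in pairs]
          for name, pairs in readings.items()}
activity = percent_of_reference(ratios)

for name, percent in activity.items():
    print(f"{name}: {percent:.1f}% of CaMV 35S(1x)")
```

Dividing by the Renilla signal first means that differences in transfection efficiency between samples cancel out before promoters are compared.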
The rice Act1 promoter drove the highest induction of luciferase activity, on average tenfold higher than the CaMV 35S(1x) promoter (Fig. 2). Previous studies have shown that this promoter is highly active in monocots [e.g. [23]]. Our data confirm earlier reports of strong activity of the rice Act1 promoter in Physcomitrella patens [17,24]. In contrast to our results, Zeidler et al. [21] measured only twofold induction of a GUS reporter gene driven by the rice Act1 promoter. However, these differences may be due to the fact that constructs different from ours were used in that particular study. Our analyses, in contrast, were based on constructs that all possessed the same vector backbone and were introduced into the vector via the same restriction sites. Interestingly, the actin promoter construct used in this study includes the rice actin-1 5'-UTR sequence. This sequence contains a leader intron which is required for high level expression in cereal cells [23]. The high expression level of the Act1 promoter in Physcomitrella indicates that this intron is efficiently spliced by moss cells.

Figure 1. (A) Firefly luciferase promoter plasmids; (B) Renilla luciferase promoter plasmid (35SP, RL, 35ST).
In addition to the 1230 bp complete rice Act1 promoter, activity of a truncated version was analysed. This truncated form lacks 374 bp at the 5'-end, resulting in a loss of about 60% of promoter activity compared with the complete Act1 promoter (Fig. 2). Similar results have been obtained previously by McElroy et al. [23] in transiently transformed rice protoplasts, where the complete promoter possessed twice the strength of the cut version.
The CaMV 35S(2x) promoter also supported strong reporter expression, approximately six times higher than that of the CaMV 35S(1x) promoter (Fig. 2). In transformed tobacco plants, Kay et al. [25] observed an approximately tenfold higher transcriptional activity of a CaMV 35S variant constructed by tandem duplication of 250 base pairs of the transcription-activating sequences upstream of the TATA element. The CaMV 35S(long) promoter was about four times as effective as the single 35S promoter and therefore comparable to the cut rice Act1 promoter (Fig. 2).
The nos promoter of the Agrobacterium tumefaciens nopaline synthase gene exhibited an activity of about 2.4% compared with activity of the CaMV 35S(1x) promoter. The nos promoter has been commonly used to drive reporter genes or selectable marker genes in Physcomitrella patens [e.g. [8,17]], resulting in sufficient expression levels. The relationship observed between the strength of the nos promoter and the 35S promoter in Physcomitrella is similar to observations made in petunia plants [26] where the nos promoter was 30 times less active than the 35S promoter.
The quantification of the different promoter activities shows that the ratios between these activities are comparable between Physcomitrella and higher plants. This confirms the observation that, despite the evolutionary distance between mosses and seed plants, the basic organisation and the regulatory mechanisms of Physcomitrella are similar to those of higher land plants. The promoters commonly used for transfection of human or animal cells (CMV and SV40) are stronger than the nos promoter, or at least comparable to it, and therefore offer an appropriate alternative for foreign gene expression in Physcomitrella.

TMV omega element does not mediate translation enhancement in Physcomitrella
The TMV (tobacco mosaic virus) omega element has been demonstrated to be a translational enhancer in both eukaryotes and prokaryotes, increasing translation both in vivo and in vitro [27]. However, various experiments illustrated that the effects of the omega element are highly species-specific. Enhancement was strongest in dicotyledonous plant cells and moderate in cultured mammalian cells as well as in monocotyledonous plant cells [27]. An effect of the omega element on yeast translation was demonstrated in the presence of the tobacco heat shock protein HSP101, which binds to the omega element mRNA [28]. Results describing the effects of the TMV omega element in lower land plants have not been reported thus far.
We analysed the translation-enhancing effect of the TMV omega element in Physcomitrella by introducing the element directly downstream of the CaMV 35S(long) promoter and the CMV promoter. Neither the quite strong long version of the 35S promoter nor the relatively weak CMV promoter exhibited a significantly augmented reporter signal with the introduction of this element (Fig. 3).
Previous studies have shown that activity of the CaMV 35S promoter was enhanced two- to threefold with the addition of the translational enhancer TMV omega element in stably transformed Arabidopsis thaliana plants [29] as well as in transiently transformed tobacco protoplasts [30].
Prior studies indicate that the enhancing effect of this omega element is related to reduction of secondary structures at the 5'-end of the RNA [30]. Gallie [31] pointed out that the reason underlying the omega element's translational enhancement capability is its functional similarity to a 5'-cap and a poly(A) tail. This characteristic would allow the omega element to recruit the eukaryotic initiation factor eIF4F, thus enhancing translation from mRNA. Thus, if Physcomitrella already has an effective translation mechanism, it might not need the structures provided by the omega element. Gallie et al. [32] assumed that specific functional differences between ribosomes could explain the variation of the omega element's enhancing potential in various organisms. According to this theory, functional features of Physcomitrella ribosomes, in contrast to other eukaryotic ribosomes, could be responsible for the lack of a translational enhancer effect. It is also possible that a translation-enhancing effect of the omega element in Physcomitrella would require an additional mRNA-binding protein, as was shown for yeast [28].

Modifying the start codon context slightly enhances expression in Physcomitrella
The sequences flanking the initiation codon in eukaryotic mRNAs have been described to be conserved to a certain extent. By analysing 5'-non-coding sequences of various vertebrate mRNAs, Kozak [33] described a consensus sequence, which was composed of the most frequently observed nucleotides in positions -6 to +1 (the A of the AUG codon providing position +1). For vertebrates the consensus sequence was revealed to be GCC(A/G)CCAUGG [33] and for plants AACAAUGGC [34]. Joshi et al. [35] performed an extensive study identifying the consensus context for various categories of plants such as monocots, dicots, lower and higher plants. They suggest therefore that the derived sequence CAA(A/C)AAUGGCG is the general plant consensus sequence, whereas in lower plants the consensus sequence is C The bases at the -3 (purine) and +4 (G) positions are considered to be the most critical ones for initiation codon selection in plants and mammals [34,36,37]. Mutations at these positions modulated initiation of translation over a 20-fold range [36].
In the context of our promoter studies we changed the ATG-flanking sequence by introducing five additional nucleotides in front of the initiation codon. Consequently, the original sequence AGCCATGG was altered to AACCATGG. The change from A to C at position -1 has been shown to be of little consequence for transient transgene expression in tobacco plants [38].
Physcomitrella protoplasts transfected with the single 35S promoter and the 35S(long) promoter, including this modified sequence around the ATG, showed increased expression levels by about 52% and about 33%, respectively (Fig. 4). The relatively weak effect of the modified sequence in our analyses may be explained by the original sequence of the plasmid, which had a G in the most critical positions -3 and +4. Thus, these constructs do not differ very much from the consensus sequence described by Kozak [33].
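To make the position numbering concrete, the following sketch reads out the bases at the critical positions -3 and +4 for the two start codon contexts quoted above (AGCCATGG and AACCATGG). The helper function is hypothetical and only illustrates the convention that the A of the ATG is position +1 and that there is no position 0.

```python
def context_base(sequence, position, start_codon="ATG"):
    """Return the base at `position` relative to the A of the start codon (+1).

    Position -1 is the base immediately upstream of the A; position 0 does
    not exist in this numbering. The first occurrence of the codon is used.
    """
    if position == 0:
        raise ValueError("position 0 is not defined in this numbering")
    a_index = sequence.index(start_codon)
    offset = position - 1 if position > 0 else position
    index = a_index + offset
    if not 0 <= index < len(sequence):
        raise IndexError("position lies outside the given sequence")
    return sequence[index]


original = "AGCCATGG"   # start codon context of pluc, as quoted in the text
modified = "AACCATGG"   # context after the modification, as quoted in the text

for label, seq in (("original", original), ("modified", modified)):
    print(label, "-3:", context_base(seq, -3), "+4:", context_base(seq, 4))
```

Both contexts carry a purine at -3 (G in the original, A in the modified sequence) and a G at +4, which fits the explanation that the original context was already close to the consensus.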
Based upon wheat germ in-vitro translation experiments, Luetcke et al. [34] suggested a less important role for the AUG context in plants than in other organisms. Structural and functional findings of that study indicate that the factors that select AUG initiation codons in plants and animals differ significantly.
Other factors such as 5'-untranslated leader structures, upstream AUG codons, and sensitivity of plant ribosomes to secondary structure [35,39], play an important role in translation efficiency as well. The fact that these various promoter studies are based upon diverse plasmids resulting in different structural conditions should be taken into account when comparing their results.

Characterisation of endogenous promoters in Physcomitrella patens
In addition to the heterologous promoters, we characterised the activity of two native Physcomitrella promoters, namely α1,3-fucosyltransferase and β1,2-xylosyltransferase (fuc-t, xyl-t). Four consecutive inverse PCRs were carried out to determine 2087 bp of the 5'-flanking region of the fuc-t sequence (EMBL accession no. AJ618932, see Additional file 1), whereas 1430 bp of the 5'-flanking-sequence of the xyl-t gene were identified by a single inverse PCR using genomic DNA cut with NcoI (EMBL accession no. AJ618933, see Additional file 2).
Figure 3. The TMV omega element did not mediate translation enhancement in Physcomitrella patens.
To ensure valid comparison with the heterologous promoter activity studies (Fig. 2), the CaMV 35S activity was again used as the standard. The two promoters identified from Physcomitrella patens, the 5'-sequences of the fuc-t and xyl-t, exhibited expression levels of 194% and 31%, respectively (Fig. 5A).
Four deletion constructs were examined for the functional analysis of the fuc-t promoter region (Fig. 5B). The first deletion construct plucFTP-∆1 included 969 bp of the 5'-region, of which 930 bp were identified as an intron within the 5'-untranslated region. This construct showed an activity of 12% compared with the entire 5'-sequence. This type of 5'-intron has reportedly evolved to contain transcriptional regulators such as enhancer elements, to further extend their involvement in the control of gene expression [e.g. [40]]. Buchman and Berg [41], for example, observed that the expression of some genes is highly dependent on the presence of a 5'-intron. Given this connection, intron-mediated enhancement is not restricted to the original promoter or coding region [42].
Figure 4. A modified sequence directly upstream of the start codon slightly increases transcription efficiency.
To further characterise the role of the fuc-t 5'-intron, three 5'-deletion constructs (plucFTP-∆2, plucFTP-∆3 and plucFTP-∆4) were created, each about 250 bp shorter than the previous construct. Compared to the first deletion construct, plucFTP-∆1, plucFTP-∆2 caused an almost threefold increase in luciferase activity (Fig. 5B; from 12 to 32%). Deletion construct plucFTP-∆3 showed a residual activity of around 5% compared to plucFTP, a significant decrease of luciferase activity in relation to plucFTP-∆2 (Fig. 5B). The shortest deletion construct, plucFTP-∆4, did not induce sufficient gene expression (below 1%; Fig. 5B). Previous studies analysing the rice Act1 promoter, which is also included in our study [43], demonstrated that introns significantly contribute to gene expression levels in rice and maize [42]. However, McElroy et al. [23] did not detect promoter activity in the intron itself. The intron-mediated stimulation of gene expression was associated with an in-vivo requirement for efficient intron splicing [43]. PlucFTP-∆1, where the potential promoter sequence mainly consists of the 5'-intron, contained 17 bp and 22 bp, respectively, of untranslated sequence information at the intron-flanking sides. These sequences may ensure effective intron splicing, in contrast to the intron sequence analysed in the study of McElroy et al. [23]. This result corresponds to earlier observations [e.g. [44]] suggesting that the degree of enhancement depends upon the exon sequences flanking the intron, and that the sequences surrounding the introns influence the efficiency of intron splicing.
Nevertheless, the increase in luciferase activity induced by plucFTP-∆2, compared with plucFTP-∆1, suggests that the 250 bp missing in plucFTP-∆2 comprise repressor-binding elements. Therefore the intron sequence by itself must influence the level of gene expression. Accordingly, a G-box motif was identified at position -852, relative to the translation start site, which was described to act as transcriptional repressor (http://www.dna.affrc.go.jp/htdocs/PLACE/signalscan.html; an assortment of possible transcription-influencing elements is illustrated in Fig. 5D). The significant decrease in reporter expression observed between plucFTP-∆2 and plucFTP-∆3 strongly suggests that the sequence from position -719 to position -484 is essential for expression efficiency of the fuc-t first intron. Moreover, we identified a region of 14 TA repeats in the middle of the intron (position -583), which might serve as an alternative binding site for the basal transcription initiation complex. Additionally, potential regulatory elements were recognised in the sequence between position -719 and position -484, for instance two CAAT boxes at position -1457 and -1552. Regarding the complete 5'-sequence of the fucosyltransferase, several presumptive TATA boxes were identified at position -1741, -1603, -1513 and -1409. They are all located in the 5'-region of the complete fucosyltransferase promoter sequence and missing in the deletion constructs. Outside of the intron sequence, several CAAT boxes were identified, for example at position -1762, -1673 and -631.
Since the deletion constructs plucFTP-∆2, plucFTP-∆3 and plucFTP-∆4 are probably not able to remove the leader intron, one could assume that the relative activity of the luciferase reporter could result from altered translational efficiency. However, we think that the differences in luciferase activity observed with the deletion constructs are due to altered transcript abundance rather than an altered 5'-UTR sequence. Several features of leader sequences located within the 5'-untranslated region (UTR), like stable hairpin structures [45,46], and the presence of AUG triplets [47], have been shown to influence mRNA translational efficiency. However, the average length of eukaryotic 5'-UTRs varies from 90 to 170 bases [47]. The 5'-UTRs of plucFTP-∆2, plucFTP-∆3 and plucFTP-∆4 encompass about 250 to 750 bases, whereas the last 250 bp are identical in all deletion constructs. Therefore it seems unlikely that features located further upstream strongly influence the translation efficiency.
To determine the minimal length of the putative xyl-t promoter sequence required for maximal luciferase activity, four 5'-deletion constructs were created (Fig. 5C). Partial deletion of the 5'-sequence resulted in a concomitant reduction of reporter gene expression. PlucXTP-∆1 exhibited half of the activity of plucXTP, suggesting that important regulatory elements have to be located between position -1430 and -1139. Plasmid plucXTP-∆1 showed 49%, plucXTP-∆2 33% and plucXTP-∆3 17% activity compared with the complete putative xyl-t promoter sequence. PlucXTP-∆4 only showed a very weak reporter gene expression equivalent to about 5% of plucXTP activity. Due to a steady decrease in expression efficiency displayed by the 5'-deletion constructs, increased activity with the introduction of additional 5'-sequence information is to be expected. In the xylosyltransferase upstream region, two possible TATA boxes were identified at position -1107 and -704. At position -873 there is a 30-base-pair poly(T) region that can serve as a functional genetic element [48]. Three putative CCAAT boxes were recognised, one of them located at position -735 in the reverse strand (r). Several CAAT boxes are present, for example at position -1141 and -816.
Both of the Physcomitrella 5'-upstream sequences in question are AT-rich (AT content is 61% in case of the fucosyltransferase and 57% for the xylosyltransferase 5'-untranslated region), while the AT content of translated sequences is below 50% [6].
Thus, we identified two endogenous promoter fragments that can be used to drive the expression of heterologous genes in Physcomitrella. The entire 5'-sequence of the fuc-t gene, which encompasses 2087 bp, induces reasonable expression levels of the luciferase reporter gene at 194% compared to the CaMV 35S promoter. Even the second deletion construct plucFTP-∆2, containing 716 bp, still showed 32% of the activity of plucFTP and therefore roughly 60% of the activity of the CaMV 35S promoter (0.32 × 194% ≈ 62%). The 1430 bp of the 5'-sequence of the xyl-t gene confer an expression efficiency of 31% in relation to the CaMV 35S promoter. Therefore these sequences are much more efficient at driving gene expression than the nos promoter (2.4%), for example.
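The percentages in this section use two different denominators: the deletion constructs are given relative to the full-length plucFTP, whereas plucFTP itself is given relative to the CaMV 35S promoter. A minimal sketch of the conversion to a common scale is shown below. It uses only the rounded percentages quoted in the text; the function and variable names are illustrative. With these rounded inputs, plucFTP-∆2 comes out at roughly 60% of the 35S reference.

```python
REFERENCE = "CaMV 35S"        # reference promoter, set to 100% in the text
FUC_T_FULL_VS_35S = 194.0     # plucFTP, percent of the 35S reference

# Deletion constructs, percent of the full-length plucFTP activity (from the text).
fuc_t_deletions_vs_full = {
    "plucFTP-∆1": 12.0,
    "plucFTP-∆2": 32.0,
    "plucFTP-∆3": 5.0,        # "around 5%" in the text
}


def rescale(percent_of_parent, parent_percent_of_reference):
    """Convert 'percent of parent construct' into 'percent of the reference'."""
    return percent_of_parent * parent_percent_of_reference / 100.0


deletions_vs_35s = {
    name: rescale(percent, FUC_T_FULL_VS_35S)
    for name, percent in fuc_t_deletions_vs_full.items()
}

for name, percent in deletions_vs_35s.items():
    print(f"{name}: {percent:.0f}% of {REFERENCE}")
```

On this common scale the values can be compared directly with the 31% reported for the xyl-t sequence and the 2.4% reported for the nos promoter.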

Conclusions
The various heterologous and endogenous promoters whose activity in Physcomitrella was quantified in these studies provide valuable tools for basic plant research and biotechnology applications. Availability of a set of differentially active promoters in a particular system is important both for the construction of expression vectors used in analysis of gene function and for large-scale production of recombinant proteins. The results presented here provide evidence that the general transcriptional and translational mechanisms of the moss Physcomitrella resemble those in other eukaryotic organisms, despite the evolutionary distance between them. Beyond the specific applications described in the Physcomitrella system, our set of expression vectors will provide a valuable resource for genetic engineering of lower and of higher plants.

Plasmid construction
All constructs were based on the vector pUC18 supplemented by several restriction sites (EcoRV, NaeI, NotI, NheI, BglII, XhoI, NcoI and SpeI). For this purpose, the vector pUC18 was cut inside the multiple cloning site (SalI and HindIII) and a double-stranded oligonucleotide containing the additional restriction sites was inserted.
The sequence of the firefly luciferase and the CaMV 35S terminator from plasmid pGN35S-luc+ (kindly provided by Prof. Gunther Neuhaus, Freiburg University) was introduced into the modified pUC18 vector via NcoI (providing the ATG luciferase start codon) and PstI.
A synthetic polyadenylation (polyA) signal for background reduction was inserted into the EcoRI site upstream of the additionally introduced restriction sites. The signal was PCR-amplified from the plasmid pGL3basic (Promega, Mannheim, Germany) (primers: polyA-a and polyA-b; all oligonucleotides used as primers are shown in Tab. 1, see Additional file 3) and subcloned into the vector pCR®4-TOPO® (Invitrogen, Karlsruhe, Germany), out of which it was cut with EcoRI for further cloning. This yielded the firefly luciferase construct pluc into which the transcriptional and translational enhancer elements were introduced afterwards (Fig. 1A).
The TMV (tobacco mosaic virus) omega translation enhancer [49] was cut out of pGN35S-luc+ and inserted into the basic vector pluc using the restriction sites XhoI and NcoI to yield pluc-Ω (Fig. 1A). With the aim of changing the sequence precisely in front of the translation initiation codon ATG, in accordance with Kozak [33] and Luetcke et al. [34], the five nucleotides CTCAA were inserted between the restriction sites XhoI and NcoI (NcoI including the translation initiation codon) of pluc to give pluc-enh (Fig. 1A).
All of the promoter sequences were PCR-amplified and the amplification products subcloned into the vector pCR®4-TOPO® (Invitrogen, Karlsruhe, Germany). Fragments of the heterologous promoters, except CaMV 35S(long), were excised with SalI and XhoI and subsequently inserted between the SalI and XhoI sites of the three basic vectors pluc, pluc-Ω and pluc-enh, respectively. The CaMV 35S(long) promoter was inserted into the basic vectors using the restriction sites EcoRI and XhoI. The 5'-sequences of the Physcomitrella glycosyltransferases fuc-t and xyl-t were introduced into pluc using the restriction sites SalI and NcoI. To avoid background activity resulting from the vector backbone, all firefly luciferase promoter plasmids were linearized using restriction enzyme EcoRI, which cuts 33 bp upstream of the start of the corresponding promoter sequence, at the 5'-end of the multiple cloning site.
The sequence of the CaMV 35S(1x) promoter was amplified by PCR from the vector pRT101 [50] using the primers 35S(1x)-a and 35S(1x)-b. For the amplification of the CaMV 35S(long) promoter sequence, the vector mAV4 [51] was used as a PCR template with the primers 35S(long)-a and 35S(long)-b. The plasmid pGN35S-luc+ was employed for amplification of the CaMV 35S(2x) promoter sequence using the primers 35S(2x)-a and 35S(2x)-b. For the construction of plasmids carrying the rice Act1 promoter [43], the Act1 5'-region was PCR-amplified from the vector pDM302 [[52], kindly provided by Prof. Ray Wu, Cornell University, Ithaca, New York], using the primers Act1-a and Act1-b. The plasmid pEGFP-N1 (BD Biosciences Clontech, Heidelberg, Germany) served as template for the amplification of the human cytomegalovirus (CMV) promoter with the primers CMV-a and CMV-b. The nos promoter sequence was PCR-amplified with the primers nos-a and nos-b using plasmid pBSNNN [8] as a template. Amplification of the simian virus (SV) 40 promoter sequence from the vector pSG5 (Stratagene, Amsterdam, The Netherlands) was carried out with the primers SV40-a and SV40-b.
The 5'-sequences of the Physcomitrella glycosyltransferases identified by inverse PCR were amplified using the primers FT-P-a and FT-P-b in the case of the fucosyltransferase promoter and XT-P-a and XT-P-b for the xylosyltransferase promoter, respectively. The promoter region of deletion construct plucFTP-∆1 was amplified with the primers FTP-∆1 and FT-P-b. PlucFTP-∆1 served as a template for the amplification of the promoter sequences of plucFTP-∆2, plucFTP-∆3 and plucFTP-∆4 with the forward primers FTP-∆2, FTP-∆3 or FTP-∆4, respectively, and the reverse primer FT-P-b. The promoter regions of the xylosyltransferase deletion constructs were amplified with the forward primers XTP-∆1, XTP-∆2, XTP-∆3 or XTP-∆4, respectively, and XT-P-b as reverse primer, using plucXTP as template.
The resulting amplification products were inserted into pluc using restriction sites SalI and XhoI.
Taken together, the XT fusion contains the sequence upstream of position -1 relative to the translation start codon. For cloning reasons the G at position -1 was replaced by the nucleotides CC followed by the luciferase coding sequence. The FT fusion contains the 5'-untranslated sequence upstream of position -2. The nucleotides A and T at positions -2 and -1 were replaced by CC followed by the luciferase coding sequence.
For the creation of the Renilla luciferase control plasmid (pRluc) the firefly luciferase sequence was exchanged with the Renilla luciferase sequence in the vector pluc-35S(long), using XhoI and XbaI (Fig. 1B). The sequence of the luciferase control reporter gene was amplified from the plasmid pRL-CMV (Promega), using the primers Rluc-a and Rluc-b.

Identification of endogenous promoter sequences via inverse PCR
The 5'-regions of the Physcomitrella patens fucosyl- and xylosyltransferase genes were identified by inverse PCR (I-PCR). Genomic DNA (3-5 µg) was digested with 30 units of various restriction endonucleases in a total volume of 30 µl at 37°C for two hours. One endonuclease (e.g. BamHI, BspHI, NcoI, NdeI, SphI) was used per approach. The digested DNA was purified and eluted in a volume of 50 µl of elution buffer (EB; Qiagen, Hilden, Germany). Prior to any further treatment, 10 µl of the eluate were analysed on an agarose gel (0.5%).
Purified DNA was re-ligated in a total volume of 300 µl for two hours at room temperature and for an additional two days at 4°C. Before the addition of the enzyme ligation mixture, the DNA was incubated at 50°C for five minutes and then quenched on ice, in order to melt sticky end base pairing. After ethanol precipitation with 0.3 M Na-acetate (pH 4.8) and two washes with 70% ethanol, the re-ligated DNA was resuspended in 200 µl elution buffer (theoretically: ~10 ng/µl).
I-PCR was done with 0.25 µl Advantage™ cDNA Polymerase Mix (implying proofreading activity) and buffer (including 3.5 mM Mg(OAc)2, both BD Biosciences Clontech), 0.2 mM of each primer and 0.2 mM dNTPs and 1 to 3 µl of the religated genomic DNA in a total volume of 25 µl. Cycling conditions were: an initial step of 2 minutes at 96°C, then 20 seconds at 94°C, 10 seconds initially at 61°C and 10 minutes at 68°C, with 35 to 40 repetitions, followed by a terminal step of 20 minutes at 68°C and cooling to 4°C at the end of the program. The primers were designed based on genomic fucosyl- and xylosyltransferase sequences, respectively [15]. PCR products were eluted from agarose gels (elution was done in a volume of 30 µl) and either cloned directly in pCR®4-TOPO® vector or used as templates for reconfirmation via nested PCRs. In the latter case, the gel-eluted, nested PCR products were cloned in pCR®4-TOPO®. Sequences derived from the I-PCR were confirmed by amplification of undigested genomic DNA with primers located at the 5'-end of the fucosyl- or xylosyltransferase on the one hand and at the 5'-end of the newly identified 5'-region on the other hand. The resulting products were also cloned in pCR®4-TOPO® and served as templates for the cloning of the luciferase-promoter constructs.

Plant material, protoplast isolation and transfection
The moss Physcomitrella patens (Hedw.) B.S.G. was propagated as an axenic suspension in modified

Luciferase assay
Cells were harvested by centrifugation for 10 min at 160 × g in a microcentrifuge at room temperature. After discarding the supernatant, the cells were frozen in liquid nitrogen and stored at -80°C. The cells were lysed by incubation in 80 µl passive lysis buffer (PLB; Promega) and additionally homogenised in a 2 ml tube using a micropestle at maximum speed (2000 rpm) for 45 seconds on ice. Measurement of luciferase activity in the lysates (20 µl each) was performed using the dual-luciferase reporter assay system, according to the manufacturer's recommendation (Promega). Light emission was measured immediately at room temperature using a luminometer (Berthold Lumat LB 9507, Bad Wildbad, Germany). The initial 10-second integral of light emission was recorded. The relative luciferase activity was calculated as the ratio between the firefly luciferase and the control Renilla luciferase activity. This transient expression assay was repeated independently 3 to 11 times for each construct, with all samples measured in triplicate.
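One way to combine triplicate measurements with independent repetitions is sketched below. It assumes that the firefly/Renilla ratios of each triplicate are averaged first and that the mean and standard deviation are then taken across the independent experiments. The text does not state how the replicates were aggregated, so this order, the function names and the example readings are all assumptions made for illustration.

```python
from statistics import mean, stdev


def experiment_ratio(triplicate):
    """Mean firefly/Renilla ratio over the replicate readings of one experiment."""
    return mean([firefly / renilla for firefly, renilla in triplicate])


def summarise(experiments):
    """Return (mean, standard deviation, n) across independent experiments."""
    ratios = [experiment_ratio(triplicate) for triplicate in experiments]
    spread = stdev(ratios) if len(ratios) > 1 else 0.0
    return mean(ratios), spread, len(ratios)


# Hypothetical (firefly, Renilla) readings: three experiments, each in triplicate.
experiments = [
    [(200.0, 100.0), (210.0, 100.0), (190.0, 100.0)],
    [(300.0, 100.0), (300.0, 100.0), (300.0, 100.0)],
    [(400.0, 100.0), (410.0, 100.0), (390.0, 100.0)],
]

average, spread, n = summarise(experiments)
print(f"relative luciferase activity: {average:.2f} +/- {spread:.2f} (n = {n})")
```

Treating each independent transfection as one data point keeps the replicate readings of a single lysate from inflating the apparent sample size.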

Authors' contributions
VH performed most of the described experiments and prepared the manuscript. CMH participated in the analysis of the FTP deletion constructs. WJ optimised the inverse PCR assay. RR heads the Chair of Plant Biotechnology at Freiburg University. ELD was responsible for the design and development of these studies and is therefore the corresponding author. All authors read and approved the final manuscript.