Generation of an external guide sequence library for a reverse genetic screen in Caenorhabditis elegans

Background A method for inhibiting the expression of particular genes using external guide sequences (EGSs) has been developed in bacteria, mammalian cells and maize cells. Results To examine whether EGS technology can be used to down-regulate gene expression in Caenorhabditis elegans (C. elegans), we generated EGS-Ngfp-lacZ and EGS-Mtgfp that are targeted against Ngfp-lacZ and Mtgfp mRNA, respectively. These EGSs were introduced, both separately and together, into the C. elegans strain PD4251, which contains Ngfp-lacZ and Mtgfp. Consequently, the expression levels of Ngfp-lacZ and Mtgfp were affected by EGS-Ngfp-lacZ and EGS-Mtgfp, respectively. We further generated an EGS library that contains a randomized antisense domain of tRNA-derived EGS ("3/4 EGS"). Examination of the composition of the EGS library showed that there was no obvious bias in the cloning of certain EGSs. A subset of EGSs was randomly chosen for screening in the C. elegans strain N2. About 6% of these EGSs induced abnormal phenotypes such as P0 slow postembryonic growth, P0 larval arrest, P0 larval lethality and P0 sterility. Of these, EGS-35 and EGS-83 caused the greatest phenotype changes, and their target mRNAs were identified as ZK858.7 mRNA and Lin-13 mRNA, respectively. Conclusion EGS technology can be used to down-regulate gene expression in C. elegans. The EGS library is a research tool for reverse genetic screening in C. elegans. These observations are potentially of great importance to further our understanding and use of C. elegans genomics.


Background
RNase P catalyzes the maturation of 5'-termini of all tRNAs by a single endonucleolytic cleavage of their precursors [1]. This enzyme is found in cells from all three domains of life: the Bacteria, Eukarya and Archaea [2][3][4][5]. One of the unique features of RNase P is its ability to recognize the structures, rather than the sequences, of tRNAs; this allows the enzyme to cleave other substrates with similar structure to the tRNA precursor. Accordingly, any complex of two RNA molecules that resembles a tRNA molecule can be recognized and cleaved by RNase P [6][7][8]. One of the two RNA molecules that form the complex is termed the external guide sequence (EGS). In principle, an mRNA sequence can be targeted for RNase P cleavage by hybridization with an EGS, which directs RNase P to the cleavage site. Subsequent studies have shown that EGS technology can be used to down-regulate gene expression in many organisms, such as bacteria [9][10][11][12], mammalian cells [13][14][15][16][17][18][19] and maize cells [20].
Nucleic-acid-based gene-interference strategies, such as anti-sense oligonucleotides, ribozymes, and RNAi, are powerful research tools and promising therapeutic agents for human diseases [21][22][23][24][25]. Each technology has advantages and limitations in terms of targeting efficacy and specificity [26]. Compared with other nucleic-acid-based gene-interference strategies, such as the RNAi approach that induces the cellular RISC RNase to cleave a target mRNA [26,27], targeted cleavage of mRNA by RNase P using an EGS is a unique approach that can be used to inactivate any RNA of known sequence expressed in vivo. Moreover, two types of interaction govern the targeting specificity of an EGS [3,19]. One is the Watson-Crick base-pairing interaction between the anti-sense domain of an EGS and the accessible region of a target mRNA. The other is the interaction between a target mRNA and the other domains of an EGS, which are required for folding of the RNase P-recognizable tertiary structure.
Several EGSs derived from natural tRNA sequences have been shown to be effective in blocking gene expression in bacteria [12,28] and mammalian cells [29]. For example, the "3/4 EGS" resembles three-quarters of the tRNA molecule and consists of two sequence elements: a targeting sequence that is complementary to an accessible region of the target mRNA (most sequences in an mRNA are inaccessible owing to the secondary or tertiary structure of the RNA and/or the binding of proteins); and an RNase-P-recognizing sequence that is a portion of the tRNA sequence and is required for interacting with RNase P [8]. It has been demonstrated that the "3/4 EGS" effectively and specifically induces target mRNA cleavage by eukaryotic RNase P [8,28].
Phenotype changes have been associated with more than 1,500 C. elegans genes through a combination of RNAi screens, classical mutant screens and systematic gene knockout experiments [30][31][32][33][34][35][36][37][38][39][40][41][42]. Despite these successes, the functions of most of the approximately 20,000 predicted genes in the C. elegans genome remain elusive. Moreover, there were some clear differences in the results of RNAi screens conducted by different researchers. These differences were considered to result from different approaches and standards in RNAi screening. Furthermore, there was also 10 to 30% variability in the results of RNAi screens conducted by the same researcher according to the same procedure [30,31,34,[36][37][38][39][40]43,44]. The relative variability of the RNAi effect should be an important consideration before RNAi data are used as a starting point for new experiments [40]. In this study, we show that EGS technology can be used to down-regulate gene expression in C. elegans, and that the EGS library can facilitate a reverse genetic screen similar to that possible with an RNAi library.

Validation of EGS technology for down-regulating gene expression in C. elegans
There are two types of green fluorescent proteins (GFP) in C. elegans strain PD4251. Ngfp consists of a wild-type GFP and a nuclear-localization signal encoded by Ngfp-lacZ. Mtgfp consists of a wild-type GFP and a mitochondrial-localization signal encoded by Mtgfp [45]. EGSs that target Ngfp-lacZ or Mtgfp mRNA can be designed using RNA-folding software [46]. According to the rules of EGS design [28], the favorable accessible regions of Ngfp-lacZ (Fig. 1A) and Mtgfp mRNAs (Fig. 1B) were identified from all candidate accessible regions. The "3/4 EGS" (Fig. 1C) was used as the framework. The anti-sense sequence of the accessible region was introduced into the anti-sense domain of the framework. The "CCA" sequence [7,8,28,47,48] located at the 3'-terminus is important for the EGS effect. To protect the "CCA" sequence from being exposed directly to RNase, the "UUU" sequence was attached to its 3'-terminus. Two EGSs, EGS-Ngfp-lacZ (Fig. 1D) and EGS-Mtgfp (Fig. 1F), were constructed. Two additional EGSs, EGS-Ngfp-lacZ-D (Fig. 1E) and EGS-Mtgfp-D (Fig. 1G), were also constructed. EGS-Ngfp-lacZ-D and EGS-Mtgfp-D were derived from EGS-Ngfp-lacZ and EGS-Mtgfp, respectively, and contained point mutations (5'-TTC-3' → AAG) at the three highly conserved positions in the "T-loop" of these EGSs. These nucleotides have been found in most of the known, natural tRNA sequences [49] and are thought to be important for interactions between the tRNA domains and human RNase P [3]. Previous studies have shown that EGSs with these mutations prevented RNase P recognition and showed little activity in directing RNase-P-mediated cleavage [19,50,51].
The expression level of GFP mRNA was determined by quantitative PCR (QPCR) analysis (Fig. 3A and Table 1). Reductions of 34% and 40% in the expression level of GFP mRNA were observed in worms treated with EGS-Ngfp-lacZ and EGS-Mtgfp, respectively. There was a marked reduction of 96% in the expression level of GFP mRNA in worms treated with a mix of EGS-Ngfp-lacZ and EGS-Mtgfp. By contrast, the expression level of GFP mRNAs was reduced by <10% in worms treated with EGS-Ngfp-lacZ-D, EGS-Mtgfp-D, or a mix of EGS-Ngfp-lacZ-D and EGS-Mtgfp-D. These results indicate that the significant EGS-induced reductions in the target mRNA expression level were due to RNase P-mediated cleavage. The low level of inhibition in worms treated with the disabled EGSs was presumably due to an anti-sense effect of the EGS.
To examine the targeting specificity of EGS-Ngfp-lacZ and EGS-Mtgfp, the protein levels of Ngfp and Mtgfp were determined by Western-blot analysis (Fig. 3B and Table 1). Reductions of 56 ± 5% and less than 10% in the levels of Ngfp and Mtgfp proteins, respectively, were observed in worms treated with EGS-Ngfp-lacZ. Similarly, there were reductions of 70% and less than 10% in the levels of Mtgfp and Ngfp proteins, respectively, in worms treated with EGS-Mtgfp. Interestingly, greater reductions of 71 ± 6% and 95% in the levels of Ngfp and Mtgfp proteins, respectively, were observed in worms treated with a mix of EGS-Ngfp-lacZ and EGS-Mtgfp. By contrast, Ngfp and Mtgfp protein levels were reduced by <10% in worms treated with EGS-Ngfp-lacZ-D, EGS-Mtgfp-D or a mix of EGS-Ngfp-lacZ-D and EGS-Mtgfp-D. The small reductions in the Ngfp and Mtgfp protein expression levels in worms treated with these disabled EGSs were likely due to anti-sense effects of the EGSs. To locate the nuclei, worms were stained with Hoechst 33258 stain.

Generation of EGS library
The "3/4 EGS" (Fig. 4A) was used as a framework for the EGS library. The EGS library (Fig. 4B), which contains a randomized anti-sense domain of the "3/4 EGS", was generated by introducing the following modifications into the framework. First, the anti-sense domain was composed of random bases. Second, the "CCA" sequence [7,8,28,47,48] located at the 3'-terminus, which is important for the EGS effect, was retained. Third, to protect the "CCA" sequence from being exposed directly to RNase, the "UUU" sequence was attached to its 3'-terminus. The resulting EGS library is a collection that contains any EGS targeted to any target mRNA (Fig. 4C).
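As a rough illustration of what "randomized anti-sense domain" implies for library size, the sketch below counts the distinct sequences available when every randomized position can be any of the four bases. The number of randomized positions is an assumption here: the value 10 is taken from the Discussion, which states that ten nucleotides of the EGS base-pair with the target, and it may not match the exact primer design. The function names are hypothetical.

```python
def library_size(random_positions: int = 10) -> int:
    """Number of distinct anti-sense domains with fully random bases.

    Assumes each randomized position is an independent choice of A, C, G or U.
    The default of 10 positions is an assumption, not a value from the Methods.
    """
    return 4 ** random_positions


def sampled_fraction(clones_screened: int, random_positions: int = 10) -> float:
    """Upper bound on the fraction of the theoretical library that a screen covers."""
    return clones_screened / library_size(random_positions)


if __name__ == "__main__":
    print(library_size())          # theoretical complexity for 10 random positions
    print(sampled_fraction(300))   # share covered by a 300-EGS screen, at most
```

Under this assumption, a screen of a few hundred clones samples only a very small part of the theoretical sequence space, which is why random selection of clones is used rather than exhaustive testing.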
pET28a-LEGS, which contains the EGS library cassette under control of T7 promoter, was constructed (Fig. 5).
First, a primer pair of FLEGSp and RLEGSp was designed (Fig. 6). The partially randomized oligonucleotides FLEGSp and RLEGSp were each composed of two parts. One part acted as a primer to amplify pET28a-D, which is identical to pET28a but lacks the fragment between the T7 terminator and T7 promoter. The other part acted as a primer to amplify the EGS library cassette. Second, pET28a-LEGSL was amplified by PCR with the primer pair of FLEGSp and RLEGSp using pET28a as template. Third, pET28a-LEGS was constructed by self-ligation of pET28a-LEGSL and transformed into DH5α to screen for pET28a-EGS clones containing individual EGS cassettes.
In general, about 98% of pET28a-EGS clones have one HincII site, with the remaining 2% having two or three HincII sites. Their HincII digestion patterns were predicted by the NTI program (Fig. 7A). To examine the composition of the EGS library, 500 clones were chosen at random for restriction enzyme (HincII) analysis. Of these 500 clones, 94% (Fig. 7B) showed the HincII digestion pattern shown in Fig. 7A, lane RV1; the rest (see Additional file 1) showed the HincII digestion pattern shown in Fig. 7A, lane RV2. Sequence analysis was performed to determine the specific sequences; 94% were shown to have a unique EGS cassette sequence. Alignment analysis of these sequences (Fig. 7C) showed no bias in the cloning of certain EGS cassettes.
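The prediction of HincII digestion patterns can be sketched in a few lines. The example below is not the NTI program used in this work; it is a minimal stand-in that scans a circular plasmid sequence for the HincII recognition site (GTYRAC, blunt cut after GTY) and reports fragment lengths. A randomized cassette can by chance create an extra site, which would give a clone two or more fragments instead of one. The function names and the toy sequence are hypothetical.

```python
import re

# HincII recognizes GTYRAC (Y = C or T, R = A or G) and cuts after GTY.
HINCII = re.compile(r"GT[CT][AG]AC")


def hincii_cut_positions(plasmid: str) -> list[int]:
    """Return sorted cut positions (0-based) on a circular DNA sequence."""
    seq = plasmid.upper()
    n = len(seq)
    # Append the first 5 bases so that a site spanning the origin is detected.
    extended = seq + seq[:5]
    return sorted({(m.start() + 3) % n for m in HINCII.finditer(extended)})


def fragment_lengths(plasmid: str) -> list[int]:
    """Return HincII fragment lengths for a circular plasmid, largest first."""
    cuts = hincii_cut_positions(plasmid)
    if not cuts:
        return []  # no site: the circle stays uncut
    n = len(plasmid)
    lengths = [b - a for a, b in zip(cuts, cuts[1:])]
    lengths.append(n - cuts[-1] + cuts[0])  # fragment that wraps around the origin
    return sorted(lengths, reverse=True)


if __name__ == "__main__":
    toy = "A" * 10 + "GTCGAC" + "A" * 10 + "GTTAAC" + "A" * 8  # toy 40-bp circle
    print(hincii_cut_positions(toy), fragment_lengths(toy))
```

A clone with a single site yields one linear fragment of full plasmid length, whereas a clone with two or three sites yields two or three fragments, so the two classes are distinguishable on a gel.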

Validation of EGS library for reverse genetic screen in C. elegans
To examine whether the EGS library can be used for a reverse genetic screen in C. elegans, 300 unique EGSs were randomly selected and used for screening of the C. elegans strain N2. The screening procedure is shown schematically in Fig. 8. First, the EGS clone IVTT containing an EGS cassette controlled by the T7 promoter was amplified by PCR with the primers Fclone-IVTT and Rclone-IVTT, using the pET28a-EGS clone as a template (Fig. 8A, B). An EGS clone was transcribed by T7 RNA polymerase using the purified EGS-clone IVTT as a template (Fig. 8A, C). Second, synchronous cultures of N2 worms were soaked in EGS solution. These worms were individually transferred to new plates with food, and phenotypes of both P0 worms and F1 progenies were recorded (Fig. 8D). All phenotypes visible under the dissection microscope were recorded. Such phenotypes included sterility, slow postembryonic growth, larval arrest, larval lethality, abnormal morphology, and uncoordination. About 6% of EGSs induced abnormal phenotypes, such as P0 slow postembryonic growth, P0 larval arrest, P0 larval lethality and P0 sterility (Table 2). Of these, EGS-35 and EGS-83 (Fig. 9A, C) caused the greatest phenotype changes (Table 2). The target mRNAs of EGS-35 and EGS-83 were identified by the following procedure. All candidate target mRNAs of an EGS were identified by a BLAST search of its target sequence (see Additional file 2). BLAST searches of all EGS-35 and EGS-83 candidate target sequences (Table 3) produced 12 and 34 candidate mRNAs (  (Fig. 9B, D). These small reductions in worms treated with the disabled EGS were likely due to anti-sense effects of the EGSs. These results indicate that the significant reductions in the levels of target mRNA expression (ZK858.7 mRNA and Lin-13 mRNA for EGS-35 and EGS-83, respectively) in worms treated with EGSs were due to EGS-directed RNase-P-mediated cleavage.
The phenotypes of worms with RNAi-ZK858.7 mRNA and RNAi-Lin-13 mRNA were similar to the phenotypes induced by EGS-35 and EGS-83, respectively (Table 2).
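The idea behind the candidate-target search can be illustrated with a simplified stand-in for BLAST. The sketch below takes the anti-sense sequence of an EGS, derives the site it would pair with (its reverse complement), and lists every transcript that contains an exact copy of that site. It assumes exact matching only, whereas BLAST also reports imperfect alignments; the transcript names and sequences are invented for illustration and are not data from this study.

```python
# Watson-Crick complement table for RNA.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(seq: str) -> str:
    """Return the reverse complement of a sequence, written as RNA."""
    return seq.upper().replace("T", "U").translate(COMPLEMENT)[::-1]


def candidate_targets(antisense: str, transcripts: dict[str, str]) -> dict[str, list[int]]:
    """Map each transcript name to the 0-based positions of exact target sites."""
    site = reverse_complement_rna(antisense)
    hits: dict[str, list[int]] = {}
    for name, mrna in transcripts.items():
        rna = mrna.upper().replace("T", "U")
        positions = [i for i in range(len(rna) - len(site) + 1)
                     if rna.startswith(site, i)]
        if positions:
            hits[name] = positions
    return hits


if __name__ == "__main__":
    toy_transcripts = {                      # hypothetical sequences
        "geneA": "GGG" + "UUACGUACGU" + "CCC",
        "geneB": "AAAAAAAAAAAAAAAA",
    }
    print(candidate_targets("ACGUACGUAA", toy_transcripts))
```

Because a short site can occur in many transcripts, a sequence search of this kind only yields candidates; in this work the actual target was then identified by measuring the expression level of each candidate mRNA by QPCR.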

Discussion
It has been shown that EGS technology can be used to down-regulate gene expression in bacteria [9][10][11][12], mammalian cells [13][14][15][16][17][18][19] and maize cells [20]. We have shown that EGS technology can also be used to down-regulate gene expression in C. elegans. Several criteria must be satisfied if successful EGS targeting is to be achieved. Among these are high cleavage efficiency, EGS target specificity, and efficient delivery of the reagent. We constructed EGS-Ngfp-lacZ and EGS-Mtgfp that target Ngfp-lacZ and Mtgfp mRNAs, respectively, and showed that these EGSs direct RNase P to cleave the targets efficiently. Moreover, we showed targeting specificity of these EGSs.   Table 1). This was probably due to anti-sense effects of the EGSs, and not to any overlap in the target sequence. The EGS methodology may be particularly effective when more than one site in a particular mRNA is targeted [12,16].
Many C. elegans genes have been associated with phenotypes through reverse genetic screens based on RNAi libraries. Despite the success of these screens, the functions of most of the approximately 20,000 predicted genes in the C. elegans genome remain elusive. Moreover, the limitations of RNAi, such as off-target effects [52][53][54] and relative variability in the RNAi effect [40], compromise the level of confidence in the results of these RNAi screens.
The EGS library aims to facilitate reverse genetic screens such as those with the RNAi library, and it will be useful for confirming RNAi phenotypes. For example, ZK858.7 and lin-13 genes were identified by a reverse genetic screen based on the EGS library. Remarkably, EGS-35 and EGS-83 efficiently and specifically interfered with ZK858.7 and lin-13, respectively. The target specificity of the EGS is governed by two different types of interactions [3,19]. One is the base-pairing interactions [3,17,19,55] in which the ten nucleotides in the EGS hybridize with the accessible region of the target mRNA. The EGS has two short, sequence-specific recognition elements that are oriented in space with respect to each other in a well-defined fashion. This complex recognition element provides the necessary specificity for RNase P. It is known that the ten nucleotides involved in base-pairing between the EGS and the target mRNA make it difficult to guarantee target specificity in C. elegans. Given the extensive secondary and tertiary structure associated with the RNA or the binding of proteins to the target RNA in vivo, the target sequences in cellular RNAs are not all accessible. The other type of interaction [3,17,19,55] is between the RNase P recognition domain (e.g., T-loop and stem) and the mRNA. This interaction facilitates the folding of the EGS-mRNA complex into a tRNA-like molecule and stabilizes the mRNA-EGS complex. An immediate corollary is that if two targets with a one-bp mismatch are compared, the same caveat on accessibility rules out any meaningful comment on specificity of targeting. Mutation of a single base in the target mRNA will not affect the methodology based on "stem EGS" because a single base mismatch in the complex with the target mRNA is unlikely to alter recognition by RNase P [9,12]. However, the location of the unpaired nucleotides is important because three contiguous unpaired bases might very well disallow the RNase P-mediated effects. 
It is known that an EGS could still function despite several point mutations between it and the bacterial target mRNA, depending precisely on the sequence of the unpaired bases [9]. The framework of EGS-35 and EGS-83 is the "3/4 EGS", which is distinguishable from the "stem EGS" by additional parts equivalent to the T-stem and T-loop, and variable regions, of a tRNA. The mismatch tolerance of the effects of EGS-35 and EGS-83 needs further study. Since the worms are cultured at 20°C, specificity
Figure 6. Demonstration of FLEGSp and RLEGSp. The partially randomized oligonucleotides FLEGSp and RLEGSp are composed of two parts. One is used to amplify pET28a-D, which is identical to pET28a but does not contain the fragment between the T7 terminator and T7 promoter. The other is used to amplify the EGS library cassette.

Conclusion
EGS technology can be used to interfere with gene expression in C. elegans. The EGS library can be used to facilitate a reverse genetic screen like that performed with an RNAi library, and it should be particularly useful for confirming RNAi phenotypes, as the functions of most of the approximately 20,000 predicted genes in the C. elegans genome remain elusive. Moreover, the limitations of RNAi, such as off-target effects and relative variability in the RNAi effect, compromise the level of confidence in RNAi screen results. Taken together, these observations are potentially of great importance for furthering our understanding and promoting the development of C. elegans genomics.

C. elegans, primers and vector
The N2 and PD4251 strains of C. elegans were provided by the Caenorhabditis Genetics Center (Univ. of Minnesota, St. Paul). The worms were maintained and handled as described previously [56]. Primers used in this work are listed in Table 8. The pET28a vector was purchased from Merck, Inc.

Synchronous cultures of C. elegans
Synchronous cultures of C. elegans were prepared basically as described previously [56]. The

Preparations of EGS-Ngfp-lacZ, EGS-Mtgfp
The EGSs that specifically target Ngfp-lacZ or Mtgfp mRNAs were designed using RNA-folding software [46]. According to the rules of EGS design [28], the favorable accessible regions of Ngfp-lacZ (Fig. 1A) and Mtgfp mRNAs (Fig. 1B) were identified from all candidate accessible regions. The "3/4 EGS" (Fig. 1C) was used as the design framework. The anti-sense sequence of the accessible region was introduced into the anti-sense domain of the design framework. The "CCA" sequence [7,8,28,47,48] located at the 3'-terminus is important for the EGS effect. To protect the "CCA" sequence from being exposed directly to RNase, the "UUU" sequence was attached to its 3'-terminus. To construct pET28a-EGS-Ngfp-lacZ and pET28a-Mtgfp, which contain EGS-Ngfp-lacZ and EGS-Mtgfp cassettes, respectively, under the control of the T7 promoter, primer pairs were designed using the NTI program (see Additional file 3) and synthesized with 5'-ter-  polymerase (Epicentre) using purified PCR products of EGS-Ngfp-IVTT and EGS-Mtgfp-IVTT, respectively, as templates.
The final RNA concentration varied from 6 to 10 mg/ml. Synchronous cultures of C. elegans strain PD4251 (containing 400 L1 larvae, 400 L2 larvae, 400 L3 larvae and 400 L4 larvae) in a volume of 400 μl 0.25 × M9 solution were added to EGS solution and shaken at 20°C for 24 hours. The treated worms underwent the following analyses. GFP fluorescence of PD4251 worms was imaged by microscope; to locate the nuclei, worms were stained with Hoechst 33258 (Sigma) according to the standard protocol. Total RNA was prepared as described in the "Experimental Procedures and Protocols for Total RNA Isolation" developed and provided by Stuart Kim's laboratory. Primers for quantitative real-time PCR (QPCR) were: eft-2 (eft-2-QPCR-F and eft-2-QPCR-R) and GFP (GFP-QPCR-F and GFP-QPCR-R). QPCR was performed using the PrimeScript™ RT reagent kit and the PrimeScript® Premix Ex Taq™ kit (TAKARA) according to the manufacturer's instructions. The expression level of GFP mRNA was normalized to the expression level of eft-2 mRNA. Protein was prepared according to the "Protocol of Protein prep from C. elegans and Western Analysis" provided by the Pasquinelli laboratory. Western-blot analysis was performed using the following antibodies: actin (I-19) (SANTA CRUZ sc-1616), GFP (B-2) (SANTA CRUZ sc-9996), bovine anti-mouse IgG-AP (SANTA CRUZ sc-2373), and donkey anti-goat IgG-AP (SANTA CRUZ sc-2022). The films were imaged using the UVP gel imaging analytical system (Upland, GDS-8000) and analyzed using Labworks software. Actin protein was used as an internal control.
The accession number refers to the GenBank database. The values shown are means derived from triplicate experiments, and values for the standard deviation that were less than 5% are not shown.

Construction of EGS library
To construct pET28a-LEGS, which contains the EGS library cassette under control of the T7 promoter, the primer pair of FLEGSp and RLEGSp was designed using the NTI program and synthesized with random bases at certain positions and 5'-terminal phosphorylation modifications. pET28a-LEGSL was amplified by PCR with the primer pair of FLEGSp and RLEGSp using pET28a as a template; the reaction conditions were 98°C for 60 s, 30 cycles of 98°C for 5 s, 70°C for 15 s and 72°C for 90 s, followed by 72°C for 10 min, in 50-μl volumes with Phusion DNA Polymerase (NEB: F-530S). One microgram of the purified PCR product of pET28a-LEGSL was self-ligated by T4 ligase (NEB) in a 1-ml volume at 15°C for 16 hours. The ligation product was purified and transformed into DH5α maximum efficiency competent cells (Invitrogen: 18258-012), and selection of bacterial clones was performed with 30 μg/ml kanamycin. Individual clones were selected at random for restriction enzyme digest with HincII and sequencing with the S-LEGS-F or S-LEGS-R primers.

Reverse genetic screen based on EGS
PCR amplification of the EGS clone IVTT was performed with the primer pair of Fclone-IVTT and Rclone-IVTT using the pET28a-EGS clone as a template; the reaction conditions were: 98°C for 60 s, 30 cycles of 98°C for 5 s, 70°C for 15 s and 72°C for 15 s, followed by 72°C for 10 min, in a 50-μl volume with Phusion DNA Polymerase (NEB: F-530S). The EGS clone was transcribed in vitro using T7 RNA polymerase (Epicentre) and the purified PCR product of the EGS clone IVTT as a template. The purified EGS clone was dissolved in 4 μl soaking buffer (10.9 mM Na2HPO4, 5.5 mM KH2PO4, 2.1 mM NaCl, 4.7 mM NH4Cl, 6 mM spermidine, and 0.1% gelatin) [34]. The final RNA concentration varied from 6 to 10 mg/ml. Purified synchronous cultures of C. elegans strain N2 (containing 3 L1 larvae, 3 L2 larvae, 3 L3 larvae and 3 L4 larvae) in a volume of 4 μl 0.25 × M9 solution were added to each EGS solution in 48-well PCR plates and shaken at 20°C for 24 hours. The worms were then transferred to new plates with food, and phenotypes of both P0 worms and F1 progenies were recorded.

Identification of target mRNA of EGS-35 and EGS-83
All candidate target mRNAs of an EGS were identified by BLAST analysis of the target sequence (see Additional file 2). The expression levels of all candidate target mRNAs in worms treated with EGS-35 or EGS-83 were analyzed by QPCR as described above. Primers for QPCR are listed in Tables 6, 7 and 8. Expression levels of candidate target mRNA were normalized to the expression level of the mRNA eft-2.
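Normalization of a target mRNA to eft-2 can be illustrated with the widely used comparative Ct (2^-ΔΔCt) calculation. The text does not state which quantification model was applied, so the sketch below is an assumption: it presumes roughly 100% amplification efficiency for both primer pairs and uses invented Ct values. The function names are hypothetical.

```python
def relative_expression(ct_target_treated: float, ct_ref_treated: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Relative target expression (treated vs. control) by the 2^-ddCt method.

    The reference gene plays the role of eft-2 in this work. Assumes that both
    amplicons double every cycle, which is an idealization.
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** (-(delta_treated - delta_control))


def percent_reduction(relative: float) -> float:
    """Convert a relative expression value into a percentage reduction."""
    return (1.0 - relative) * 100.0


if __name__ == "__main__":
    # Invented Ct values: target crosses threshold one cycle later after treatment.
    rel = relative_expression(25.0, 20.0, 24.0, 20.0)
    print(rel, percent_reduction(rel))
```

With these invented values the target appears one cycle later in treated worms while the reference is unchanged, which corresponds to half the control expression level.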

Authors' contributions
QY validated EGS technology in C. elegans, generated the EGS library, performed the reverse genetic screen and wrote the first draft of the manuscript. RZ generated the EGS library, performed the reverse genetic screen and participated in writing the manuscript. CY performed the reverse genetic screen. BZ performed the Western-blot analysis. WZ and WM initiated the project, designed the EGS library and finalized writing the manuscript. All authors read and approved the final manuscript.
S-EGS-R 5'-CCTGCCACCATACCCACGCC-3'
The "p" in the "sequence" column represents the modification by phosphorylation. Fclone-IVTT is the outline of the corresponding primer used in the specific experiment. Base-substitution mutations at three positions of the T-loop are indicated by bold text.