Genome-wide target profiling of piggyBac and Tol2 in HEK 293: pros and cons for gene discovery and gene therapy
© Meir et al; licensee BioMed Central Ltd. 2011
Received: 29 September 2010
Accepted: 30 March 2011
Published: 30 March 2011
DNA transposons have emerged as indispensable tools for manipulating vertebrate genomes with applications ranging from insertional mutagenesis and transgenesis to gene therapy. To fully explore the potential of two highly active DNA transposons, piggyBac and Tol2, as mammalian genetic tools, we have conducted a side-by-side comparison of the two transposon systems in the same setting to evaluate their advantages and disadvantages for use in gene therapy and gene discovery.
We have observed that (1) the Tol2 transposase (but not piggyBac) is highly sensitive to molecular engineering; (2) the piggyBac donor with only the 40 bp 3'- and 67 bp 5'-terminal repeat domains is sufficient for effective transposition; and (3) a small amount of piggyBac transposase results in robust transposition, suggesting that the piggyBac transposase is highly active. Performing genome-wide target profiling on data sets obtained by retrieving chromosomal targeting sequences from individual clones, we have identified several piggyBac and Tol2 hotspots and observed that (4) piggyBac and Tol2 display a clear difference in targeting preferences in the human genome. Finally, we have observed that (5) only sites with a particular sequence context can be targeted by either piggyBac or Tol2.
The non-overlapping targeting preference of piggyBac and Tol2 makes them complementary research tools for manipulating mammalian genomes. PiggyBac is the most promising transposon-based vector system for achieving site-specific targeting of therapeutic genes due to the flexibility of its transposase for being molecularly engineered. Insights from this study will provide a basis for engineering piggyBac transposases to achieve site-specific therapeutic gene targeting.
DNA transposons are natural genetic elements residing in the genome as repetitive sequences. A simple transposon is organized by terminal repeat domains (TRDs) embracing a gene encoding a catalytic protein, transposase, required for its relocation in the genome through a "cut-and-paste" mechanism. Since the first discovery of DNA transposons in maize by Barbara McClintock in 1950, transposons have been used extensively as genetic tools in invertebrates and in plants for transgenesis and insertional mutagenesis [2–7]. Such tools, however, were not available for genome manipulations in vertebrates or mammals until the reactivation of a Tc1/mariner-like element, Sleeping Beauty, from fossils in the salmonid fish genome. Since its awakening, Sleeping Beauty has been used as a tool for versatile genetic applications ranging from transgenesis to functional genomics and gene therapy in vertebrates including fish, frogs, mice, rats and humans. Subsequently, naturally existing transposons, such as Tol2 and piggyBac, have also been shown to effectively transpose in vertebrates.
The Medaka fish (Oryzias latipes) Tol2, belonging to the hAT family of transposons, is the first known naturally occurring active DNA transposon discovered in vertebrate genomes. Tol2 is a standard tool for manipulating zebrafish genomes and has been demonstrated to transpose effectively in frog, chicken, mouse and human cells as well. Recent studies found that Tol2 is an effective tool both for transgenesis via pronuclear microinjection and germline insertional mutagenesis in mice. Cabbage looper moth (Trichoplusia ni) piggyBac is the founder of the piggyBac superfamily and is widely used for mutagenesis and transgenesis in insects. Recently, piggyBac was shown to be highly active in mouse and human cells and has emerged as a promising vector system for chromosomal integration, including insertional mutagenesis in mice and nuclear reprogramming of mouse fibroblasts to induced-pluripotent stem cells [14–19].
To date, most gene therapy trials have utilized viral vectors for permanent gene transfer due to their high transduction rate and their ability to integrate therapeutic genes into host genomes for stable expression. However, serious problems associated with most viral vectors, such as limited cargo capacity, host immune response, and oncogenic insertions (as evidenced by retrovirus-based gene therapy), highlight an urgent need for developing effective non-viral therapeutic gene delivery systems [20, 21]. Recently, Sleeping Beauty, Tol2, and piggyBac transposon-based vector systems have been explored for their potential use in gene therapy with proven successes [22–25]. However, for therapeutic purposes, a large cargo capacity is often required. The transposition efficiency of Sleeping Beauty is reduced in a size-dependent manner, with a 50% reduction in its activity when the size of the transposon reaches 6 kb. Tol2 and piggyBac, however, are able to integrate up to 10 and 9.1 kb of foreign DNA into the host genome, respectively, without a significant reduction in their transposition activity [14, 22]. Additionally, by a direct comparison, we have observed that Tol2 and piggyBac are highly active in all mammalian cell types tested, unlike SB11 (a hyperactive Sleeping Beauty), which exhibits a moderate and tissue-dependent activity. Because of their high cargo capacity and high transposition activity in a broad range of vertebrate cell types, piggyBac and Tol2 are two promising tools for basic genetic studies and preclinical experimentation. Our goal here was to evaluate the pros and cons of piggyBac and Tol2 for use in gene therapy and gene discovery by performing a side-by-side comparison of both transposon systems. In this study, we report for the first time the identification of the shortest effective piggyBac TRDs as well as several piggyBac and Tol2 hotspots.
We also observed that piggyBac and Tol2 display non-overlapping targeting preferences, which makes them complementary research tools for manipulating mammalian genomes. Furthermore, piggyBac appears to be the most promising vector system for achieving specific targeting of therapeutic genes due to a robust enzymatic activity of the piggyBac transposase and flexibility the transposase displays towards molecular engineering. Finally, results of our in-depth analyses of piggyBac target sequences highlight the need to first scrutinize the piggyBac favored target sites for the therapeutic cell type of interest before designing a customized DNA binding protein for fusing with the piggyBac transposase to achieve site-specific therapeutic gene targeting.
Transposition activity of piggyBac and Tol2 in mammalian cells
With the ultimate goal of identifying and targeting safe sites in the genome at which to insert corrective genes, we previously explored three active mammalian transposases, piggyBac, Tol2 and SB11 (a hyperactive Sleeping Beauty) for their sensitivity to molecular modification. After fusing the GAL4 DNA binding domain to the N-terminus of the three transposases, we only detected a slight change in the activity of the piggyBac transposase, whereas the same modification nearly abolished the activity of Tol2 and SB11. A recent genetic screen has yielded a novel hyperactive Sleeping Beauty transposase (designated as SB100X) that was shown to be more active than piggyBac under restrictive conditions that support their peak activity. However, in this study we chose to focus on piggyBac and Tol2 but not Sleeping Beauty for the following reasons: (1) all of the reported attempts to modify the SB11 transposase either N- or C-terminally result in a complete elimination or a significant reduction in transposase activity; (2) Sleeping Beauty is more susceptible to overexpression inhibition than piggyBac and Tol2; (3) the cargo capacity of Sleeping Beauty is limited; and (4) unlike Tol2 and piggyBac, which are active in all mammalian cell types tested, Sleeping Beauty displays cell-type dependent activity [15, 27, 34].
To evaluate the activity of the piggyBac transposase, we then transfected a fixed amount of piggyBac donor (100 ng) with various amounts of helper plasmid bearing Myc-tagged piggyBac transposase (ranging from 50 to 300 ng) into HEK 293 cells. PiggyBac transposition activity increased with the amount of piggyBac transposase until reaching its peak in cells transfected with 200 ng of helper plasmid (Figure 2C). When the amount of piggyBac transposase was reduced to a level barely detectable by Western blotting, 68% of the peak transposition activity was still retained (compare lane 5 with lane 2 in Figure 2C), suggesting that the piggyBac transposase is highly active.
A global evaluation of Tol2 and piggyBac targeting preferences in the human genome
The data sets of piggyBac and Tol2 genome-wide target profiling in HEK 293
[Table body not recovered. Surviving column headers, partly truncated: "# of individual"; "successful rate in"; "# of individual"; "# of targets with a"; "# of targets mapped to the human genome*".]
To measure the distribution of piggyBac and Tol2 targets with regard to the gene density around the target sites, we counted the number of genes located within a 200 kb interval on either side of each target site. By this analysis, Tol2 tends to target regions with lower gene densities, particularly favoring regions with one to two genes located within a 200 kb window on either side of the insertion site (Figure 4B).
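The gene-density count described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `genes_near_site`, the toy coordinates, and the choice to count any gene that overlaps the window (rather than only genes fully contained in it) are assumptions made for illustration.

```python
WINDOW_BP = 200_000  # 200 kb on either side of the insertion site, as in the text


def genes_near_site(gene_intervals, site, window=WINDOW_BP):
    """Count genes overlapping [site - window, site + window].

    gene_intervals: iterable of (start, end) coordinates on one chromosome.
    Assumption: a gene counts if any part of it falls inside the window.
    """
    lo, hi = site - window, site + window
    return sum(1 for start, end in gene_intervals if start <= hi and end >= lo)


# Hypothetical gene coordinates and insertion site on a single chromosome.
toy_genes = [(50_000, 80_000), (250_000, 260_000),
             (600_000, 650_000), (900_000, 950_000)]
toy_site = 400_000

print(genes_near_site(toy_genes, toy_site))  # -> 2
```

Running this count for every mapped insertion and tabulating the results would give a gene-density distribution of the kind summarized in Figure 4B.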
Sequence analyses of Tol2 and piggyBac target sites
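The piggyBac and Tol2 hotspots in the HEK 293 genome
[Table body not recovered. Surviving column headers: "Near gene (distance bp)"; "Far gene (distance bp)".]
The piggyBac and Tol2 targets located within the repetitive sequences of the HEK 293 genome
[Table body not recovered. Surviving column headers: "Sequence (+1 ~ +30)"; "Types of repeats".]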
To study the nature of piggyBac target specificity, we next examined the neighboring sequences around five piggyBac hotspots. We observed that several TTAA tetranucleotides are located within a 100 bp interval of two piggyBac hotspots. The target sequences in B102-2 and B38-4 are identical and contain three TTAA tetranucleotides within a 100 bp interval upstream of the actual piggyBac TTAA target (Figure 5B). Similarly, the sequence of another piggyBac hotspot (as in B92-1 and B75-4), contains three TTAA tetranucleotides within the 100 bp interval downstream of the genuine TTAA piggyBac target site. A Blat search has identified another sequence which is located 3.3 Mb away and shares 99.5% sequence identity with the target site of B92-1 and B75-4. As detailed in the lower sequence of Figure 5B, a G (in red) to A substitution is identified at +88 on the other sequence where the piggyBac target site is designated as 0.
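As a rough illustration of the neighborhood scan described above, the following Python sketch counts additional TTAA tetranucleotides within 100 bp upstream and downstream of a targeted TTAA. The function name `count_flanking_ttaa` and the toy sequence are hypothetical; the actual hotspot sequences are those shown in Figure 5B.

```python
def count_flanking_ttaa(seq, target_idx, flank=100):
    """Return (upstream, downstream) counts of TTAA near a targeted TTAA.

    seq: genomic sequence as a string.
    target_idx: 0-based index of the first T of the targeted TTAA.
    flank: number of bases inspected on each side (100 bp in the text).
    """
    seq = seq.upper()
    if seq[target_idx:target_idx + 4] != "TTAA":
        raise ValueError("target_idx must point at a TTAA tetranucleotide")
    upstream = seq[max(0, target_idx - flank):target_idx]
    downstream = seq[target_idx + 4:target_idx + 4 + flank]
    # TTAA cannot overlap itself, so str.count finds every occurrence.
    return upstream.count("TTAA"), downstream.count("TTAA")


# Toy sequence: three TTAA sites upstream of the targeted TTAA at index 25.
toy_seq = "ACGTTAACG" + "GGTTAAGG" + "CCTTAACC" + "TTAA" + "GGGCCC"

print(count_flanking_ttaa(toy_seq, 25))  # -> (3, 0)
```

A count alone does not explain why only one of several nearby TTAA sites is used, which is the point the text goes on to address.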
The fact that piggyBac targeted repeatedly to the same TTAA but not to the adjacent TTAA tetranucleotides or to the TTAA site on another highly identical sequence nearby raises the possibility that the genuine TTAA piggyBac targets may be determined by some intrinsic sequence constraints flanking the target site. To further address this possibility, we focused on two other piggyBac target sequences, the B89-4 and B87-4 (Table 3). By a Blat search, we identified four sequences on chromosome 16 that share 100% sequence identity with one of the piggyBac hotspots, as in B89-4 and B77-4 (Table 3). We then performed a multiple sequence alignment on these four sequences. Although the primary sequence of these four sequences within a 200-bp interval on either side of the TTAA target site is almost identical, both B89-4 and B77-4 target the same TTAA tetranucleotide on the top sequence but not the other three similar sequences in Figure 5C. Another example, B87-4, was found to share at least 97% sequence identity with 510 sequences elsewhere in the human genome, yet none of these highly similar sequences were targeted by piggyBac (Table 3). To gain further insight into the nature of piggyBac target selection, we retrieved the top 184 sequences that share 99% sequence identity with the first 100 bp of the B87-4 target. As revealed by the sequence logo analysis, the primary sequence of these 184 sequences is highly conserved (Figure 5D). By designating the first T of TTAA as +1, the conserved A at -51 and C at +99 are changed to C and T, respectively, in the B87-4 target. Collectively, these observations strongly suggest that piggyBac does not target arbitrarily to any TTAA tetranucleotide in the human genome but rather to the TTAA sites in a specific sequence context.
The activity of genes nearby the piggyBac and Tol2 hotspots
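The comparison of a targeted sequence against highly similar, untargeted sequences can be sketched as follows. This is only an illustration in Python, not the sequence logo analysis used in the study: the helper names, the toy alignment, and the coordinate convention (first T of TTAA is +1, the base immediately upstream is -1, with no position 0) are assumptions.

```python
from collections import Counter


def consensus(aligned_seqs):
    """Majority base at each column of equal-length, pre-aligned sequences."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned_seqs))


def to_coord(idx, plus_one_idx):
    """Map a 0-based index to a coordinate where plus_one_idx is +1.

    Assumption: there is no position 0, so the base upstream of +1 is -1.
    """
    offset = idx - plus_one_idx
    return offset + 1 if offset >= 0 else offset


def differences(target, cons, plus_one_idx):
    """List (coordinate, consensus_base, target_base) where the two differ."""
    return [(to_coord(i, plus_one_idx), c, t)
            for i, (c, t) in enumerate(zip(cons, target)) if c != t]


# Toy data: three similar untargeted sequences and one targeted sequence.
similar = ["ACGTTAAGC", "ACGTTAAGC", "ACGTTAAGC"]
targeted = "CCGTTAAGT"          # first T of TTAA sits at 0-based index 3

print(differences(targeted, consensus(similar), plus_one_idx=3))
# -> [(-3, 'A', 'C'), (6, 'C', 'T')]
```

In this toy case the targeted sequence differs from the consensus at two flanking positions, analogous to the substitutions at -51 and +99 reported for the B87-4 target.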
Risk assessment of targeting within or near cancer-related genes by piggyBac and Tol2
A list of cancer-related genes targeted by piggyBac or Tol2
Targeted Cancer Gene
hematopoietically expressed homeobox
POU class 6 homeobox 2
ADP-ribosylation factor 4
SMAD family member 5
RAB40B, member RAS oncogene family
transcription factor 4
tyrosine kinase, non-receptor
sema domain, immunoglobulin domain (Ig), (semaphorin) 3C precursor
alkaline phosphatase, liver/bone/kidney
par-3 partitioning defective 3 homolog (C. elegans)
Ras association (RalGDS/AF-6) domain family member 3
low density lipoprotein-related protein 1B
motor neuron and pancreas homeobox 1
Indian hedgehog homolog (Drosophila)
FK506 binding protein 1A, 12kDa
forkhead box P1
RAN binding protein 9
epidermal growth factor (beta-urogastrone)
The longer the foreign sequences introduced into the host genome, the greater the probability of evoking adverse consequences, such as transgene silencing and dysregulation of the endogenous genes nearby. Hence, for both basic research and clinical applications, a transposon system with the smallest terminal repeats for genetic manipulations is desired. By removing most of the nonfunctional sequences of piggyBac and Tol2 TRDs, we observed a 1.5- and 3.3-fold increase in transposition activity for piggyBac and Tol2, respectively. The increase in transposition activity for both piggyBac and Tol2 is unlikely to be due to their reduction in size, since the piggyBac element in the pXLBacII-cassette and the Tol2 element in the Tol2ends-cassette are both within their maximal cargo capacity of 9.1 kb and 10 kb, respectively [14, 22]. In general, the transposition activity of a transposon negatively correlates with the fitness of the host. Although in most cases the activity of transposons in the host is abolished due to mutations and deletions, some transposons are intact but are completely silenced epigenetically by host defense mechanisms. For example, RNAi is the mechanism for silencing the Tc1 DNA transposon in the germ line of Caenorhabditis elegans. Unlike pXL-BacII-cassette, which consists only of 245 bp-left and 313 bp-right TRDs, the Tol2end-cassette preserves most of the non-coding cis sequences of the wild-type Tol2 transposon. These "non-essential sequences" may be susceptible to epigenetic silencing and in turn attenuate transposition activity. This possibility may explain why the extra cis sequences in Tol2ends-cassette have a greater impact in deregulating transposition activity than those of pXLBacII-cassette. This observation further implicates a possible interaction between epigenetic silencing factors and the cis sequences of wild-type transposons, and of Tol2 in particular. Studies are now underway to address this possibility.
Unlike our findings that pPB-cassette3short, with short TRDs at the ends, results in a higher activity than its long counterpart in HEK 293, attempts to transform D. melanogaster with p(PZ)-Bac-EYFP consisting of 35-bp 3'TRD and 63-bp 5'TRD yielded transformation frequencies far lower than full-length piggyBac constructs (reduced from 15% to 0.6%). This discrepancy may simply reflect the differences in the components and/or the mechanism involved in transposition between mammalian and insect cells. It is also possible that the extra 5 and 4 nucleotides included in our 3'- and 5'-TRD, respectively, are crucial for an effective transposition. Another important feature of our functional piggyBac terminal sequences (referred to as micro-PB hereafter) is that most of the activator sequences (as underlined in Figure 1A) identified previously in D. melanogaster are excluded. In this respect, the micro-PB may potentially be a safer cis-piggyBac element as a mammalian genetic tool as compared to the minimal piggyBac cis-sequence identified previously. Studies are now underway to address whether micro-PB exhibits any enhancer or silencer activity.
Genome-wide targeting profiles of piggyBac and Tol2 in the human genome have been previously reported [31–34]. All of these analyses utilized chromosomal target sequences that were retrieved either by plasmid rescue from a heterogeneous population of targeted cells or by PCR-based strategies using a limited amount of genomic DNA isolated from individual targeted clones grown on 96-well plates. Several factors may introduce strong biases into the data sets obtained in these studies, including (1) differences in proliferation rates of the individual targeted cells, (2) intrinsic difficulties in retrieving certain targeting sequences, and (3) biases in obtaining PCR products from certain templates but not from others. Hence, to fully evaluate the pros and cons of piggyBac and Tol2 for gene discovery and gene therapy, a direct comparison of their genome-wide targeting profiles based on reliable data sets obtained within the same experimental setting was needed. To achieve this goal, we utilized a labor-intensive strategy involving isolating, expanding, and performing plasmid rescue to retrieve chromosomal targeting sequences for each individual HEK 293 clone targeted. Based on the following observations, we believe the data sets established in this study provide reliable insights into the targeting profiles of piggyBac and Tol2. First, we successfully rescued plasmids from 87% and 91% of piggyBac and Tol2 targeted clones, respectively, and the majority of clones that were not rescued failed because of a lack of sufficient genomic DNA for performing plasmid rescue. Second, several copies of an identical plasmid were often obtained in the same targeted clones, suggesting that most, if not all, inserts in the same clones were successfully recovered. Third, for each individual clone targeted, we normally obtained 1-4 different inserts, consistent with a recent report that the copy number of Tol2 and piggyBac in HeLa cells ranges between 1-3 and 1-4, respectively.
Identifying targeted sites in individual clones has led to the identification of piggyBac and Tol2 hotspots and allowed us to perform a detailed and unbiased analysis on target site preferences for both transposon systems.
All piggyBac and Tol2 hotspots identified in this study are likely to be bona fide for the following reasons. First, the protocol (as detailed in the methodology section) used to isolate individual targeted clones is intentionally designed to avoid cross-contamination between individual drug-resistant colonies. Second, all of the target sequences in this study were retrieved using plasmid rescue rather than a PCR-based strategy. A small amount of contaminating genomic DNA, if any, is not sufficient for a successful plasmid rescue. Third, the four Tol2 targets mapped to the hotspot located in the SIRPD locus were derived from two separate experiments, suggesting the occurrence of independent targeting events at this particular site in the HEK 293 genome. Finally, all of the piggyBac and Tol2 clones with a hotspot targeted contain additional integrations mapped to distinct chromosomal locations (data not shown), indicating that all of these targeted clones were indeed independent. Our analyses of Tol2 have revealed a distinct global targeting distribution among 23 human chromosomes in HEK 293, which stands in sharp contrast to the reported Tol2 distribution in HeLa cells (compare Figure 4 with Figure 6A in reference 34). Distinct Tol2 genome-wide targeting profiles in HEK 293 and HeLa cells seem to reflect their difference in frequency of targeting to different genomic contexts. For instance, our analyses revealed 23.5% and 15.4% of Tol2 intronic and exonic targeting frequency in HEK 293, respectively (Figure 5A), while the reported intronic and exonic targeting rates of Tol2 in HeLa cells are 45.1% and 3.5%, respectively (Table 2 in reference 34). Discrepancies in the frequency of Tol2 targeting to various repeat types between our study and others were also detected. Two factors may account for the observed discrepancies: (1) differences in strategies, and (2) differences in Tol2 targeting preferences in HEK 293 and HeLa cells.
The former factor should not substantially contribute to the great difference in targeting preferences seen in the two separate studies, since even if one approach is less biased than the other, a certain degree of overlap in Tol2 target distributions should still be detected in both human cell types. However, this is not the case. Hence, the non-overlapping Tol2 target profiles are likely due to differences in cell types. As for piggyBac, although its intragenic target rate in this study (51.6% in HEK 293) and in other studies (51.9% in primary T cells) is similar, we observed a much higher frequency of piggyBac targeting to untranslated regions in HEK 293 (15.8% total) than what was observed in primary T cells (1.7% total) (compare Figure 5A with data reported in reference 35). Additionally, we failed to detect any piggyBac targets that are found both in HEK 293 (this study) and in human T cells. Unlike the data set established in this study, the genome-wide piggyBac targets in primary T cells were obtained from a heterogeneous population of piggyBac targeted clones. Consequently, the data set obtained from primary T cells is inevitably biased toward the target sites that are easily retrieved by plasmid rescue, a factor that may contribute significantly to the sharp contrast in the targeting profiles of piggyBac observed in the two different cell types. However, our data set revealed five piggyBac hotspots in HEK 293 and yet no target in our data set is found in that of primary T cells, suggesting that cell type differences may still be the major contributing factor when explaining these observed differences. Furthermore, these differences were likely to be amplified by the fact that, unlike primary T cells, which contain the normal 46 chromosomes, HEK 293 is a transformed cell line with an aberrant karyotype of 64 chromosomes as characterized originally.
Collectively, comparisons of our data with those of others highlight the necessity for (1) obtaining a reliable data set for genome-wide target analyses (preferably by retrieving all target sequences for each individual targeted clone) and (2) re-evaluating the genome-wide target profile of transposons (at least piggyBac and Tol2) in the specific stem cell type of therapeutic interest before advancing them to clinical uses.
The reliable data sets obtained in this study allow us to perform in-depth sequence analyses of their targets without ambiguity. The sequence logo of Tol2 detected subtle but significant information present within the first 11 base pairs on the 3' end of Tol2 target sites. Furthermore, as indicated in Table 3, despite the fact that the target sequence of the most frequently targeted Tol2 hotspot (4 out of 207) is actually located within LINEs and shares more than 97% sequence identity with two other sequences in the genome, Tol2 only targeted this particular site but not the other similar sequences. Collectively, these observations strongly suggest that, even though no distinct features of Tol2 target sequences can be readily identified, Tol2, like piggyBac, also targets in a selective manner in the host genome. The in-depth sequence analyses also revealed the following important features of piggyBac targeting preference: (1) TTAA sites in a particular sequence context are targeted by piggyBac, as opposed to arbitrary TTAA sites, (2) there is no direct correlation between piggyBac hotspots (and Tol2 hotspots as well) and the activity of genes either contained within or near the hotspots, and (3) at least the first 100 nucleotides on either side of the piggyBac target site seem to be important for piggyBac target selection, and a subtle change in the primary sequence within this 200 bp interval may result in losing its potential for piggyBac targeting. These insights will provide a solid knowledge basis for engineering piggyBac transposase to achieve site-specific therapeutic gene targeting.
Powerful genetic tools enabling the probing of functions of both coding and non-coding genome sequences are urgently needed to facilitate progress in determining the genetic factors that contribute to our uniqueness as human beings in a post-genomic era. The fact that piggyBac favorably targets intragenic chromosomal regions makes it a great tool for uncovering the functions of protein coding genes. Transposable elements are often considered "junk" DNA in the human genome. An increasing body of evidence, however, suggests that a fraction of these repetitive sequences are active and play important roles in epigenetic gene regulation [43, 44, 46, 47]. The preference of Tol2 to target genomic repeats makes it an ideal tool for revealing new functions of transposable elements residing in our genome. Collectively, the non-overlapping genome-wide target profiles of piggyBac and Tol2 potentially make them complementary research tools for studying the human genome.
Genotoxicity caused by a single integration event mediated by the retrovirus-based vector has resulted in the development of T-cell leukemia in 5 of 20 patients treated for SCID, with one death reported. Hence, no wild type DNA transposon is considered safe for gene therapy since they all introduce transgenes into a host genome in a random fashion. Indeed, our genome-wide target profiling of piggyBac in HEK 293 revealed a piggyBac hotspot located within the coding region of gephyrin, a scaffold protein implicated in colon cancer and adult T-cell leukemia [40–42]. Most active mammalian genome manipulating enzymes, including viral integrases and DNA transposases, must therefore be molecularly modified to achieve the ultimate goal in gene therapy: targeting the therapeutic gene into a predetermined genomic site where the therapeutic gene can be stably and faithfully expressed without disturbing the global gene expression profile. Put into perspective, piggyBac is by far the most promising vector system for gene therapy, as piggyBac transposase is the only one capable of being molecularly modified without substantially losing activity (reference 15 and this study).
The transposon-based tool box for mammalian genomic manipulations is expanding. Here, we engaged in a side-by-side comparison of two highly effective mammalian active transposons, piggyBac and Tol2, to evaluate their pros and cons for gene discovery and gene therapy. We report the identification of the shortest piggyBac TRDs, micro-PB, which have a higher transposition efficiency in HEK 293 than that of the previously reported piggyBac minimal terminal repeat domains, mini-piggyBac. Our genome-wide target profiling reveals that piggyBac and Tol2 display complementary targeting preferences, making them suitable tools for uncovering the functions of protein-coding genes and transposable elements, respectively, in the human genome. Our results suggest that piggyBac is the most promising DNA transposon for gene therapy because its transposase is likely the most amenable mammalian genetic modifier for being molecularly engineered to achieve site-specific therapeutic gene targeting. Our in-depth sequence analyses of piggyBac targets revealed that the sequence context near and within a considerable distance from the TTAA piggyBac target site is highly important in site selection. Based on this observation, it is clear that in order to advance piggyBac for a clinical use in gene therapy, a safe and favorable site for piggyBac targeting in the genome of the appropriate therapeutic stem cell should first be identified, followed by the engineering of piggyBac transposase to achieve site-specific gene targeting.
The plasmid construction described in this study followed the protocol of Molecular Cloning, 3rd edition, CSHL. The sequences of all constructs involving PCR-based cloning were confirmed by DNA sequencing. The process of each construction is described briefly as follows:
The short piggyBac TRDs (i.e. 746~808 3' LTR and 1426~1460 5' LTR as in pXL-BacII [29, 30]) were obtained from the PCR mixture consisting of the following four pairs of primers; pB-11-KpnI (atcgggtaccttaaccctagaaagataatcatattg), pB-5-forward (ggtaccCCCTAGAAAGATAATCATATTGTGACGTACGTTAA AGATAATCATGCGTAAAATTGACGCATGctcgag), pB-6-reverse (gagctcCCCTAGAAAGATAGTCTGCGTAAAATTGACGCATGccaccgcggtggatttaa atctcgagcatgcgtca), and pB-12-SacI (cgatgagctcttaaccctagaaagatagtctgcg). The resulting amplicon, containing both the 67 bp 5' and 40 bp 3' TRD with SwaI and Xho I restriction sites in between, was cloned into pBS-SKII through the Kpn I and Sac I restriction sites to obtain pPBendAATT. The same cassette (containing the hygromycin resistance gene driven by SV40, the replication origin, ColE1, and the kanamycin resistance gene) as in pXLBacII-cassette was inserted between the short piggyBac TRDs in pPBendAATT through the blunt-ended Xho I site to make the intermediate construct, pPBcassette3. To generate pPB-cassette3short, pPBcassette3 was digested with Acc65 I and Afl III to remove the ampicillin resistance gene and the f1 replication origin. The remaining DNA fragment was blunt-ended followed by self-ligation to generate the final construct, pPB-cassette3short.
To construct the Tol2 donor with short TRDs, two separate PCR products were generated by two sets of primers, Tolshort-1 (atcgggtaccatttaaatCAGAGGTGTAAAGTACTTG)/Tolshort-2 (tatcaagcttagatctagAAGTGATCTCCAAAAAATAAG) and Tolshort-3 (ctaagcttgatatcaacggatccAATACTCAAGTACAATTTTAATGG)/Tolshort-4 (cgatgagctcatttaaatCAGAGGTGTAAAAAGTACTC), respectively, using the Tol2end-cassette as a template. Next, these two PCR products served as templates to produce the third PCR product using Tolshort-1 and Tolshort-4. The third PCR product was cloned into the Kpn I and Sac I sites of the pBS-SK II vector to generate the miniTol2-end. The same cassette as described in section (1) above was then inserted into the EcoR V site of miniTol2end to generate pTol2mini-cassette.
To generate pPRIG-piggyBac, the coding sequence of the piggyBac transposase was PCR amplified from pcDNA3.1Δneo-piggyBac  using primer piggyBac-10 (ATCGGAATTCACCATGGGTAGTTCTTTAGACG) and primer piggyBac-11 (AAGGCACAGTCGAGGCTG). The PCR product was cloned into the EcoR I and Not I site of the pPRIG vector.
The coding sequence of the Tol2 transposase was obtained from the Xba I/BamHI restriction fragment of pcDNA3.1Δneo-Tol2  and then inserted into the Stu I and BamHI sites of pPRIG vector.
The same fragment containing the ORF of piggyBac transposase as described in section (3) above was cloned into the pCMV-myc vector (Clontech, Inc) to generate pCMV-Myc-piggyBac.
A pair of complementary oligos containing the sequence of the HA tag was synthesized, annealed and inserted into the BamHI site of the pPRIG-Tol2 vector to generate pPRIG-HA-Tol2, which expresses an N-terminally HA-tagged Tol2 transposase. The clones with the correct orientation were obtained and verified by DNA sequencing.
pPRIG-Tol2-HA, expressing the C-terminally HA-tagged Tol2 transposase, was constructed by swapping the XcmI/SphI restriction fragment of pCR4-TOPO-Tol2HAc (the detailed procedure for the construction of this plasmid is available upon request) with that of pPRIG-Tol2.
Cell culture and transposition assay
HEK 293 cells were maintained in MEMα medium (HyClone) supplemented with 10% FBS (HyClone), 100 units/ml penicillin, and 100 μg/mL streptomycin. The details for the transposition assays were described previously .
Activity assay of the piggyBactransposase
A procedure similar to that detailed previously was used to co-transfect 100 ng of piggyBac donor with various amounts of the piggyBac helper, pCMV-Myc-piggyBac, ranging from 0 to 300 ng, into $1.2 \times 10^{5}$ HEK 293 cells. pcDNA3.1Δneo, an empty vector used in our previous study (Wu et al., 2006), was used to bring the total amount of transfected DNA to 400 ng. Each transfection condition was done in triplicate. Twenty-four hours after transfection, one fifth of the transfected cells were subjected to the transposition assay. The remaining transfected cells (four fifths) from the triplicates were pooled and grown in a 35-mm plate for another twenty-four hours before being subjected to Western blotting. For Western blotting, total proteins were extracted using RIPA buffer (50 mM Tris, pH 8.0, 150 mM NaCl, 1% Nonidet P-40, 0.1% SDS, 0.5% sodium deoxycholate, and 1:100 diluted proteinase inhibitor cocktail) and quantified using the Lowry assay (Bio-Rad). Twenty μg of total protein was separated by SDS-PAGE on an 8% acrylamide gel. After electrophoresis, the proteins were transferred from the gel to PVDF membranes (Millipore). The membrane was then probed with anti-Myc antibody (Clontech) at 1:1000 and anti-α-actin antibody (Calbiochem) at 1:10,000. After three washes, a secondary antibody, peroxidase-conjugated goat anti-mouse IgG, was added. After incubation and three washes, the secondary antibodies were detected by ECL.
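The DNA amounts in this titration follow simple arithmetic: the donor is fixed at 100 ng, the helper varies between 0 and 300 ng, and empty vector makes up the difference to 400 ng. The sketch below illustrates that bookkeeping. The individual helper doses in `helper_series_ng` are hypothetical, since the text states only the range.

```python
DONOR_NG = 100      # piggyBac donor per transfection (from the text)
TOTAL_DNA_NG = 400  # total DNA per transfection (from the text)


def filler_ng(helper_ng: float) -> float:
    """Empty-vector DNA needed to bring donor + helper up to the fixed total."""
    max_helper = TOTAL_DNA_NG - DONOR_NG
    if not 0 <= helper_ng <= max_helper:
        raise ValueError(f"helper must be between 0 and {max_helper} ng")
    return TOTAL_DNA_NG - DONOR_NG - helper_ng


# Hypothetical dose series spanning the 0-300 ng range given in the text.
helper_series_ng = [0, 50, 100, 200, 300]

for helper in helper_series_ng:
    print(f"donor {DONOR_NG} ng + helper {helper} ng + empty vector {filler_ng(helper)} ng")
```

Keeping the total DNA constant in this way means that differences in transposition between conditions can be attributed to the helper dose rather than to the amount of DNA transfected.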
Retrieving chromosomal sequences flanking the transposon targets by plasmid rescue
The same transfection procedure detailed previously was used to transfect the piggyBac donor, pXLBacII-cassette, and the Tol2 donor, Tol2ends-cassette, along with their corresponding helpers, pPRIG-piggyBac and pPRIG-Tol2, respectively, into HEK 293 cells using Fugene HD (Roche). The transposition efficiency for pXLBacII-cassette and Tol2ends-cassette was approximately 1-2%. To avoid isolating duplicates of the same targeted cell, twenty-four hours after the addition of Fugene HD, transfected cells were subjected to serial dilution and then grown in culture medium containing hygromycin (100 μg/ml) at a density (about 20-30 colonies per 100-mm plate, as estimated from the 1-2% transposition rate) that allowed individual colonies to be isolated without cross-contamination. Two weeks after selection, colonies that were well separated from neighboring colonies were individually cloned and expanded until reaching confluence on 100-mm dishes. Genomic DNA of individual clones was isolated and subjected to plasmid rescue. Detailed procedures for plasmid rescue were described previously. Plasmids rescued from the same targeted clone were digested with Hinf II (4-cutter restriction enzyme). For each targeted clone, only plasmids showing different Hinf II digestion patterns were subjected to sequencing. Based on the Hinf II digestion patterns, all of the isolated colonies displayed a distinct repertoire of rescued plasmids, indicating that each isolated colony was indeed derived from a different targeted cell.
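The selection of plasmids for sequencing amounts to keeping one representative per distinct digestion pattern within each clone. A minimal sketch of that step is shown below. It assumes each pattern has already been scored as a list of fragment sizes, and that identical patterns yield identical size lists. The plasmid names and fragment sizes are hypothetical and are not data from this study.

```python
def pick_unique_patterns(rescued_plasmids):
    """Return one plasmid ID per distinct digestion pattern.

    rescued_plasmids maps a plasmid ID to its fragment sizes in bp.
    The first plasmid seen with a given pattern is kept as representative.
    """
    representatives = {}
    for plasmid_id, fragments in rescued_plasmids.items():
        # Fragment order on a gel is irrelevant, so sort before comparing.
        pattern = tuple(sorted(fragments, reverse=True))
        representatives.setdefault(pattern, plasmid_id)
    return list(representatives.values())


# Hypothetical fragment sizes for plasmids rescued from one clone.
clone_plasmids = {
    "p1": [1200, 800, 300],
    "p2": [800, 1200, 300],  # same pattern as p1
    "p3": [1500, 500],
}

to_sequence = pick_unique_patterns(clone_plasmids)
print(to_sequence)
```

In practice, fragment sizes read from a gel carry measurement error, so a real comparison would need a size tolerance rather than exact matching.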
Q-PCR and Q-RT-PCR
HEK 293 cDNA was obtained using the FastLane Cell cDNA kit (Qiagen). A 1.3 μl aliquot of cDNA and 0.125 μg (predetermined by serial dilution of genomic DNA) of HEK 293 genomic DNA were subjected to Q-PCR using primers listed in 2. Q-RT-PCR was performed using SYBR Green PCR Master Mix (Applied Biosystems) in 20 μl reactions on a 7500 Fast Real-Time PCR System (Applied Biosystems). The expression level of individual transcripts was determined by dividing the copy number of each cDNA by the copy number of the corresponding gene using the following formula: $2^{(Ct_{\text{genomic DNA}} - Ct_{\text{cDNA}})}$. The relative expression level of each gene was calculated as the ratio of its expression level to that of GAPDH.
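The calculation above can be sketched in a few lines. The base of 2 implies the assumption that the template doubles in each PCR cycle. The Ct values used here are hypothetical and serve only to show the arithmetic.

```python
def expression_level(ct_genomic: float, ct_cdna: float) -> float:
    """cDNA copy number relative to the gene copy number: 2^(Ct_genomic - Ct_cDNA)."""
    return 2.0 ** (ct_genomic - ct_cdna)


def relative_to_reference(gene_cts, reference_cts) -> float:
    """Expression level of a gene divided by that of a reference gene (GAPDH in the text).

    Each argument is a (Ct_genomic_DNA, Ct_cDNA) pair.
    """
    return expression_level(*gene_cts) / expression_level(*reference_cts)


# Hypothetical Ct values: (genomic DNA, cDNA)
gene_of_interest = (25.0, 23.0)  # cDNA detected 2 cycles earlier than genomic DNA
gapdh = (25.0, 18.0)             # cDNA detected 7 cycles earlier than genomic DNA

print(expression_level(*gene_of_interest))             # 2^2 = 4.0
print(expression_level(*gapdh))                        # 2^7 = 128.0
print(relative_to_reference(gene_of_interest, gapdh))  # 4 / 128 = 0.03125
```

Normalizing each transcript to its own genomic template, as the formula does, compensates for differences in amplification between primer pairs before the comparison with GAPDH is made.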
Target sites were identified in build hg18 of the human genome using Blat, with a sequence identity cutoff of 95%. Human genes were obtained from RefSeq, and 2,075 cancer-related genes were taken from the CancerGenes database. When counting the number of genes within n-base intervals, all overlapping genes were first merged to avoid over-counting. CpG islands were taken from the UCSC Genome Browser "CpG Island" track, which identifies CpG islands based on the methods of Gardiner-Garden and Frommer. Repeat element predictions were obtained from RepeatMasker. Only insertions whose first 100 bases are contained within a repeat element were considered to overlap a repeat element. To estimate the significance of the tendency of insertions to be located proximal to CpG islands, we compared the number of insertions located within 2,000 bases of a CpG island to the number expected by chance. The expected number was calculated for each transposon type by picking N random regions in the genome of the same size (in bases) as the given transposon, where N is the total number of insertions for the given transposon. This procedure was repeated 1,000 times, and the mean and standard deviation of the number of random insertion points within 2,000 bases of a CpG island across the 1,000 random trials were used to obtain a Z-score (and associated P-value) for the actual number of insertions located within 2,000 bases of a CpG island.
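The interval merging and the randomization test described above can be illustrated with a simplified sketch. It makes several assumptions that go beyond the text: the genome is treated as a single linear coordinate system, insertions are reduced to single positions rather than regions of transposon size, and the P-value is taken from the upper tail of a normal distribution. All coordinates are toy values, not data from this study.

```python
import bisect
import random
import statistics
from math import erfc, sqrt


def merge_intervals(intervals):
    """Merge overlapping (start, end) intervals so shared bases are counted once."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def count_near(positions, islands, window=2000):
    """Count positions lying within `window` bases of any island."""
    padded = merge_intervals([(max(0, s - window), e + window) for s, e in islands])
    starts = [s for s, _ in padded]
    hits = 0
    for pos in positions:
        i = bisect.bisect_right(starts, pos) - 1  # last interval starting at or before pos
        if i >= 0 and pos <= padded[i][1]:
            hits += 1
    return hits


def cpg_z_score(insertions, islands, genome_length, trials=1000, window=2000, seed=0):
    """Compare the observed count with `trials` sets of N random positions."""
    rng = random.Random(seed)
    observed = count_near(insertions, islands, window)
    null_counts = [
        count_near([rng.randrange(genome_length) for _ in insertions], islands, window)
        for _ in range(trials)
    ]
    mean = statistics.mean(null_counts)
    sd = statistics.stdev(null_counts)
    if sd == 0:
        raise ValueError("random trials show no variation; Z-score undefined")
    z = (observed - mean) / sd
    p_upper = 0.5 * erfc(z / sqrt(2))  # assumed one-sided normal P-value
    return observed, mean, sd, z, p_upper


# Toy data: two islands and five insertion positions on a 1 Mb "genome".
toy_islands = [(100_000, 101_000), (500_000, 502_000)]
toy_insertions = [99_000, 100_500, 102_500, 501_000, 750_000]

observed, mean, sd, z, p_value = cpg_z_score(toy_insertions, toy_islands, 1_000_000)
print(observed, round(mean, 3), round(sd, 3), round(z, 1), p_value)
```

In this toy example four of the five insertions fall within 2,000 bases of an island, far more than random placement yields, so the Z-score is strongly positive. A faithful reimplementation would sample per chromosome and use regions of the transposon's length, as stated in the text.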
LINE: Long interspersed nuclear element
SINE: Short interspersed nuclear element
TRD: Terminal repeat domain
TIR: Terminal inverted repeat
PCR: Polymerase chain reaction
Q-RT-PCR: Quantitative reverse transcription PCR
The pPRIG vector was kindly provided by Dr. Patrick Martin, Université de Nice, Parc Valrose, 06108 Nice, France. The piggyBac vector system was kindly provided by Dr. Max Scott, Institute of Molecular BioSciences, Massey University, Palmerston North, New Zealand. The Tol2 vector system was kindly provided by Dr. Koichi Kawakami, National Institute of Genetics, Shizuoka, Japan. We thank Dr. Scott C. Schuyler for critical reading of the manuscript and Miss Chih-Jou Chen for assisting with the preparation of the manuscript. Dr. Sareina C.-Y. Wu was supported by funding from the Chang Gung Molecular Medicine Research Center (EMRPD190181) and in part by a grant from the Children's Medical Research Foundation and an NIH T32 training grant. Miss Pei-Cheng Chung was supported by funding from the Chang Gung Molecular Medicine Research Center (EMRPD190181). This study was supported by the Taiwan National Science Council (NMRPD180021), Chang Gung Memorial Hospital (CMRPD180041), and in part by YongKang Veterans Hospital (VHYK9804) and GenomeFrontier, INC.
- McClintock B: The origin and behavior of mutable loci in maize. Proc Natl Acad Sci USA. 1950, 36: 344-355. 10.1073/pnas.36.6.344.
- Hayes F: Transposon-based strategies for microbial functional genomics and proteomics. Annu Rev Genet. 2003, 37: 3-29. 10.1146/annurev.genet.37.110801.142807.
- Spradling AC, Rubin GM: Transposition of cloned P elements into Drosophila germ line chromosomes. Science. 1982, 218: 341-347. 10.1126/science.6289435.
- Bellen HJ, O'Kane CJ, Wilson C, Grossniklaus U, Pearson RK, Gehring WJ: P-element-mediated enhancer detection: A versatile method to study development in Drosophila. Genes Dev. 1989, 3: 1288-1300. 10.1101/gad.3.9.1288.
- Thibault ST, et al: A complementary transposon tool kit for Drosophila melanogaster using P and piggyBac. Nat Genet. 2004, 36: 283-287. 10.1038/ng1314.
- Plasterk RH: The Tc1/mariner transposon family. Curr Top Microbiol Immunol. 1996, 204: 125-143.
- Osborne BI, Baker B: Movers and shakers: Maize transposons as tools for analyzing other plant genomes. Curr Opin Cell Biol. 1995, 7: 406-413. 10.1016/0955-0674(95)80097-2.
- Ivics Z, Hackett PB, Plasterk RH, Izsvák Z: Molecular reconstruction of Sleeping Beauty, a Tc1-like transposon from fish, and its transposition in human cells. Cell. 1997, 91: 501-510. 10.1016/S0092-8674(00)80436-5.
- Ivics Z, Meng MA, Mátés L, Boeke JD, Nagy A, Bradley A, Izsvák Z: Transposon-mediated genome manipulation in vertebrates. Nature Methods. 2009, 6: 415-422. 10.1038/nmeth.1332.
- Koga A, Suzuki M, Inagaki H, Bessho Y, Hori H: Transposable element in fish. Nature. 1996, 383: 30. 10.1038/383030a0.
- Kawakami K: Tol2: a versatile gene transfer vector in vertebrates. Genome Biol. 2007, 8 (Suppl 1): S7. 10.1186/gb-2007-8-s1-s7.
- Ryan BJ, Wangensteen KJ, Balciunas D, Schmedt C, Ekker SC, Largaespada DA: Efficient Transposition of Tol2 in the Mouse Germline. Genetics. 2009, 183: 1565-1573. 10.1534/genetics.109.100768.
- Fraser MJ, Brusca JS, Smith GE, Summers MD: Transposon-mediated mutagenesis of a baculovirus. Virology. 1985, 145: 356-361. 10.1016/0042-6822(85)90172-2.
- Ding S, Wu X, Li G, Han M, Zhuang Y, Xu T: Efficient transposition of the piggyBac (PB) transposon in mammalian cells and mice. Cell. 2005, 122: 473-483. 10.1016/j.cell.2005.07.013.
- Wu SC, Meir JY, Coates CJ, Handler SM, Pelczar P, Kaminski JM: piggyBac is a flexible and highly active transposon as compared to Sleeping Beauty, Tol2 and Mos1 in mammalian cells. Proc Natl Acad Sci USA. 2006, 103: 15008-15013. 10.1073/pnas.0606979103.
- Wang W, Lin C, Lu D, Ning Z, Cox T, Melvin D, Wang X, Bradley A, Liu P: Chromosomal transposition of piggyBac in mouse embryonic stem cells. Proc Natl Acad Sci USA. 2008, 105: 9290-9295. 10.1073/pnas.0801017105.
- Liang Q, Kong J, Stalker J, Bradley A: Chromosomal mobilization and reintegration of Sleeping Beauty and piggyBac transposons. Genesis. 2009, 47: 404-408. 10.1002/dvg.20508.
- Woltjen K, Michael IP, Mohseni P, Desai R, Mileikovsky M, Hämäläinen R, Cowling R, Wang W, Liu P, Gertsenstein M, Kaji K, Sung HK, Nagy A: piggyBac transposition reprograms fibroblasts to induced pluripotent stem cells. Nature. 2009, 458: 766-770. 10.1038/nature07863.
- Yusa K, Rad R, Takeda J, Bradley A: Generation of transgene-free induced pluripotent mouse stem cells by the piggyBac transposon. Nat Methods. 2009, 6: 363-369. 10.1038/nmeth.1323.
- Hacein-Bey-Abina S, Garrigue A, Wang GP, Soulier J, Lim A, Morillon E, et al: Insertional oncogenesis in 4 patients after retrovirus-mediated gene therapy of SCID-X1. J Clin Invest. 2008, 118: 3132-3142. 10.1172/JCI35700.
- Deakin CT, Alexander IE, Kerridge I: Accepting risk in clinical research: is the gene therapy field becoming too risk-averse?. Mol Ther. 2009, 17: 1842-1848. 10.1038/mt.2009.223.
- Balciunas D, Wangensteen KJ, Wilber A, Bell J, Geurts A, Sivasubbu S, Wang X, Hackett PB, Largaespada DA, McIvor RS, Ekker SC: Harnessing a high cargo-capacity transposon for genetic applications in vertebrates. PLoS Genet. 2006, 2: e169. 10.1371/journal.pgen.0020169.
- VandenDriessche T, Ivics Z, Izsvák Z, Chuah MK: Emerging potential of transposons for gene therapy and generation of induced pluripotent stem cells. Blood. 2009, 114: 1461-1468. 10.1182/blood-2009-04-210427.
- Kang Y, Zhang X, Jiang W, Wu C, Chen C, Zheng Y, Gu J, Xu C: Tumor-directed gene therapy in mice using a composite nonviral gene delivery system consisting of the piggyBac transposon and polyethylenimine. BMC Cancer. 2009, 9: 126. 10.1186/1471-2407-9-126.
- Hackett PB, Largaespada DA, Cooper A: Transposon and transposase system for human application. Mol Ther. 2010, 18: 674-83. 10.1038/mt.2010.2.
- Geurts AM, Yang Y, Clark KJ, Liu G, Cui Z, Dupuy AJ, et al: Gene transfer into genomes of human cells by the Sleeping Beauty transposon system. Mol Ther. 2003, 8: 108-117. 10.1016/S1525-0016(03)00099-6.
- Kay MA: Site-directed transposon integration in human cells. Nucleic Acids Research. 2007, 35: e50. 10.1093/nar/gkm089.
- Mátés L, Chuah MK, Belay E, Jerchow B, Manoj N, Acosta-Sanchez A, et al: Molecular evolution of a novel hyperactive Sleeping Beauty transposase enables robust stable gene transfer in vertebrates. Nat Genet. 2009, 41: 753-61.
- Li X, Harrel RA, Handler AM, Beam T, Hennesy K, Fraser MJ: piggyBac Internal sequences are necessary for efficient transformation of target genomes. Insect Mol Biol. 2005, 14: 17-30. 10.1111/j.1365-2583.2004.00525.x.
- Fraser MJ: Home of piggyBac. [http://piggybac.bio.nd.edu/]
- Urasaki A, Morvan G, Kawakami K: Functional dissection of the Tol2 transposable element identified the minimal cis-sequence and a highly repetitive sequence in the subterminal region essential for transposition. Genetics. 2006, 174: 639-49. 10.1534/genetics.106.060244.
- Wu SC, Maragathavally KJ, Coates CJ, Joseph M, Kaminski JM: Steps toward targeted insertional mutagenesis with class II transposable elements. Methods in Molecular Biology. 2007, 435: 139-151.
- Wilson MH, Coates CJ, George AL: PiggyBac transposon-mediated gene transfer in human cells. Mol Ther. 2007, 5: 139-145. 10.1038/sj.mt.6300028.
- Grabundzija I, Irgang M, Mátés L, Belay E, Matrai J, Gogol-Döring A, Kawakami K, Chen W, Ruiz P, Chuah MK, VandenDriessche T, Ivics Z: Comparative analysis of transposable element vector systems in human cells. Mol Ther. 2010, 18: 1200-9. 10.1038/mt.2010.47.
- Galvan DL, Nakazawa Y, Kaja A, Kettlun C, Cooper LJ, Rooney CM, Wilson MH: Genome-wide mapping of piggyBac transposon integrations in primary Human T cells. J Immunother. 2009, 32: 837-44. 10.1097/CJI.0b013e3181b2914c.
- Huang X, Guo H, Tammana S, Jung YC, Mellgren E, Bassi P, Cao Q, Tu ZJ, Kim YC, Ekker SC, Wu X, Wang SM, Zhou X: Gene transfer efficiency and genome-wide integration profiling of Sleeping Beauty, Tol2, and piggybac transposons in human primary T Cells. Mol Ther. 2010, 6: 1-11.
- Kent WJ, Sugnet CW, Furey TS, et al: The Human Genome Browser at UCSC. Genome Res. 2002, 12: 996-1006. [http://genome.ucsc.edu/cgi-bin/hgGateway]
- Shaw G: The HEK293 Cell Database. [http://www.mbi.ufl.edu/~shaw/293.html]
- Fischer A, Hacein-Bey-Abina S, Cavazzana-Calvo M: 20 years of gene therapy for SCID. Nature Immunology. 2010, 11: 457-460. 10.1038/ni0610-457.
- Reddy-Alla S, Schmitt B, Birkenfeld J, Eulenburg V, Dutertre S, Böhringer C, Götz M, Betz H, Papadopoulos T: PH-Domain-driven targeting of collybistin but not Cdc42 activation is required for synaptic gephyrin clustering. European Journal of Neuroscience. 2010, 31: 1173-1184. 10.1111/j.1460-9568.2010.07149.x.
- Ruginis T, Taglia L, Matusiak D, Lee B-S, Benya RV: Consequence of Gastrin-Releasing Peptide Receptor Activation in a Human Colon Cancer Cell Line: A Proteomic Approach. Journal of Proteome Research. 2006, 5: 1460-1468. 10.1021/pr060005g.
- Ozawa T, Itoyama T, Sadamori N, Yamada Y, Hata T, Tomonaga M, Isobe M: Rapid isolation of viral integration site reveals frequent integration of HTLV-1 into expressed loci. J Hum Genet. 2004, 49: 154-165. 10.1007/s10038-004-0126-7.
- Slotkin RK, Martienssen R: Transposable elements and the epigenetic regulation of the genome. Nat Rev Genet. 2007, 8: 272-85. 10.1038/nrg2072.
- Sijen T, Plasterk RHA: Transposon silencing in the Caenorhabditis elegans germline by natural RNAi. Nature. 2003, 426: 310-314. 10.1038/nature02107.
- Shi X, Harrison RL, Hollister JR, Mohammed A, Fraser MJ, Jarvis DL: Construction and characterization of new piggyBac vectors for constitutive or inducible expression of heterologous gene pairs and the identification of a previously unrecognized activator sequence in piggyBac. BMC Biotechnology. 2007, 7: 5.
- Sanjida H, Haig HK: Many LINE1 elements contribute to the transcriptome of human somatic cells. Genome Biology. 2009, 10: R100. 10.1186/gb-2009-10-9-r100.
- Chow JC, Ciaudo C, Fazzari MJ, Mise N, Servant N, Glass JL, Attreed M, Avner P, Wutz A, Barillot E, Greally JM, Voinnet O, Heard E: LINE-1 activity in facultative heterochromatin formation during X chromosome inactivation. Cell. 2010, 141: 956-969. 10.1016/j.cell.2010.04.042.
- Sambrook J: Molecular Cloning A Laboratory Manual. 3
- Pruitt KD, Tatusova T, Klimke W, Maglott DR: NCBI Reference Sequences: current status, policy and new initiatives. Nucleic Acids Research. 2009, 37: D32-D36. 10.1093/nar/gkn721.
- Higgins ME, Claremont M, Major JE, Sander C, Lash AE: CancerGenes: a gene selection resource for cancer genome projects. Nucleic Acids Research. 2007, 35: D721-D726. 10.1093/nar/gkl811.
- Gardiner-Garden M, Frommer M: CpG islands in vertebrate genomes. J Mol Biol. 1987, 196: 261-82. 10.1016/0022-2836(87)90689-9.
- Smit AFA, Hubley R, Green P: RepeatMasker Open-3.0. 1996, [http://www.repeatmasker.org]
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.