A fluorescent cassette-based strategy for engineering multiple domain fusion proteins

Background The engineering of fusion proteins has become increasingly important and most recently has formed the basis of many biosensors, protein purification systems, and classes of new drugs. Currently, most fusion proteins consist of three or fewer domains; however, more sophisticated designs could easily involve three or more domains. With traditional subcloning strategies, this requires micromanagement of restriction enzyme sites and leads to complex workaround solutions, if any solution exists at all. Results To aid in the efficient construction of fusion proteins involving multiple domains, we have created a new expression vector that allows us to rapidly generate a library of cassettes. Cassettes have a standard vector structure based on four specific restriction endonuclease sites and, by exploiting a subtle property of restriction enzymes that produce blunt or compatible cohesive ends, they can be fused in any order and any number of times. Furthermore, the insertion of PCR products into our expression vector and the recombination of cassettes are greatly simplified by screening for the presence or absence of fluorescence. Conclusions The utility of this new strategy was demonstrated by the creation of basic cassettes for protein targeting to subcellular organelles and for protein purification using multiple affinity tags.


Background
Conceptually, a fusion protein is constructed by joining two different domains to produce a new chimeric protein that retains the properties of the individual domains. For instance, a tumor necrosis factor (TNF) inhibitor was created by fusing the TNF receptor to the Fc domain of human immunoglobulin G; the fusion retains the ability to bind TNF and to be targeted by the immune system [1]. By using the principle of fluorescence resonance energy transfer, protein biosensors can be created from multiple domain fusions with fluorescent proteins to image cellular events such as Ca2+ signaling, phosphorylation, and caspase proteolytic cleavage [2]. In practice, such fusion proteins are created by inserting PCR products of the individual domains into an expression vector at the available restriction endonuclease sites. However, the flexibility of design is compromised because the choice of insertion sites limits the possible locations for future fusions into the same expression vector. In turn, many initially unplanned but simple extensions to existing fusion proteins cannot be constructed because the available sites are exhausted or incompatible. As this issue is exacerbated when constructing multiple domain fusion proteins, we have created a new expression vector for subcloning using a cassette-based strategy. A basic cassette contains the sequence of an individual domain and can be recombined with other cassettes irrespective of order, without progressively more complex management of sites. Therefore, the creation of a cassette library of commonly used domains facilitates the rapid prototyping of multiple domain fusion proteins that can perform numerous functions.

Results and discussion
Standard vector structure of the cassette
Our cassettes must have a standard vector structure in which one or more domains are flanked by cut sites 1 and 2a at the 5' end and by sites 2b and 3 at the 3' end (Figure 1a). Sites 1 and 3 can be selected arbitrarily, but sites 2a and 2b must be derived from different restriction enzymes that produce blunt or compatible cohesive ends. For example, there are two ways to create the AB fusion cassette from cassettes A and B (and likewise for the BA fusion cassette): ligate the insert from cassette B (cut at sites 2a and 3) to the host cassette A (cut at sites 2b and 3) (Figure 1b), or ligate the insert from cassette A (cut at sites 1 and 2b) to the host cassette B (cut at sites 1 and 2a) (Figure 1c). Since the ligation junction of sites 2a and 2b reproduces the recognition site of neither enzyme, it cannot be cut by either restriction enzyme. Therefore, the AB fusion cassette has the same standard vector structure and can be used for further fusions following the same concept. Note that if no more than one of the four enzyme sites is found inside the domain sequence of the cassette, it is still possible to create any fusion because cassettes can be fused on either the 5' or the 3' end.
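The closure property described above (a fusion cassette again has the standard structure) can be illustrated with a minimal Python sketch. It is a simplified top-strand model under stated assumptions, not the authors' software: site 2a is taken to be SpeI and site 2b to be NheI, the domain sequences are hypothetical placeholders, and spacer bases and reading frame are ignored.

```python
# Simplified top-strand model; spacer bases and reading frame are ignored.
SITE_1, SITE_3 = "CCATGG", "CTCGAG"    # NcoI and XhoI
SITE_2A, SITE_2B = "ACTAGT", "GCTAGC"  # SpeI (A^CTAGT) and NheI (G^CTAGC), assumed roles
CUT_OFFSET = 1  # both enzymes cut after the first base, leaving a CTAG overhang


def make_cassette(domain):
    """Build a basic cassette: site 1, site 2a, domain, site 2b, site 3."""
    return SITE_1 + SITE_2A + domain + SITE_2B + SITE_3


def fuse(upstream, downstream):
    """Figure 1b route: open the upstream host at 2b, take the downstream insert from 2a."""
    host_left = upstream[: upstream.index(SITE_2B) + CUT_OFFSET]
    insert_right = downstream[downstream.index(SITE_2A) + CUT_OFFSET:]
    return host_left + insert_right


def has_standard_structure(cassette):
    """True if sites 2a and 2b each occur once, at the 5' and 3' ends respectively."""
    return (cassette.startswith(SITE_1 + SITE_2A)
            and cassette.endswith(SITE_2B + SITE_3)
            and cassette.count(SITE_2A) == 1
            and cassette.count(SITE_2B) == 1)


# Hypothetical placeholder domains (not real coding sequences).
cassette_a = make_cassette("GGTGGTGGT")
cassette_b = make_cassette("TCTTCTTCT")
cassette_c = make_cassette("GAAGAAGAA")

fusion_ab = fuse(cassette_a, cassette_b)
fusion_abc = fuse(fusion_ab, cassette_c)
fusion_ba = fuse(cassette_b, cassette_a)
```

In this model the A/B junction reads GCTAGT, which neither SpeI nor NheI recognizes, so `fusion_ab` can itself be used as a host or an insert in the next round.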

pCfvtx embodies the standard vector structure and allows fluorescence screening
Our new expression vector, pCfvtx (Cassette Fused with Venus [3] in the pTriEx1.1-Hygro vector, Novagen), allows rapid subcloning of basic and fusion cassettes by screening for positive colonies using fluorescence (Figure 1d). This vector fixes sites 1 and 3 as NcoI and XhoI, respectively, but there are several choices for sites 2a and 2b: StuI and SmaI, NheI and SpeI, or BamHI and BglII. Since standardization of these specific sites is required to create a basic cassette, they must be added to the domain of interest by PCR before it is inserted into the vector. pCfvtx was constructed with a stop codon flanked by two multiple cloning sites (MCS1 and MCS2) upstream of Venus [3], a mutant variant of green fluorescent protein (GFP). When a fragment is subcloned into the vector between MCS1 and MCS2, the stop codon is removed and the fragment is fused with Venus, creating a fluorescent cassette. As the leaky expression of the fusion protein is enhanced by the presence of the T7lac promoter and the absence of the lacI repressor gene [4], positive colonies will be fluorescent on bacterial culture plates (Figure 3a). To create a non-fluorescent cassette, Venus can be removed by cutting with PmeI, performing a self-ligation and then screening for the absence of fluorescence.
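The requirement on sites 2a and 2b can be checked for the three enzyme pairs named above with a short Python sketch. The recognition sequences and cut positions are the standard ones for these enzymes; the function names are hypothetical, and the model assumes palindromic sites cut symmetrically, which holds for all six enzymes.

```python
# Recognition sequence (5'->3') and top-strand cut position for each enzyme.
ENZYMES = {
    "StuI": ("AGGCCT", 3),   # AGG^CCT, blunt
    "SmaI": ("CCCGGG", 3),   # CCC^GGG, blunt
    "NheI": ("GCTAGC", 1),   # G^CTAGC, CTAG overhang
    "SpeI": ("ACTAGT", 1),   # A^CTAGT, CTAG overhang
    "BamHI": ("GGATCC", 1),  # G^GATCC, GATC overhang
    "BglII": ("AGATCT", 1),  # A^GATCT, GATC overhang
}
PAIRS = [("StuI", "SmaI"), ("NheI", "SpeI"), ("BamHI", "BglII")]


def overhang(enzyme):
    """Single-stranded overhang left by a symmetric cut ('' for blunt cutters)."""
    site, cut = ENZYMES[enzyme]
    return site[cut:len(site) - cut]


def ends_compatible(enz_a, enz_b):
    """Ends can be ligated if both are blunt or carry the same overhang."""
    return ENZYMES[enz_a][1] == ENZYMES[enz_b][1] and overhang(enz_a) == overhang(enz_b)


def hybrid_junctions(enz_a, enz_b):
    """The two possible junction sequences after ligating an A end to a B end."""
    site_a, cut_a = ENZYMES[enz_a]
    site_b, cut_b = ENZYMES[enz_b]
    return site_a[:cut_a] + site_b[cut_b:], site_b[:cut_b] + site_a[cut_a:]


def junction_is_uncuttable(enz_a, enz_b):
    """True if neither hybrid junction matches either enzyme's recognition site."""
    sites = {ENZYMES[enz_a][0], ENZYMES[enz_b][0]}
    return all(j not in sites for j in hybrid_junctions(enz_a, enz_b))


for enz_a, enz_b in PAIRS:
    print(enz_a, enz_b, ends_compatible(enz_a, enz_b), hybrid_junctions(enz_a, enz_b))
```

Each listed pair yields ligatable ends whose junction is recognized by neither enzyme, which is the property that keeps the internal fusion point from being cut in later rounds.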

Fluorescence screening with Venus
Fluorescence screening with Venus has several advantages. First, Venus is the fastest folding and brightest GFP mutant to date [3]. Accordingly, positive colonies are fluorescent after overnight incubation, whereas other GFP variants may require several days. Second, a fluorescent colony ensures that the inserted fragment is in-frame and free of nonsense mutations. Third, the C-terminal fusion of GFP to target proteins is an effective assay for protein solubility and fold stability: the more fluorescent the fusion protein, the more soluble and well-folded the inserted fragment [5,6]. Lastly, any desired fusion cassette can be designed such that at each intermediate step a positive colony is selected by the presence or absence of fluorescence (Figure 2). As only one fluorescent or non-fluorescent colony is needed and the random gain or loss of this property is improbable, fluorescence is a robust reporter that tolerates much of the inefficiency in the subcloning process. In sum, through the use of fluorescence, subcloning is performed rapidly and precisely, making it possible to create many fusion cassettes efficiently in parallel.
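The sequence-level condition behind the in-frame argument can be sketched in Python. This is a minimal illustration with hypothetical function names: it assumes the insert is given in the reading frame set by the upstream start codon, and it checks only whether translation could read through into Venus. It says nothing about folding or solubility, which also affect fluorescence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq):
    """Split a sequence into complete in-frame codons, dropping any trailing bases."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def reads_through_to_venus(insert):
    """True if the insert keeps the downstream frame and has no in-frame stop codon."""
    keeps_frame = len(insert) % 3 == 0
    no_stop = not any(codon in STOP_CODONS for codon in codons(insert))
    return keeps_frame and no_stop


# Hypothetical inserts for illustration only.
print(reads_through_to_venus("ATGGGTTCT"))  # in-frame, no stop codon
print(reads_through_to_venus("ATGGGTTC"))   # one base short: frameshift
print(reads_through_to_venus("ATGTAAGGT"))  # in-frame TAA stop codon
```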

Protein purification cassettes
To demonstrate the utility of our cassette-based strategy, we first applied it to protein expression/purification systems, which often involve either an N-terminal or a C-terminal fusion of the target protein with an affinity tag. Cassettes were made using two popular tags: the 6xHis tag (Qiagen) and the glutathione S-transferase (GST) tag (Pharmacia) [7]. The N-terminal fusion of the 6xHis tag to Venus allows binding to Ni-NTA (nickel-nitrilotriacetic acid) agarose beads; however, a simple elution yields an impure sample (Figure 3b). The additional C-terminal fusion of GST to Venus allows the eluate from the first step to be bound to glutathione sepharose beads. Since the affinity tags flank the target protein and it is unlikely that a contaminating protein will bind non-specifically to both types of affinity beads, only full-length fusion proteins will be eluted from the glutathione beads. Note that the newly created 6xHis-Venus-GST fusion cassette is itself a useful affinity tag that could additionally be used to estimate protein expression above ~1 nM, as fluorescence intensity from the Venus domain is linearly proportional to target protein concentration. Finally, the flexibility of our cassette-based strategy opens new opportunities for the design of tandem affinity purification (TAP) tags [8], which were useful in protein complex purification in the yeast proteome [9]. The customization of TAP tags is desirable as the same affinity tag may not be suitable for all organisms [10].

Protein subcellular targeting cassettes
The creation of protein biosensors has allowed the observation of signaling events in single cells [11][12][13]. Such events are often confined to subcellular organelles such as the nucleus or endoplasmic reticulum, and therefore the ability to easily localize biosensors to these sites is important. The localization of proteins to specific organelles relies on vital cellular mechanisms that recognize leader sequences and signal peptides [14]. If a protein (such as the 6xHis-Venus-GST protein) is expressed in the cell without any localization peptides, it will be found in the cytoplasm (Figure 3c). To localize a target protein to the nucleolus, a cassette was created containing the protein transduction domain of human immunodeficiency virus (HIV) Tat [15]. When this cassette was N-terminally fused to Venus and transfected into COS-7 cells, fluorescence was most intense in the nucleolus (Figure 3d). To localize a protein to the lumen of the endoplasmic reticulum, one cassette was created containing the leader sequence from interleukin-4 and another containing the KDEL retention signal. When these cassettes were fused N- and C-terminally to Venus, respectively, the fusion protein localized to the endoplasmic reticulum (Figure 3e). In summary, the creation of these cassettes allows any cassette in our library to be localized to those organelles.

Conclusions
Using the pCfvtx vector as a starting point, basic cassettes are subcloned from target genes or domains of interest by PCR. These basic cassettes can then be recombined in any order and any number of times to create fusion cassettes of multiple domain proteins. Each step of the subcloning process is rapidly and reliably screened by the presence or absence of fluorescence. In contrast to the common β-Gal screen [16], our fluorescence approach may have applications in high-throughput structural genomics by identifying in-frame fragments with favorable folding and solubility properties. Moreover, unlike the fluorescence screen, the β-Gal screen relies on the inserted target sequence disrupting expression of the lacZ α-peptide, so a subsequent fusion cannot be screened by the same process. Finally, our fluorescent cassette-based strategy offers significant long-term advantages in protein engineering, as each new cassette enriches the functionality of the growing library of cassettes (Table 1). Thus, future designs can efficiently build on previous work to create progressively more complex and sophisticated fusion proteins that are capable of performing a wide range of functions.

Fluorescence screening
Vectors were transformed into E. coli strain DH5α and plated on LB (Luria Broth) agar with 100 µg/mL ampicillin. The culture plates were then incubated overnight at 37°C. Venus fluorescence was observed on the culture plate using the Lighttools Illuminatool Tunable Lighting System equipped with a 535 nm viewing filter and a 488 nm/10 nm filter cup.
The kdel-antisense primer contained XhoI and SpeI sites. The fragment was subcloned into the pInstx vector at the SpeI and XhoI sites. pKdeltx was created by cutting and self-ligating at the SpeI site. To create the pVenkdeltx vector, pVentx was cut with NcoI and NheI and the fragment was subcloned into pKdeltx at the NcoI and SpeI sites. To create pIl4venkdeltx, pVenkdeltx was cut with SpeI and XhoI and the fragment was subcloned into pIl4tx at the NheI and XhoI sites.
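The last two ligations above follow the two routes of Figure 1c and Figure 1b, respectively. A small Python sketch can write the digests down explicitly. The function `plan_fusion` and its route labels are hypothetical; the assignment of SpeI to site 2a and NheI to site 2b is inferred from the digests described in this section rather than stated outright.

```python
# Site roles; 2a = SpeI and 2b = NheI are inferred from the digests in the text.
SITE_1, SITE_2A, SITE_2B, SITE_3 = "NcoI", "SpeI", "NheI", "XhoI"


def plan_fusion(upstream, downstream, route):
    """Return the digests needed to place `upstream` N-terminal to `downstream`.

    route "1b": the downstream cassette donates the insert (cut at 2a and 3).
    route "1c": the upstream cassette donates the insert (cut at 1 and 2b).
    """
    if route == "1b":
        return {"insert": (downstream, SITE_2A, SITE_3),
                "host": (upstream, SITE_2B, SITE_3)}
    if route == "1c":
        return {"insert": (upstream, SITE_1, SITE_2B),
                "host": (downstream, SITE_1, SITE_2A)}
    raise ValueError("route must be '1b' or '1c'")


# Venus-KDEL, then IL-4 leader fused in front of Venus-KDEL.
step_1 = plan_fusion("pVentx", "pKdeltx", route="1c")
step_2 = plan_fusion("pIl4tx", "pVenkdeltx", route="1b")
print(step_1)
print(step_2)
```

Either route gives the same product, so the choice can be made according to which of the four sites, if any, also occurs inside a domain.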

Construction of the 6xHis-Venus-GST cassette (pHisvengsttx vector)
To create pGstvtx, GST was amplified from pGEX2T (Invitrogen) using primers gst-sense (5'-GACTAGTATGTCCCCTATACTAGGTTATTG-3') and gst-antisense (5'-GAAGATCTATCCGATTTTGGAGGATGGTCG-3'). The gst-sense and gst-antisense primers contained SpeI and BglII sites, respectively. The fragment was subcloned into the pCfvtx vector at the SpeI and BglII sites. pGsttx was created by cutting and self-ligation at the PmeI site. To create the pHisvtx vector, 5'-end phosphorylated primers his-sense (5'-CATGGGCCTGACTAGTGGCAGCAGCCACCACCACCACCACCACAGCAGCGGCG-3') and his-antisense (5'-CTAGCGCCGCTGCTGTGGTGGTGGTGGTGGTGGCTGCTGCCACTAGTCAGGCC-3') were annealed to each other and subcloned into the pCfvtx vector at the NcoI and NheI sites. pHistx was created by cutting and self-ligation at the PmeI site. To create the pHisventx vector, pVentx was cut with SpeI and XhoI and the fragment was subcloned into pHistx at the NheI and XhoI sites. To create pHisvengsttx, pHisventx was cut with NcoI and NheI and the fragment was subcloned into pGsttx at the NcoI and SpeI sites.
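The primer designs above can be sanity-checked with a short Python sketch. The sequences are copied from this section with the line-break hyphens removed; the helper names are hypothetical. The check confirms that the GST primers carry the stated sites and that the two 6xHis oligonucleotides anneal into a duplex with NcoI-compatible (CATG) and NheI-compatible (CTAG) 5' overhangs.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


gst_sense = "GACTAGTATGTCCCCTATACTAGGTTATTG"
gst_antisense = "GAAGATCTATCCGATTTTGGAGGATGGTCG"
his_sense = ("CATGGGCCTGACTAGTGGCAGCAGC"   # CATG overhang, SpeI site, linker
             "CACCACCACCACCACCAC"          # six histidine codons
             "AGCAGCGGCG")                 # linker
his_antisense = ("CTAGCGCCGCTGCT"          # CTAG overhang, linker
                 "GTGGTGGTGGTGGTGGTG"      # complement of the histidine codons
                 "GCTGCTGCCACTAGTCAGGCC")  # linker, SpeI site

SPEI, BGLII = "ACTAGT", "AGATCT"
OVERHANG = 4  # length of the 5' overhang on each annealed oligonucleotide

# Restriction sites stated for the GST primers.
gst_sites_ok = SPEI in gst_sense and BGLII in gst_antisense

# After annealing, everything except the two 4-nt 5' overhangs is base-paired.
duplex_ok = his_sense[OVERHANG:] == revcomp(his_antisense[OVERHANG:])
overhangs_ok = his_sense.startswith("CATG") and his_antisense.startswith("CTAG")

print(gst_sites_ok, duplex_ok, overhangs_ok)
```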