Collagen is the most abundant class of proteins in the extracellular matrix, and it interacts dynamically with its biological microenvironment. The fabrication of de novo, synthetic collagen in which specific sequences can be prescribed therefore has tremendous potential in tissue engineering, drug delivery applications, and fundamental biological studies of the extracellular matrix. Two challenges stand in the way: (1) the synthesis of genes encoding collagen-mimetic polymers, which contain repeating amino acid sequences, and (2) the hydroxylation of prolines in a recombinant system. To address the first challenge, our research group has developed a platform that yields genes encoding full-length, modular collagen and its variants. This approach allows specific functional amino acid sequences from the different families of collagens to be mixed and matched, and it enables non-native sequences to be introduced at defined locations, in defined combinations, and in a defined number of occurrences.
We are addressing the second challenge, recombinant expression coupled with post-translational proline hydroxylation, by designing and optimizing genetic modifications in Saccharomyces cerevisiae. Proline hydroxylation is particularly important because it imparts stability and structure to collagen. Bacterial and yeast expression systems do not natively perform this post-translational modification, however, and must be engineered to produce prolyl-4-hydroxylase (α and β subunits). Even then, these systems tend to yield biopolymers with lower levels of proline hydroxylation: 0.5% to 38% in recombinant collagens produced in S. cerevisiae [2–4] and 44.2–47.2% in the best reported P. pastoris systems [5, 6]. In comparison, fibrillar human collagens from native tissues show 42–54% hydroxylation [7, 8].
Given this wide range of possible values, we needed an accessible and facile assay that could determine the level of proline hydroxylation in future libraries of recombinant collagen and its variants. Such an assay should also use relatively small amounts of sample (picomoles), require minimal processing and derivatization, and potentially enable high-throughput scale-up. As others have noted, however, detection of 4-hydroxyproline (HYP) is particularly challenging with respect to both selectivity and sensitivity.
To address these difficulties, analytical methods for HYP often require derivatization [10–13]. In fact, the conventional method of determining the percentage of proline hydroxylation, amino acid analysis (AAA), measures the concentration of amino acid residues after derivatization with a chromogenic or fluorescent reagent, such as ninhydrin [14, 15]. However, to assay relatively small quantities (picomoles), a sensitive and expensive fluorescence detector is required on the liquid chromatography system. Protocols using radioisotopes have also been developed, but the logistics of using radioactive compounds are inconvenient if appropriate research infrastructure is not in place.
Our aim was to develop a rapid method to quantify HYP without further derivatization, using mass spectrometry (MS) instrumentation that would be accessible in most research institutions. MS protocols that require no additional chemical reaction have been reported using hydrophilic interaction chromatography (HILIC) and tandem mass spectrometry (LC-MS/MS) with multiple reaction monitoring (MRM) [9, 17]. Our method quantifies the amounts of proline (PRO) and HYP in different collagen samples using a simple, standard reversed-phase liquid chromatograph coupled to a single-analyzer time-of-flight MS (LC-MS), and it requires no sample derivatization.
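As a minimal sketch of how measured PRO and HYP amounts translate into a hydroxylation level, the percentage can be taken as the HYP fraction of total proline residues, HYP / (HYP + PRO). This definition is an assumption made here for illustration, not a formula stated in the text, and the function name and picomole inputs are hypothetical.

```python
def percent_hydroxylation(hyp_pmol: float, pro_pmol: float) -> float:
    """Return percent proline hydroxylation from measured amounts.

    Assumes (for illustration only) that percent hydroxylation is
    100 * HYP / (HYP + PRO), with both amounts in the same units
    (e.g. picomoles) after any calibration has been applied.
    """
    if hyp_pmol < 0 or pro_pmol < 0:
        raise ValueError("amounts must be non-negative")
    total = hyp_pmol + pro_pmol
    if total == 0:
        raise ValueError("no proline or hydroxyproline detected")
    return 100.0 * hyp_pmol / total


if __name__ == "__main__":
    # Hypothetical amounts, not measured data.
    print(percent_hydroxylation(hyp_pmol=45.0, pro_pmol=55.0))
```

Because only the ratio of the two amounts enters the calculation, any shared scaling factor (for example, the fraction of hydrolysate injected) cancels, provided both analytes are quantified from the same sample.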
We applied this LC-MS assay to engineered S. cerevisiae systems that we expected to yield various levels of proline hydroxylation. These yeast strains carried different collagen-to-prolyl-4-hydroxylase gene ratios on plasmid vectors. To determine the reliability of the LC-MS assay, we compared its hydroxylation results with those obtained by conventional AAA.