Genome-wide response on phytosterol in 9-hydroxyandrostenedione-producing strain of Mycobacterium sp. VKM Ac-1817D

Background Aerobic side chain degradation of phytosterols by actinobacteria is the basis for the industrial production of androstane steroids, which are the starting materials for the synthesis of steroid hormones. A native strain of Mycobacterium sp. VKM Ac-1817D effectively produces 9α-hydroxyandrost-4-ene-3,17-dione (9-OH-AD) from phytosterol, but is also capable of slow steroid core degradation. However, the set of genes whose products are involved in phytosterol oxidation, and their organisation and regulation, remain poorly understood. Results High-throughput sequencing of the global transcriptomes of Mycobacterium sp. VKM Ac-1817D cultures grown with or without phytosterol was carried out. In the presence of phytosterol, the expression of 260 genes, including those related to steroid catabolism pathways, significantly increased. Two of the five genes encoding the oxygenase subunit of 3-ketosteroid-9α-hydroxylase (kshA) were highly up-regulated in response to phytosterol (55- and 25-fold, respectively), as was one of the two genes encoding its reductase subunit (kshB) (40-fold). Only one of the five putative genes encoding 3-ketosteroid-Δ1-dehydrogenase (KstD_1) was up-regulated in the presence of phytosterol (61-fold), but several substitutions in the conserved positions of its product were revealed. Among the genes over-expressed in the presence of phytosterol, several dozen did not possess binding sites for the known regulatory factors of steroid catabolism. In the promoter regions of these genes, a regularly occurring palindromic motif was revealed. The orthologue of the TetR-family transcription regulator gene Rv0767c of M. tuberculosis was identified in Mycobacterium sp. VKM Ac-1817D as G155_05115.
Conclusions High expression levels of the genes related to sterol side chain degradation and steroid 9α-hydroxylation, in combination with possible defects in KstD_1, may contribute to the effective accumulation of 9α-hydroxyandrost-4-ene-3,17-dione from phytosterol by this biotechnologically relevant strain. The TetR-family transcription regulator gene G155_05115 is presumably associated with the regulation of steroid catabolism. The results are of significance for improving the biocatalytic features of microbial strains for the steroid industry. Electronic supplementary material The online version of this article (10.1186/s12896-019-0533-7) contains supplementary material, which is available to authorized users.

Steroid-degrading bacteria are widespread in the environment and play an important role in the global carbon cycle [4]. Bacterial sterol degradation comprises cascades of reactions that can be conventionally divided according to the part of the steroid molecule involved: degradation of the aliphatic side chain, of rings A/B and of rings C/D (Fig. 1). Dozens of enzymes are involved in this process. The groups of putative genes engaged in sterol degradation have been characterised in the genomes of several strains, such as Mycobacterium neoaurum VKM Ac-1815D, Gordonia neofelifaecis NRRL B-59395, M. tuberculosis H37Rv, Rhodococcus jostii RHA1, M. neoaurum NRRL 3805B, Nocardioides simplex VKM Ac-2033D and M. smegmatis mc²155 [2,5-9]. It has been proposed that steroid catabolism pathways are conserved in certain Actinobacteria taxa [10].
The subsequent degradation of the sterol side chain is generally similar to fatty acid β-oxidation. CoA is attached to the carboxyl group of cholestenoate by the acyl-CoA synthetase encoded by fadD19 [18]. The resulting CoA-thioester is then subjected to three successive cycles of β-oxidation to complete the degradation of the side chain, resulting in the production of C17-keto androstanes [14,19,20].
Further degradation of rings A/B is catalysed by the enzymes encoded by hsaA, hsaB, hsaC, hsaD, hsaE, hsaF and hsaG [25-27]. Most of the genes coding for the enzymes of side chain and A/B ring degradation are regulated by the KstR transcription factor [28].
As has been shown for several actinobacterial strains, side chain and rings A/B degradation may occur simultaneously [21,22,29]. Moreover, some cholesterol ring-degrading enzymes were shown to exhibit much higher activities towards side-chain degradation intermediates than towards the corresponding C-17 ketosteroids with a fully degraded side chain; e.g. steroid CoA-thioesters represent physiological substrates for Ksh of M. tuberculosis [21]. These results were further confirmed for M. neoaurum ATCC 25795: greater activities of Ksh (KshA1, KshA2) were demonstrated in vitro towards 23,24-bisnorchol-1,4-diene-22-oic acid (1,4-BNC) than towards AD [24,30]. This is consistent with the structural features of KshA (for review, see [31]) [21,32] and with the transcriptional changes of kshA1 and kshA2: unlike cholesterol, AD was shown to be a poor inducer for both enzymes in M. neoaurum ATCC 25795 [30]. At the same time, KshAs demonstrate lower activity towards cholestenone than towards C19-steroids such as AD, ADD, or testosterone [30,33].
The KstD enzyme may also be involved in different stages of steroid catabolism; in particular, KstD activity has been identified for both AD and steroids with a partially oxidised side chain, such as 22-hydroxy-23,24-bisnorchola-4-en-3-one [34].
The C/D rings of the steroid core are degraded by the products of the genes regulated by KstR2. KstR2 is a transcription repressor which, like KstR, belongs to the TetR family. As previously shown, the effector of KstR2 is HIP-CoA, a thioester of the cholesterol metabolite HIP (3aα-H-4α(3-propanoate)-7aβ-methylhexahydro-1,5-indanedione) [35]. The KstR2 regulon in M. smegmatis mc²155 was reported to consist of 15 genes (MSMEG_5999-MSMEG_6004, MSMEG_6008, MSMEG_6009, MSMEG_6011-MSMEG_6017), including kstR2 itself (MSMEG_6009) [36]. For many genes of the KstR2 regulon, participation in HIP catabolism has already been proven [37][38][39], while the functions of some genes of this regulon have not yet been clarified.
Whole transcriptome analysis is increasingly applied in sterol biodegradation research in order to reveal the complete set of genes whose expression is induced in the presence of sterols, such as cholesterol or phytosterols [2,9,40,41]. This approach allows the identification of genes whose products are directly or indirectly involved in the sterol degradation pathways, including genes not previously known to be related to sterol catabolism. Moreover, this method makes it possible to determine how the specific genes are grouped within the genome, as well as to assess the features of their regulation.
One of the first investigations to exploit the whole-genome approach showed that 572 genes in R. jostii RHA1 increased their expression more than 2-fold in the presence of cholesterol. These genes were grouped into 6 clusters, but only two of them included the known genes related to sterol catabolism [2]. In M. smegmatis mc²155, the expression of 89 genes increased more than 3-fold in response to cholesterol [9]. Most of these genes were grouped into 3 clusters including the genes involved in sterol catabolism. A study of the global transcriptome of M. smegmatis mc²155 revealed 454 genes whose expression increased in the presence of cholesterol as compared to the control. Eleven and sixteen gene clusters were induced by cholesterol when compared with glycerol or androstenedione, respectively [40]. However, in spite of the growing body of research on bacterial steroid degradation, many aspects of the regulation of the catabolic pathways remain unknown, especially those related to phytosterol oxidation by saprotrophic fast-growing mycobacteria, which were recently proposed to be reclassified as Mycolicibacterium in accordance with modern taxonomy [42].
Unlike other cholesterol-degrading or phytosterol-transforming mycobacteria such as M. tuberculosis, M. smegmatis mc²155, M. neoaurum VKM Ac-1815D, VKM Ac-1816D, NRRL 3805B or NRRL 3683B, the saprotrophic fast-growing wild-type strain Mycobacterium sp. VKM Ac-1817D effectively produces 9-OH-AD from phytosterol [5,23]. The identification of side-chain degradation intermediates showed the presence of a 9α-hydroxyl function in their structures, thus suggesting the action of Ksh at the early stages of phytosterol degradation [23]. However, effective 9-OH-AD accumulation by this strain was shown to be accompanied by slow steroid degradation due to residual 3-ketosteroid-Δ1-dehydrogenase activity [43].
Earlier, we carried out a comparative study of the sequences of steroid catabolism genes at the whole-genome scale in Mycobacterium sp. VKM Ac-1817D (Myc 1817) and two M. neoaurum strains, namely VKM Ac-1815D and 1816D, which convert phytosterols to AD and ADD, respectively [5]. The complete genome of Myc 1817 was later assembled [44]. In comparison with M. neoaurum, Myc 1817 possesses a larger genome (6.35 Mbp) containing more homologues of genes with a putative role in sterol catabolism, including several copies of kshA, kshB and kstD. Based on 16S rRNA phylogeny, Myc 1817 is closely related to M. gilvum and M. smegmatis [5]. These data, in combination with the specific catalytic features of Myc 1817, mainly its ability to 9α-hydroxylate all the intermediates of sterol side chain degradation and to accumulate 9-OH-AD as a major product, allowed us to predict differences in the induction pattern.
In this study, we assessed the genome-wide response to phytosterol in Mycobacterium sp. VKM Ac-1817D based on RNA sequencing and whole-transcriptome analysis in order to understand the peculiar properties of this organism.
The harvesting point for the phytosterol-induced cells was chosen on the basis of high specific activity of phytosterol oxidation, which suggests a maximum level of expression of the specific transcripts. During the initial 9 h, the conversion rate considerably increased and then stabilised at the level of 90-100 μmol/l h g (d.c.w.) until depletion of phytosterol after approx. 30 h. The induced culture was harvested for RNA isolation at the age of 18-21 h.

Transcriptome sequencing
We studied the differential expression of Myc 1817 genes during its growth in medium with phytosterol in comparison with the control medium without any steroids. Statistical parameters of the sequencing results are given in Additional file 1.
For the analysis, we selected the genes whose expression increased by no less than 3-fold at a q-value (false discovery rate) ≤ 0.01. In total, 260 genes met these criteria. The full list of these genes is presented in Additional file 2.
Sterol-induced genes were distributed irregularly within the Myc 1817 genome: they formed some large clusters, as well as a number of relatively small groups. The distribution of the genes which increased their expression in the presence of phytosterol is shown in Fig. 4.

Real-time PCR
For the expression analysis with real-time PCR, the following single-copy genes with known functions were chosen: (i) ltp3 and fadE29/chsE2, related to cholesterol side-chain cleavage [17,20]; (ii) kstD and kshB, encoding the enzymes responsible for the key steps of steroid core degradation [22,32]; (iii) fadD3 and kstR2, which belong to the KstR2 regulon and are related to the C/D ring degradation pathway [35]. The three housekeeping genes rpoB, rpoD and ftsQ, known for insignificant expression deviations in transcriptomic studies [45], were used as the reference genes. The results of qRT-PCR qualitatively coincided with the transcriptomic data. The expression of all six genes significantly increased according to both the transcriptomic and qRT-PCR results (Table 1). Both methods showed similar quantitative results for 4 genes, while the difference was 2-fold for fadE29/chsE2, and qRT-PCR indicated 5.9-fold higher up-regulation for kstR2. The latter may be attributed to the rather unstable expression of the KstR2-regulon genes under the induction conditions, since its putative effector is HIP, but not the early sterol intermediates.

Identification of transcriptional factors binding sites
Putative KstR binding sites were detected for 57 operons, including 43 operons with genes whose expression increased in the presence of phytosterol. Most of the genes related to steroid catabolism were part of these operons.
Putative KstR2 binding sites were detected for 8 operons. The genes in all of them were up-regulated on phytosterol. Along with 15 orthologues of the genes putatively involved in rings C/D degradation [36], we also found an additional inducible operon that included the gene of the amino hydrolase family G155_10290 with a putative KstR2-binding site in the promoter (Additional file 3). However, this site was found by only one of the two programs used (UGENE), and its score only slightly exceeded the threshold value.
Among the genes over-expressed in the presence of phytosterol, we found several dozen genes without binding sites for KstR and KstR2. Among them, some orthologues of the known steroid catabolism genes were identified. In the promoters of these genes, a regularly occurring palindromic motif (motif X) was observed (Fig. 5). This motif was very similar to the binding site of the M. tuberculosis transcription factor Rv0767c [46], with a similarity q-value of 2 × 10⁻¹⁰ (Fig. 6). This means that the similarity was most likely not accidental. The orthologue of Rv0767c in Myc 1817 is the TetR-family transcription regulator gene G155_05115. There was also a sequence corresponding to motif X near the start codon of G155_05115. This may be a sign of autoregulation, further suggesting that motif X indeed belongs to this transcription factor.
We identified 14 operons that are adjacent to motif X and presumably regulated by G155_05115 (Additional file 3). The expression of the genes in 12 of these operons increased more than 3-fold in response to phytosterol, and the expression of one operon (G155_19665-G155_19675) rose 2.7-fold, on average. The operons with an adjacent site corresponding to motif X included the genes of a putative steroid Δ-isomerase (G155_05080), some cytochromes, alcohol and aldehyde dehydrogenases, short chain dehydrogenases and TetR-family transcriptional regulators: G155_05115 itself, G155_10530, and G155_29025. Among the genes putatively encoding cytochromes, four different genes (namely, G155_05095, G155_05100, G155_05110 and G155_26175) presumably code for cytochrome P450 monooxygenases. Interestingly, in two cases the sites corresponding to motif X were found together with KstR binding sites. For example, the binding sites for both KstR and G155_05115 were found near the promoter of the gene G155_26175. The enzymes that are putatively under the regulation of the protein encoded by G155_05115 through motif X may therefore play a role in steroid catabolism.

Steroid catabolism genes up-regulated in the presence of phytosterol
Among the phytosterol-inducible genes of Myc 1817, genes were found whose orthologues were proved, or assumed, to have a function in sterol catabolism (Table 2). These are the genes putatively encoding the side chain oxidation enzymes, including the starting oxygenases, as well as the genes involved in β-oxidation (encoding acyl-CoA synthases, enoyl-CoA hydratases, acyl-CoA dehydrogenases, aldol lyases, acetyl-CoA acetyltransferases and homologs of igr-operon genes), the genes related to steroid core destruction, uptake of steroids (mce4 operon) and the known regulators of steroid catabolism (Table 2) [2,9,40,41]. The genes related to side chain oxidation significantly (50- to 136-fold) increased their expression (Table 2). As described above, Ksh and KstD are the key enzymes of rings A/B degradation. In our previous study, five genes homologous to the oxygenase subunit of 3-ketosteroid-9α-hydroxylase (kshA) were revealed in Myc 1817 [5]. In this work, we found that two of them are phytosterol-induced: the expression of kshA_1 (G155_04755) and kshA_2 (G155_24375) increased 55-fold and 25-fold, respectively (Table 2).
Among the five putative kstD genes encoding 3-ketosteroid Δ1-dehydrogenases, only kstD_1 (G155_04625) was over-expressed (61-fold) in the presence of phytosterol (Table 2). Expression of the gene G155_08770 increased 10-fold in the presence of phytosterol. This gene shows 99% similarity to the putative 3-oxo-5α-steroid Δ4-dehydrogenase (Δ4(5α)-KSTD) XA26_17580 annotated earlier in the Mycobacterium fortuitum CT6 genome (CP011269.1) [57]. It is worth noting that G155_08770 [56] does not show any similarity with the known genes of Δ4(5α)-KSTDs whose protein function has been experimentally confirmed, such as ro05698 of Rhodococcus jostii RHA1 and Rv1817 of M. tuberculosis [53]. The expression of orthologues of the genes whose products are involved in further A/B ring degradation, i.e. hsaABCD, hsaEFG and others, increased 7- to 34-fold (Table 2). Cholesterol oxidases were not identified among the phytosterol-induced genes, but the hsd gene (G155_22700) encoding 3β-hydroxysteroid dehydrogenase was over-expressed 83-fold in the presence of phytosterol. The genes of only one of at least nine mce operons were slightly (3- to 4-fold) up-regulated on phytosterol. This operon is similar to the mce4 operon of M. smegmatis mc²155, which takes part in sterol transport [56]. Interestingly, the expression of the genes of the ATPases associated with the mce operons (G155_06965 and G155_02045) did not increase in response to phytosterol in Myc 1817.
Genes related to the C/D ring degradation pathway, which is controlled by KstR2, were also up-regulated on phytosterol. In particular, the genes involved in C/D ring degradation (echA20, ipdA, ipdB, fadA6, fadD3, fadE30, fadE31, fadE32, fadE33) showed a 7- to 16-fold increase in expression, with an average of 10-fold (Table 2).

Mutations in KshAs, KshB and KstD_1
In order to evaluate whether the capability of Myc 1817 to accumulate 9-OH-AD as a major product from phytosterol is associated with the malfunctioning of any proteins, we estimated the degree of conservation of the key enzymes of A/B ring degradation. Amino acid substitutions in evolutionarily conserved positions of KstD_1, KshA, and KshB were evaluated. The known phytosterol-degrading strain M. smegmatis mc²155 was used for comparison (NCBI Reference Sequence: NC_008596.1). The results are presented in Table 3.
The number of substitutions in the most conserved positions (rank 1) in KshA_1 and KshB of Myc 1817 was comparable to that in M. smegmatis mc²155. Of particular note, KshA_2 in M. smegmatis mc²155 has fewer changes in the conserved positions than its orthologue in Myc 1817. A greater number of substitutions in the most conserved positions was observed in KstD_1 of Myc 1817 as compared with the orthologous KstD in M. smegmatis mc²155 (5 and 2, respectively) (Table 3).

Discussion
The saprotrophic, fast-growing native strain Myc 1817 is of industrial interest due to its ability to fully transform phytosterol at high loads, forming 9-OH-AD as a major product [58]. Steroid metabolite profiling of Myc 1817 grown in the presence of phytosterol showed the accumulation of 9α-hydroxylated intermediates with a partially oxidised side chain, thus confirming that 9α-hydroxylation precedes full side chain degradation by the strain (Fig. 7). The simultaneous presence of 9α-hydroxylated products with C3 and C5 side chains in the form of a carboxylic acid (9-HCBC) and alcohols (9,22-DHBC, 9,24-DHC) may indicate that its metabolic pathway differs from those known for M. smegmatis mc²155 [9,59] and M. neoaurum ATCC25795 [55], but seems to generally correspond to that suggested for M. tuberculosis [21] and some other Mycobacterium strains [23,24,33]. It had been postulated that the accumulation of C22-steroids, such as 4-BNC or 1,4-BNC, by some mycobacteria occurs due to absent or inefficient 9α-hydroxylation [59]. The accumulation of the corresponding 9α-hydroxy derivatives during phytosterol conversion by Myc 1817 suggests that 9α-hydroxylation is not critical for the elimination of the last isopropyl group of the sterol side chain. Differential transcriptome analysis showed 260 genes that were up-regulated in response to phytosterol in Myc 1817, including numerous genes of steroid catabolism. The suite of phytosterol-induced genes related to side-chain, A/B ring and C/D ring degradation was generally similar to that in other phytosterol/cholesterol-degrading mycobacteria [2,9,40]. Interestingly, the genes putatively encoding ChOs did not show increased expression in response to phytosterol, while hsd (G155_22700) was significantly (83-fold) up-regulated. This corresponds to the literature data: a similar pattern is observed in M. smegmatis [9]. Insignificant up-regulation of the hsd4A homolog, whose product may act as a 17β-hydroxysteroid dehydrogenase or β-hydroxyacyl-CoA dehydrogenase [55], is generally consistent with the small amounts of 9-HCBC and 9,22-DHBC observed among the metabolites during phytosterol conversion by Myc 1817, and allows us to propose that Hsd4A may be active towards 9α-hydroxylated steroids.
Fig. 7 Sterol catabolism and gene expression peculiarities in Mycobacterium sp. VKM Ac-1817D. R1 = H, CH3 or C2H5 (various sterols); R2 = partly degraded side chains. The thickness of arrows reflects the suggested relative level of catabolic activity.
Along with the genes involved in side chain degradation and rings A/B oxidation, the genes related to the lower degradation pathway (rings C/D degradation) were also up-regulated on phytosterol. It is clear that the strain has the ability to fully degrade the steroid core. However, the relatively high increase in expression of the rings C/D degradation genes was somewhat surprising. These genes belong to the KstR2 regulon, which is induced by HIP-CoA, the compound formed after degradation of rings A and B [37]. Since Myc 1817 mainly accumulates 9-OH-AD, the formation of HIP-CoA was expected to occur much more slowly than in sterol-degrading strains, which would predict attenuated induction of the KstR2-regulon genes.
Effective 9-OH-AD accumulation is generally determined by high 3-ketosteroid-9α-hydroxylase (Ksh) and low (or fully blocked) 3-ketosteroid-Δ1-dehydrogenase (KstD) activities. In Myc 1817, two genes of the oxygenase subunit (kshA_1 and kshA_2) and one gene of the reductase subunit (kshB_1) of Ksh significantly (by dozens of times) increased their expression in the presence of phytosterol, while among the five putative kstDs, only kstD_1 was phytosterol-inducible. The mutational analysis showed, however, that the product of this gene contained a significant number of amino acid substitutions in the conserved positions, thus allowing us to propose either the absence of activity or low levels of activity. The four remaining kstDs, which were not up-regulated on phytosterol, may nevertheless be expressed constitutively, thus contributing to the observed steroid core oxidation by the strain. The increased expression of the KstR2-regulon genes probably also occurs through the activity of these 3-ketosteroid-Δ1-dehydrogenases.
A new putative transcriptional regulator, G155_05115, involved in the control of steroid catabolism was identified in Myc 1817, while most of the genes putatively controlled by this factor did not increase their expression in M. smegmatis mc²155 and R. jostii RHA1 [2,9]. In Myc 1817, many genes presumably related to the oxidation of the sterol side chain were among the putative regulon of this factor.
It is known that, in addition to the genes involved in the catabolism of cholesterol and sitosterol, actinobacteria usually have clusters of genes involved in the catabolism of other steroid compounds and, accordingly, regulated by other transcription factors. In particular, there is a cluster of cholate catabolism genes [60], and the so-called C-19 cluster, whose genes are involved in the catabolism of as yet unknown steroid compounds [61].
Because phytosterol contains β-sitosterol and other plant sterols which differ from cholesterol by branched side chains, it is reasonable to assume that the genes putatively regulated by G155_05115 may play a role in the oxidation of the side chains of the plant sterols of the phytosterol. However, there is a significant number of genes whose relationship with the side chain oxidation, or steroid catabolism remains uncertain in the putative regulon of G155_05115. This indicates that the function of G155_05115-regulon apparently is not limited to the oxidation of the sterols side chain.
Generation of engineered mycobacterial strains producing valuable steroids is often complicated by the multiplicity of homologs of steroid catabolism genes [13,47,52,62]. Moreover, various genes capable of performing the same function may have different roles in the catabolism of sterols, as demonstrated for kstDs, hsd4A, hsd, choD and others [13,19,34,55,63]. On the other hand, the known steroid core degradation genes are present in the genomes of different mycobacterial strains, including those that have different metabolic blocks and accumulate specific intermediates of the phytosterol degradation process. Knowledge of gene expression regulation in response to phytosterol allows a role for the specific homologous genes in sterol catabolism to be predicted.

Microorganism and cultivation
The strain Mycobacterium sp. VKM Ac-1817D was obtained from the All-Russian Collection of Microorganisms (VKM IBPM RAS) and pre-cultured as described earlier [64]. The strain was cultured in glycerol-mineral (control) medium [64] and in the same medium supplemented with 12 mmol phytosterol (induction medium). Both media contained 24.1 mM MCD. Phytosterol powder and MCD were added to the medium before autoclaving; the sterilized media were sonicated for 2 min in an ultrasonic bath (100 W, 35 kHz) and incubated on an orbital shaker (200 rpm, 30°C) overnight before inoculation. Experiments were carried out in two independent biological replicates in 750-ml shake flasks containing 100 ml of medium.

Growth estimations
The growth was followed gravimetrically [65] due to high cell-to-cell and cell-to-phytosterol aggregation. Briefly, the samples of the cultivation broth were sedimented by centrifugation at 6500×g for 15 min, then the cakes were washed twice with 40 ml of 10% (w/v) aqueous MCD for phytosterol removal and then twice with 40 ml of distilled water. The washed cells were dried at 70°C. For viable cell counts, broth samples were serially diluted with 1 g/l aqueous Tween 80 under vigorous agitation and plated on the solid minimal medium. The growth experiments were carried out in three replicates.

Analytical methods
Phytosterol and steroid metabolites were analysed as described earlier [65] by isocratic reversed-phase HPLC using a Waters Symmetry (USA) 250 mm × 4.6 mm (5 μm) column at 50°C and a flow rate of 1 ml/min. For phytosterol analysis, the samples were diluted with a mixture of 2-propanol and acetonitrile (45:50, v/v); the analysis was performed using 2-propanol:acetonitrile:deionised water (45:50:5, v/v) as a mobile phase with detection at 200 nm. Total areas of peaks relevant to plant sterols (6-12 min) were used for phytosterol quantification.
For steroid metabolite analysis, the samples were diluted 1:50 with 50% (v/v) aqueous acetonitrile; the analysis was performed using acetonitrile:water:acetic acid (52:48:0.01, v/v) as a mobile phase with detection at 240 nm.

Isolation of mRNA
The cells were harvested by centrifugation at 8000×g for 10 min and immediately ground in a porcelain mortar under liquid nitrogen. Total RNA was isolated with the Qiagen RNeasy mini kit (Qiagen, Netherlands), DNAse I and the Ribo-Zero rRNA Removal Kit (Epicentre, USA) according to the protocols of the suppliers.

High-throughput sequencing
We used the TruSeq RNA Sample prep kit v.2 (Illumina, USA) for sample preparation of mRNA for high-throughput sequencing. Sequencing was performed with HiSeq 2000 (50-nucleotide single-read run) according to the protocols of the manufacturer (Illumina, USA).

Accession numbers
The reads have been deposited in the NCBI Sequence Read Archive (SRA) under the accession numbers SAMN05941438 and SAMN05941440.

Transcriptome analysis
For read mapping and analysis of differential expression, Rockhopper 2.03 was used [66]. A gene was considered differentially expressed between the conditions with and without phytosterol if its expression increased or decreased more than threefold with a q-value less than or equal to 0.01.
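The selection rule above can be illustrated with a minimal Python sketch. It is not Rockhopper's own code: the record layout, the pseudocount and the function name are assumptions made for illustration, and the fold threshold is treated as inclusive, in line with the "no less than 3-fold" wording used in the Results.

```python
from dataclasses import dataclass


@dataclass
class GeneExpression:
    gene_id: str             # hypothetical identifier, not a real locus tag
    expr_control: float      # normalised expression without phytosterol
    expr_phytosterol: float  # normalised expression with phytosterol
    q_value: float           # false discovery rate reported by the DE tool


def classify(gene, min_fold=3.0, max_q=0.01, pseudocount=1.0):
    """Return 'up', 'down' or 'unchanged' for one gene.

    The pseudocount is an assumption that avoids division by zero for
    genes that are silent in one condition.
    """
    if gene.q_value > max_q:
        return "unchanged"
    fold = (gene.expr_phytosterol + pseudocount) / (gene.expr_control + pseudocount)
    if fold >= min_fold:
        return "up"
    if fold <= 1.0 / min_fold:
        return "down"
    return "unchanged"
```

Up-regulated and down-regulated genes are kept as separate classes because the downstream motif search treats the two sets separately.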

Genome annotation
The genome was annotated with RAST (http://rast.theseed.org/FIG/rast.cgi) and PGAP [67]. To find putative genes encoding the enzymes of steroid catabolism and to analyse orthologous relations between the genes of Myc 1817 and other actinobacteria, we used orthogroups between the genes of Myc 1817 and other species of actinobacteria, including their plasmids. The orthogroups were constructed by OrthoMCL 2.09 with an inflation parameter of 1.5; all other parameters were set to the default values.

Analysis of transcription factors and binding sites
In order to find binding sites (BSs) of transcription factors which may regulate steroid metabolism, we analysed regions of 500 bp upstream plus 50 bp downstream with respect to the start codons of the genes that changed their expression more than threefold with a q-value less than or equal to 0.01. The analysis was performed for the up-regulated and down-regulated genes separately. We searched for over-represented motifs in these regions using MEME 4.10 [68]. Motifs were allowed to be from 8 to 50 bp long. The 20 top-scoring motifs in the form of position-weight matrices (PWMs) were compared with the known motifs of the mycobacterial steroid metabolism regulators KstR [28] and KstR2 [36]. The motifs which corresponded to KstR and KstR2 were identified in the lists obtained by MEME. Then, possible motifs of other putative transcription factors were sought among the remaining motifs in the lists.
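How such a window around a start codon can be cut out of a genome sequence is sketched below in Python. This is an illustration only: the function names, the 0-based coordinate convention and the decision to clip at the sequence ends (rather than wrap around a circular chromosome) are assumptions, not details taken from the study.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def promoter_sequence(genome, start_pos, strand, upstream=500, downstream=50):
    """Return the window around a start codon, read in the gene's orientation.

    start_pos is the 0-based forward-strand index of the first base of the
    start codon (for a '-' strand gene this is the highest coordinate of
    the gene). The window is clipped at the sequence ends.
    """
    if strand == "+":
        begin, end = start_pos - upstream, start_pos + downstream
    elif strand == "-":
        begin, end = start_pos - downstream + 1, start_pos + upstream + 1
    else:
        raise ValueError("strand must be '+' or '-'")
    begin, end = max(begin, 0), min(end, len(genome))
    region = genome[begin:end]
    return region if strand == "+" else reverse_complement(region)
```

Returning the minus-strand window as its reverse complement means every extracted region reads from upstream to downstream of its own gene, which is the orientation a motif finder expects.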
The criteria for a motif to be considered a putative transcription factor motif were as follows: 1) the motif must be statistically significant, with an E-value ≤ 10⁻⁵; 2) the motif must contain a palindrome (palindromic motifs are typical for bacterial transcription factors); and 3) the motif should not be a simple repeat, such as CGCGCGCG…, which is unusual for bacterial transcription factor binding motifs [69]. A single motif that conformed to these criteria was found in the set of genes that increased their expression and is hereafter referred to as "motif X".
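The three criteria can be expressed as a small Python filter over a motif's consensus sequence. This is a hedged sketch rather than the procedure actually used: the minimum palindrome length of 8, the maximum repeat unit of 3 and all function names are illustrative assumptions, and a DNA "palindrome" is taken to mean a stretch equal to its own reverse complement.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def contains_palindrome(seq, min_len=8, max_mismatches=0):
    """True if some window of at least min_len bases matches its reverse complement."""
    seq = seq.upper()
    for length in range(len(seq), min_len - 1, -1):
        for i in range(len(seq) - length + 1):
            window = seq[i:i + length]
            mismatches = sum(a != b for a, b in zip(window, reverse_complement(window)))
            if mismatches <= max_mismatches:
                return True
    return False


def is_simple_repeat(seq, max_unit=3):
    """True if the whole sequence is a repetition of a unit of 1..max_unit bases."""
    seq = seq.upper()
    for unit in range(1, max_unit + 1):
        if len(seq) >= 2 * unit and all(seq[i] == seq[i % unit] for i in range(len(seq))):
            return True
    return False


def passes_motif_criteria(consensus, e_value, max_e_value=1e-5):
    """Apply the significance, palindrome and simple-repeat criteria together."""
    return (e_value <= max_e_value
            and contains_palindrome(consensus)
            and not is_simple_repeat(consensus))
```

Note that CGCGCGCG is itself equal to its reverse complement, so the palindrome test alone would accept it; this is why the simple-repeat exclusion is needed as a separate criterion.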
To compare this motif with the known motifs of transcription factors from the TetR family, we replicated the methodology used in [46] to determine the motifs of various TetR-family proteins in mycobacteria. The only change in the methodology was that, besides the 10 species used in that work, we added Myc 1817 and another species of interest, M. neoaurum Ac-1815D, to the analysis. After determining the putative motifs of the transcription factors, motif X was compared with them by TOMTOM [70], a dedicated motif comparison tool from the MEME suite. Then, to find all sites of KstR, KstR2 and the transcription factor of motif X in the Myc 1817 genome, we scanned the regions 500 bp upstream plus 50 bp downstream with respect to the start codons of all genes using the KstR and KstR2 motifs and the motif X found in the previous step. The scan was performed with the FIMO tool from the MEME suite [71]. The sites determined with a false discovery rate less than or equal to 0.01 (q-value, estimated by FIMO using the Benjamini-Hochberg technique) were considered putative binding sites. For comparison, we also scanned the same regions with the same position weight matrices using UGENE 1.13.1 [72], considering a site a putative binding site of a transcription factor if its score was no less than 85% of the maximum possible score for the motif of that factor (85% is the default value in UGENE). The binding sites that were present in the results obtained with both tools (or in the results obtained with one tool, but located before up-regulated operons) were used in further analyses. The use of two tools allowed the number of false-positive predictions to be reduced. Detection of a binding site before the first gene of a putative operon predicted by Rockhopper was considered in the analyses a sign that this binding site regulates all genes in the operon.
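For readers unfamiliar with the Benjamini-Hochberg technique mentioned above, the following Python sketch shows the generic step-up adjustment that turns a list of p-values into q-values. It is a textbook formulation, not FIMO's implementation, and the function name is an assumption.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg q-values in the same order as the input p-values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that each q-value is
    # the minimum of p * m / rank over all ranks at or above its own.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


def putative_sites(p_values, max_q=0.01):
    """Indices of candidate sites whose q-value does not exceed max_q."""
    q_values = benjamini_hochberg(p_values)
    return [i for i, q in enumerate(q_values) if q <= max_q]
```

With thousands of candidate positions scanned per motif, thresholding the adjusted q-value instead of the raw p-value keeps the expected proportion of false sites among the reported ones at or below the chosen level.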
The amino acid sequences of the transcription factor G155_05115 from Myc 1817 and its orthologue Rv0767c from M. tuberculosis H37Rv (GenBank accession number CCP43514) were aligned by EMBOSS Needle 6.6.0 with the default parameters [73].
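The idea behind a global pairwise alignment of the kind produced by EMBOSS Needle can be sketched with the Needleman-Wunsch recurrence. The Python version below is deliberately simplified: it uses a flat match/mismatch score and a linear gap penalty (both assumptions chosen for brevity) instead of a substitution matrix with affine gap penalties, and it returns only the optimal score, so it should not be expected to reproduce the tool's output.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal global alignment score of sequences a and b (linear gap penalty)."""
    # prev[j] holds the best score for aligning the processed prefix of a
    # with the first j characters of b.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diagonal, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]
```

Because the alignment is global, every residue of both sequences is forced into the alignment, which suits a comparison of two full-length orthologous proteins.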

Real-time PCR
cDNA synthesis was performed using the MMLV reverse transcription kit (Evrogen, Russian Federation) with 0.5 μg of total RNA in accordance with the manufacturer's instructions. A real-time PCR was carried out using AriaMx Real-time PCR system (Agilent, USA) with Eva Green I M-439 kit (Syntol, Russian Federation). The nucleotide sequences of the primers used in this study for the target and reference genes are listed in Additional file 4. Each sample was run twice and the experiment was performed in duplicate. The amplification was performed as follows: 95°C for 5 min (1 cycle), 95°C for 10 s, and 60°C for 30 s (40 cycles). Gene expression levels were calculated using the ddCq method [74].
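The ddCq calculation can be written out as a short Python sketch. It assumes 100% amplification efficiency (a doubling per cycle) and normalises to the arithmetic mean Cq of the reference genes; both choices, as well as the function names and the Cq values in the usage note, are illustrative assumptions rather than details reported here.

```python
from statistics import mean


def delta_cq(target_cq, reference_cqs):
    """Cq of the target gene relative to the mean Cq of the reference genes."""
    return target_cq - mean(reference_cqs)


def fold_change_ddcq(target_induced, refs_induced, target_control, refs_control):
    """Relative expression (induced vs control) by the ddCq method.

    Assumes perfect doubling per cycle, i.e. fold change = 2 ** (-ddCq).
    """
    ddcq = delta_cq(target_induced, refs_induced) - delta_cq(target_control, refs_control)
    return 2.0 ** (-ddcq)
```

For example, with hypothetical values, a target that reaches threshold at cycle 20 in the induced culture and at cycle 25 in the control, with all three reference genes at cycle 18 in both, gives `fold_change_ddcq(20, [18, 18, 18], 25, [18, 18, 18])`, a 32-fold increase.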

Search of mutations in kshA, kshB and kstD genes
To evaluate amino acid substitutions in the proteins KshA_1, KshA_2, KshB and KstD of Myc 1817 and their orthologous proteins in M. smegmatis mc²155, we used the ConSurf server [75]. For each protein under analysis, ConSurf looks for its homologs in 150 close species, performs a multiple alignment of their amino acid sequences, and then estimates amino acid conservation for each position. The positions are ranked by ConSurf from the most conserved (rank 1) to the least conserved (rank 9). The most phylogenetically conserved positions in a protein are likely to be crucial for its functioning.
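Once per-position conservation ranks are available, counting substitutions in each rank class is straightforward. The Python sketch below assumes, purely for illustration, that a "substitution" is a position where the query residue differs from the most common residue in the alignment column, that the three inputs are already aligned position by position, and that gaps are ignored; the function name and input layout are hypothetical and do not describe ConSurf's output format.

```python
from collections import Counter


def count_substitutions_by_rank(query, consensus, ranks):
    """Count query residues that differ from the consensus, grouped by rank.

    query and consensus are aligned strings of equal length ('-' marks a gap);
    ranks gives the conservation rank of each position (1 = most conserved,
    9 = least conserved, following the convention used in the text).
    """
    if not (len(query) == len(consensus) == len(ranks)):
        raise ValueError("query, consensus and ranks must have the same length")
    counts = Counter()
    for residue, expected, rank in zip(query, consensus, ranks):
        if residue != "-" and expected != "-" and residue != expected:
            counts[rank] += 1
    return counts
```

The value reported for the most conserved positions then corresponds to `counts[1]` for each protein.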

Conclusions
High expression levels of the genes related to sterol side chain degradation and steroid 9α-hydroxylation, in combination with possible defects in KstD_1, may contribute to the effective accumulation of 9α-hydroxyandrost-4-ene-3,17-dione from phytosterol by this biotechnologically relevant strain. The TetR-family transcription regulator gene G155_05115 is presumably associated with the regulation of steroid catabolism.