Reductive Evolution in Streptococcus Agalactiae, Emergence of a Host Adapted Lineage17 June 2013
The Gram positive bacteria Streptococcus agalactiae (GBS) is a cause of severe epidemics in fish farms. To decipher the genetic basis for the emergence of these highly virulent GBS strains and of their adaptation to fish, Isabelle Rosinski-Chupin et al have analysed the genomic sequence of seven strains isolated from fish and other poikilotherms.
Comparative genomics of strains belonging to species with a large spectrum of hosts, such as Staphylococcus aureus, have highlighted two main evolutionary trends linked to the adaptation to a new host: acquisition of new functions through lateral gene transfer facilitating colonization of new niches and gene loss associated with the host specialization. These trends were also recognized when considering the emergence of highly virulent host-specialized pathogens from bacterial species with a broader host range. For instance massive gene losses were associated to the emergence of human or equine pathogens such as Salmonella enterica sv Typhi and Paratyphi, Bordetella pertussis and parapertussishom and Burkholderia mallei that respectively derive from Salmonella Typhimurium, Bordetella bronchiseptica and Burkholderia pseudomallei. During these transitions, it was postulated that gene inactivation and deletions were probably favored by genetic drift and evolutionary bottlenecks. In line with this model the host-specialized pathogens often showed a much higher number of insertions sequences (IS) than their parental strains and IS expansion was proposed to be largely responsible for gene deletions and genome rearrangements observed in these species.
Streptococcus agalactiae also referred to as Group B streptococcus (GBS) is a Gram-positive bacterium that has emerged as a leading cause of neonatal infections during the sixties and represents an increasing cause of infections in the elderly and in adults with underlying diseases. As a commensal it colonizes the digestive and genitourinary tracts of up to 30% of the human adult population. However, S. agalactiae was initially described as an animal pathogen causing mastitis in ruminant. Since the 70’s, S. agalactiae was found to be responsible for epidemic events of invasive diseases in fish farms, leading to a mortality of up to 30%. Cases of infection were also reported for other aquatic poikilotherms such as frogs and aquatic mammals such as dolphins. How GBS is able to adapt to its different hosts remains poorly understood. The genetic diversity of GBS populations has been studied using different methods including multilocus sequence typing (MLST), which led to the recognition of different clonal complexes (CC). Some of these clonal complexes display host preference. For instance, CC67 is essentially associated with the bovine host and the hypervirulent sequence type (ST) 17 strains are mainly isolated from humans. However incidentally strains belonging to human-associated clonal complexes are also isolated from bovines suggesting that relationships between clonal complexes and host specificity are not so strict. Further analysis of the complete genome sequences of eight isolates of human origin and one of bovine origin has highlighted the composite organization of S. agalactiae genomes with a conserved backbone (representing the core genome of the species) and a dispensable genome composed of genomic islands that are highly variable between the different strains.
The ST261 strain 2-22 (or ATCC 51487) was initially isolated as responsible for several epidemics in fish farms in Israel. This strain, which showed a restricted metabolic pattern, thermosensitivity and lack of ß hemolytic and CAMP activities was first classified as a different species, Streptococcus difficile, but proved later to be a genuine serotype Ib S. agalactiae strain. S. agalactiae strains were repeatedly isolated from fish infections and found to cluster into two main groups. The first group corresponds to strains belonging to the clonal complex 7, also displaying strains isolated from human and bovine hosts. The other strains share two or more common MLST alleles with strain 2-22 and are classified in ST246, 257, 259, 260, 552 and 553. As ST261 strains, these STs were until now never isolated from humans. In addition, a third group of strains belonging to clonal complex 283 has recently been described in fish and humans. The draft genome sequences of one ST260 strain, strain STIR-CD-17, and two ST7 strains, strain ZQ0910 and GD201008-001, respectively isolated from disease outbreaks affecting farmed tilapia in Honduras and Nile Tilapia in China were recently published.
To decipher the phylogenomic relationships between S. agalactiae strains isolated from fish or other poikilotherm animals and strains isolated from human or bovine, we analyzed the genomic sequence of seven isolates belonging to ST260-261 and ST6-7, including the strain 2-22. Comparative analysis confirmed that the two groups of GBS strains responsible for fish epidemic diseases are distantly related. We found that adaptation to fish does not involve any specific function compared to human CC7 isolates. Conversely, specialization to the fish host of the ST260-261 strains was associated with massive gene inactivation and deep remodeling of metabolic and regulatory networks that we also characterized at the transcriptome level. This genome reduction likely occurred through RecA independent recombination.
Results and Discussion
Fish ST7 Strains are Closely Related to Human Strains
We first compared the gene content of S. agalactiae strains isolated from fish but grouped into the same CC as the human strain A909 by sequencing the genome of strains CF01173 and SS1014, isolated in USA and UK respectively. Whole genome sequence comparison showed that strain CF01173 differed from strain A909 by only 389 SNPs. CF01173 was also closely related to the recently described strains ZQ0910 and GD201008-001 isolated from diseased fish in China, which differ by only 105 and 100 SNPs respectively. In contrast the ST6 strain SS1014 was more distant (3484 SNPs), and proved to be closer to the ST6 strain H36B (689 SNPs) isolated from human. Analysis of SNP distributions along the genome sequence showed a uniform distribution when strains of the same ST were compared, with a mean polymorphism of 0.1-0.2 SNP per 1000 nt except in the sequence of an inserted prophage. In contrast alignment of ST6 versus ST7 strains revealed a mosaic pattern of regions of low polymorphism (0.1-0.2 SNP per 1000 nt) interrupted by several regions of higher polymorphism (5-15 SNP per 1000 nt on average) that were probably gained by recombination with distantly related GBS strains, as previously suggested. One of these regions corresponds to the capsule locus that encodes a serotype Ia capsule in ST7 strains and a serotype Ib capsule in ST6 strains. Therefore ST6 and ST7 strains probably shared a common ancestor and recently diverged by recombination with other GBS strains, modifying the capsular serotype between both ST.
Gene content was similar between strains CF01173 and A909 except for 13 genes that were disrupted in CF01173 and five in A909. Seven genes were specifically disrupted in strain SS1014. In addition, we found that the three ST7 strains CF01173, ZQ0910 and GD201008-001 isolated from fish shared one short genomic island that was absent from other GBS and probably resulted from lateral gene transfer. This genomic island encodes proteins 70-95% identical with proteins of Streptococcus anginosus, including a protein with a LPXTG-motif cell-wall anchor domain (GBS1173_1788). Sharing of this island in addition to the low number of SNPs between strains CF01173, ZQ0910 and GD201008-001 suggests that the three strains have a common recent ancestor that propagated worldwide. In contrast, emergence of the ST6 strain SS1014 can be considered as an independent event.
Overall, the low level of polymorphism with human strain indicates that CC7 strains infecting fish recently diverged from strains isolated in humans and bovines. Since at least two independent events of emergence were observed, this suggests that CC7 strains might be more amenable to fish colonization/infection than other GBS clonal complexes isolated from humans or bovines. In agreement with this hypothesis, it was recently shown that a S. agalactiae ST7 strain isolated from human was able to cause disease in Nile tilapia.
ST260-261 Strains form an Independent Lineage which Underwent Reductive Evolution
To explore the phylogenomics relationships between the second group of strains isolated from fish, and CC7 strains, we sequenced the genome of five strains belonging to ST260 and ST261. While the genomes of four of these strains were obtained as draft sequences, the genome of strain 2-22 was sequenced to completion. Strain 2-22 genome consists of a single circular chromosome of 1,838,867 bp (Figure 2a); this is 10 to 25% smaller than the genome sizes of other sequenced GBS strains, which range from 2,065 kb (ST17 human strain COH1) to 2,456 kb (ST67 bovine strain FSLS3-026). The G + C content (35.5%) is similar to that of other GBS strains. Compared to human GBS genomes, the genome of strain 2-22 lacks one rDNA cluster and 9 tRNA genes (71 tRNA genes and 6 rDNA clusters versus 80 and 7 respectively in other GBS genomes). The deletion of this rDNA cluster is associated with the translocation of a 150 kb genomic region probably caused by recombination between flanking ribosomal RNA operons, as also observed in Salmonella Typhi [38,39]. Except for this region the genome of strain 2-22 is syntenic to the genomes of human strains (Figure 2b). The four other strains have a similar genome size as strain 2-22.
Whole-genome sequence comparison of the five strains showed that they clustered into two distinct subgroups correlating with the MLST classification. The ST261 strain SS1218 isolated from frog in Louisiana differed from the strain 2-22 by only 30 SNPs. Strains 90-503 isolated in Louisiana in 1990 and 05-108A isolated in 2005 in Honduras, with 49 SNPs can be considered as variants of the same clone. Strain SS1219 isolated from frog in Taiwan diverged from the 2 former ST260 strains by 130 SNPs. On average, the ST260 strains showed 3,100 SNPs with ST261 strains. ST260-261 strains were also related to the ST552 strain Sa20-06 (3,400 and 1,700 SNPs respectively). In contrast, ST260-261 displayed 15,000 SNP with CC7 strains. Analysis of the SNP distribution along the genome sequences of ST260 and ST261 strains revealed a uniform pattern of 3 SNPs per kb, suggesting that no recombination occurred in this lineage.
A phylogenetic analysis of strains of human, bovine and fish or frog origins confirmed that ST260-261-552 strains constitute a distinct lineage. Separation of this lineage from other S. agalactiae clonal complexes, including CC7 strains, was probably ancient, pre-dating the separation between the three strains of human origin (NEM316, A909 and 2606V/R). While the comparison of the whole genome sequence showed the same mean identity between ST260-261 strains and the three human strains, a higher proportion of nucleotides was found to align with the genome of strain A909 (97.15%) than with other GBS genomes (2603 V/R: 95.64%, NEM316: 95.43%, FSLS3-026: 92.27%).
The low proportion of nucleotides that did not align with the genomes of human GBS strains suggested that ST260-261 strains encode only few specific functions. Indeed, only four genomic islands were characterized in strain 2-22, that represented 25 kb in total (in violet on Figure 2b). However these four regions essentially contained pseudogenes and one of them was absent from the ST260 strains. We also identified in ST260-261 strains eleven to twenty copies of ISSag1, an insertion sequence previously described in the genomes of human isolates and other streptococci. In addition, the genomes of ST260-261 strains contained 10 regions categorized as genomic islands in human S. agalactiae genomes, but shared by most GBS characterized so far (shown for strain 2-22 in Figure 2b). They did not contain any Integrative and Conjugative Elements (ICE) or prophages and only two inactivated copies of integrases were identified. They also lacked two mobile genetic elements encoding important virulence loci: the cis-mobilizable element encoding the major surface antigen Alpha like protein/Rib and the composite transposon coding for the C5A peptidase and the laminin binding protein. In total the difference in genomic island content between ST260-261 strains and GBS strains isolated from humans or bovines accounted for about 70-80% of the genome reduction (210-230 kb). An interesting exception was the genomic island 3.2 that was previously described only in strains A909 (ST7) and H36B (ST6) and therefore is also shared by CC7 strains isolated from fish. This conservation is in favor of a role in fish colonization. This GI encodes two phosphotransferase systems (PTS) for galactitol, sugar ABC transporters and genes for galactose utilization (GBS222_0398-0414 in strain 2-22). Sharing of this genomic island explained the higher number of aligned nucleotides between strains 2-22 and A909.
Genome annotation predicted 1547, 1568, 1560 and 1569 protein coding genes for strains 2- 22, 90-503, SS1219 and 05-108A respectively, which is significantly less compared to 2096 in NEM316, 1990 in A909 and 2135 in 2603V/R. Furthermore, 190-220 pseudogenes were identified in each strain compared to 27 to 41 in the human strains. This revealed a massive reduction of the functional genome during the time-course of adaptation to fish in the ST260-261 lineage.
Reductive Evolution is an Ongoing Process in the ST260-261 Lineage
To get more insights into the evolution of the ST260-261 lineage, we further analyzed the nucleotide changes leading to gene disruption and genome size reduction compared to human strains. Strain A909, belonging to CC7, was used as a reference to compare the genome sequences of the five ST260-261 strains. We also aligned the genomes of two human and one bovine isolates to identify nucleotide changes specific to the fish lineage. Genomic islands and insertion sequences were excluded from the analysis as well as thirty sequences whose evolution involved both insertions and deletions of nucleotides. In these conditions, sequence alignment between strain 2-22 and A909 genomes revealed 621 simple insertion/deletion events ranging from 1 to 10,095 nt. Using the sequences of human strains as outgroups, we found that, among these indels, 160 corresponded to deletions and 60 to insertions that specifically occurred in the ST261 sublineage, after ST260-261 divergence. The mean size of deletions largely exceeded that of insertions since only 300 nucleotides were gained while 47,000 were lost. Deletions, insertions and nucleotide replacements were respectively responsible for 76, 21 and 19 gene disruptions. In addition, 15% of the indels led to minor protein size modifications (less than 20%) and their consequences on protein functions were more difficult to predict. Finally 30% of the indels were characterized in intergenic regions. Altogether mutations leading to gain or loss of nucleotides represented approximately 12% of total mutations after the divergence from ST260 strains and constituted the major process of gene disruption. This evolution was not specific to the ST261 sublineage as a similar number of insertions and deletions occurred in the ST260 sublineage accounting for 86 gene disruptions.
In addition, 235 indels and 6363 SNPs relative to strain A909 sequence were common to the five ST260-261 strains but were not observed in A909 vs other GBS strains. Forty per cent of these indels were intragenic and led to 91 gene disruptions. Finally, 34 genes disrupted by a small indel in one sublineage were deleted in the other, suggesting that gene disruption could be a preliminary step to gene loss and genome reduction by secondary longer deletions. As a consequence some loci were observed under different states of decay in the two sublineages. This is the case for instance for the cyl locus, the CRISPR2 locus and the pil2 pilus locus. Furthermore, the average size of the indels was found to increase following the divergence of the two sublineages compared to the common branch. Therefore, while the process of reductive evolution was already evident in the ancestor of the ST260-261 strains, it was even more pronounced after divergence of the two sublineages.
To better evaluate the specificity of the evolutionary process occurring in ST260-261 strains, we compared the core genome sequence of strains A909 and NEM316, two strains of human origin differing by approximately 11,000 SNP. We detected 314 indels, which represents half the number of indels in the 2-22 vs A909 comparison. These indels were essentially short and as much as 80% of them occurred in intergenic regions while less than 4% led to gene inactivations. Therefore, gain or loss of nucleotides was also frequent in other S. agalactiae strains, but most indels in coding sequences were probably eliminated by purifying selection. Fitting with this hypothesis, we observed one indel per 5-10 SNPs during the recent evolution of ST7 strains where purifying selection is probably not effective. A similar proportion of indels vs SNP was also reported for two strains of S. enterica sv Paratyphi.
Finally, to gain further insights into the mechanism of the observed insertions and deletions, we analyzed the sequences flanking the indels in strain 2-22. Among the 358 indels of 1 nt, 210 (60%) occurred in homopolymeric tracts longer than four nucleotides and likely resulted from DNA polymerase slippage during replication. Interestingly, 136 of the 193 (70%) indels larger than four nucleotides apparently also involved recombination between repeated sequences. The median size of the repeats was 8 nucleotides, with 8 sequences larger than 20 nucleotides, the largest being 254 nt long. Since the threshold of repeat length for RecA-dependent homologous recombination is 23-27 nt, most of the indels probably occurred by RecA independent recombination. It has been shown that, under laboratory conditions, the efficiency of illegitimate recombination is highly dependent on the size of the repeats and inversely dependent on the distance between repeats. Our results suggest that RecA independent recombination between repeats of moderate sizes may also lead to long deletions. Furthermore we did not find any evidence that deletions of long sequences may depend on longer repeats. This may indicate that the lower efficiency of illegitimate recombination between short repeats is buffered by their higher frequency in the genome. Some of the large deletions might also correspond to several consecutive shorter ones.
From this analysis we propose that accumulation of small and easily reversible indels is common in S. agalactiae, mostly occurring as a consequence of RecA independent recombination. Most of these indels are probably removed from the population due to purifying selection during long-term evolution. Alternatively they may be compensated by a second loss or gain of nucleotides. In the ST260-261 lineages host specialization led to the relaxation of the negative selection on genes that became dispensable, allowing the accumulation of larger deletions.
Functional Adaptation to a Host Restricted Way of Living
ST260-261 strains differ from other GBS strains by their host specificity and high pathogenicity. As mentioned, this lineage has only few specific genes compared to other strains and most of them are carried by genomic islands in the process of being eliminated. In particular, we did not identify any specific surface proteins or putative virulence factors. Therefore, the specific properties of the ST260-261 strains more likely rely on functions shared with human isolates and on the loss of some functions. The 1432 genes common to the two sublineages were grouped into functional categories according to KEGG classification. Compared to functional classification of A909 genes, this analysis revealed a dramatic drop in the proportions of genes involved in mobile and extrachromosomal functions (that reflects the loss of most genomic islands) and in cellular processes, including the pathogenesis subcategory. The other major functions submitted to evolutionary erosion were energy metabolism, transport and binding, regulation and signal transduction. Conversely the basal functions of the cell such as transcription and protein synthesis were nearly not affected.
Bacterial-host Interactions and Virulence Factors
Surface components play an important role for tissue colonization and infection by mediating interactions between the pathogen and the host cells and evasion from immune defense. GBS strains possess two distinct polysaccharide antigens, the highly sialylated capsule polysaccharide and the group B-antigen. ST260-261 strains harbor the 16 genes involved in the type Ib capsule synthesis (GBS222_0990-1005) similarly to the ST6 strains H36B and SS1014. They also possess the 16 genes responsible for the synthesis of the group B antigen (GBS222_1160-1175). Only seven proteins with a LPXTG signal for cell-wall anchor are conserved including three important S. agalactiae virulence factors of human strains: the Fibrinogen-binding protein A, the Serine-rich protein Srr1 and the BibA/HvgA protein. However, while in human strains these three proteins carry a variable number of a repeated motif, these repeats have been lost in the ST260-261 strains. This might decrease their accessibility to host immune system and cellular receptors by tethering them to the cell wall.
Among the other proteins that are virulence factors in human GBS isolates, the five strains lack the C5A peptidase, the laminin binding protein and components of the pilus. The cyl locus has also been inactivated in the two sublineages in agreement with the absence of detectable hemolytic activity on blood agar plates. Another major test for S. agalactiae identification is the detection of CAMP activity. While both ST260 and 261 strains were reported to be negative for the CAMP factor reaction we found that the gene encoding the CAMP factor was disrupted only in ST261 strains. We experimentally confirmed that strain SS1219 was negative for the CAMP test detection, probably because of a lower level of gene expression.
Finally, the five ST260-261 strains express a fibrinogen/fibronectin binding protein of the PavA family and a hyaluronidase that were proposed to be virulence factors in GBS or other Streptococcus species and could also have a role in fish infection.
Altogether, approximately 60% of the genes for proteins involved in pathogenesis and considered as important virulence genes in human stains were affected by genome reduction. In this context the capsule could be a major virulence factor in the fish host, as also observed for Streptococcus iniae, another streptococcus species pathogenic for fish.
Analysis of the missing functions revealed a profound remodeling of the metabolism of strains of this lineage. Numerous transport systems for carbon sources (ABC transporters and phosphoenolpyruvate/carbohydrate phospho?transferase systems) and enzymes for degradation of polysaccharides (amylase, extracellular pullulanase, enzyme for degradation of arbutin) were missing or inactivated, reflecting a reduced capacity to utilize diverse carbon sources. In addition ST260-261 strains seem unable to utilize glycerol and glycerol phosphate as genes encoding the glycerol kinase, the glycerol dehydrogenase and the glycerolphosphate permease are missing or mutated. Fermentative pathways utilizing pyruvate/acetate conversion are also altered, as the phosphotransacetylase gene is missing. In mammals, GBS is primarily considered as a commensal of the digestive tract, an environment rich in diverse C-sources. In contrast, reduction in the catabolic capacities observed in ST260-261 strains is in favor of a transition to an obligate pathogen style-of-life.
Approximately 20% of the genes associated with transport systems were missing or inactivated. The targeted functions were mostly the import of nutrients, in particular of Csources and the transport of inorganic and metal ions. The Na+/H + antiporter, the K + uptake permease were inactivated in both sublineages. Therefore ST260-261 strains may be affected in ionic exchange and would have a reduced capacity to maintain their homeostasis in the face of a changing external environment.
Globally, gene disruption and deletion events affected 19 out of the 93 transcriptional regulators predicted in human strains (20%). As much as 13 out of the 21 two-component systems (TCS) (60%) found in GBS were inactivated, either in one (six TCS) or in both (seven TCS) sublineages. Interestingly, analysis of the genome sequence of strain 2-22 revealed that the Rgf TCS, which is involved in the control of virulence in the human ST17 strains, was associated with two putative bacteriocins with a double glycine leader peptide and with a bacteriocin export transporter. This suggests that this TCS originates from a bacteriocin operon that has been partially deleted in human strains. Altogether our observations show that S. agalactiae strains adapted to fish may have a reduced capacity to respond to environmental changes compared to human strains and only eight TCS, including the two major systems, CiaRH and CovRS, may be sufficient to allow GBS adaptation to the different environments encountered in fish. Interestingly, both CiaRH and CovRS were also involved in the regulation of virulence genes in GBS human strains. The higher virulence of ST260-261 strains might also be due to the deregulated expression of some virulence genes, as observed in the transition from local to systemic infections in Group A Streptococci.
Adaptation to Fish is Associated with Broad Changes in Gene Expression
As a first step to explore changes in gene expression linked to host adaptation, we performed a comparative analysis at the transcriptome level of strains A909 (ST7 human isolate), CF01173 (ST7 fish isolate), 2-22 (ST261 fish isolate) and SS1219 (ST260 frog isolate). 1389 genes present in the four strains and with 100% identity matches with probes of the array were taken into account in this analysis. Profound modifications in gene expression were observed in ST260-261 strains and to a lesser extent in strain CF01173 compared to strain A909. Although strains CF01173 and A909 are closely related, the expression of more than 130 genes varied by a two-fold factor between the two strains. In particular 40% of the genes involved in energy metabolism were expressed at a lower level in the fish isolate than in strain A909. In contrast expression of the gene encoding the virulence protein Srr1 was more than 20 fold increased compared to strain A909. Although up-regulation of the srr1 gene is not conserved in the ST260-261 fish isolates, in strain CF01173, it might facilitate penetration of the fish blood brain barrier, leading to an increased tendency to cause meningitis, as shown for the Srr1 protein of human isolates.
Both the number of genes for which expression was modified and the amplitude of these modifications were even larger in ST260-261 isolates. For instance two genes encoding putative glyoxalases are expressed at 6-45 higher levels in ST260-261 strains than in strain A909, suggesting higher needs in methylglyoxal detoxification. Genes encoding the enzymes for the synthesis of arginine (arginosuccinate synthase and arginosuccinate lyase) were 10-15 fold more expressed. Four functional categories were more affected by variations in gene expression in both ST260-261 sublineages: metabolism, transport, regulation and signal transduction. Interestingly these functional categories were also particularly affected by gene erosion, suggesting a remodeling of metabolic networks.
In addition to these global tendencies, ST260 and ST261 strains also harbor sublineagespecific variations in gene expression that probably result from differential gene decay and might lead to specific adaptations. For instance, in strain SS1219, genes encoding phosphoglycerate kinase, phosphoglyceromutase and pyruvate kinase involved in glycolysis and neoglucogenesis are expressed at a lower level than in strain A909. In parallel, genes coding for the arginine deiminase, ornithine carbamoylkinase and carbamate kinase involved in the deiminase pathway are up-regulated, suggesting a shift towards arginine fermentation as the main pathway for energy production in ST260 strains. This up-regulation of the deiminase pathway is associated with a decreased expression of the negative regulator ArgR and the increased expression of the arginine/ornithine antiporter. In contrast, in the same growth conditions, upregulation of the genes for Hpr-kinase and CodY involved in catabolic repression and for lactate dehydrogenase suggests that strain 2-22 might generate energy essentially through glucose utilization and lactic fermentation. Iron import might be more efficient in ST260 than in ST261 strains since strain SS1219 expresses genes for iron ABC transporter and ferrichrome ABC transporter at higher levels than strain A909, while these genes are down-regulated or unchanged in strain 2-22. Both sublineages express lower amounts of ATP synthase, the enzyme responsible for ATP-proton motive force interconversion, than strain A909, indicating that they may be impaired in pH homeostasis in acidic conditions.
Since expression of pseudogenes is energetically costly and may generate deleterious products, we took advantage of the massive genome erosion in the ST260-261 lineage to look for evidence for negative selective pressure acting on pseudogene expression. Among the 184 pseudogenes represented by at least one probe on the array, only 22% were down-regulated compared to strain A909, whereas 64% did not present a significant change in mRNA levels and 14% were even up-regulated. Furthermore silencing of pseudogenes in the ST260-261 lineage did not seem to markedly increase with time. Indeed the proportion of downregulated pseudogenes was similar among pseudogenes arising before (28%) and after the differentiation of sublineages (16% and 25%). Altogether this indicates that no strong evolutionary pressure acts to silence pseudogene expression.
Although modifications in gene expression generally affected different genes in ST260-261 and CC7 strains, 29 genes were found to vary in the same way in the 3 strains isolated from fish. Interestingly, these 29 genes included the operon for the capsule synthesis, the genes for zoocin and hyaluronidase, two response regulators of TCS systems (ciaR and relR) and three targets of the CiaRH TCS that were up-regulated. Sixteen genes were downregulated or inactivated in the three fish strains compared to A909, among which 11 were involved in energy metabolism or in carbohydrate uptake. Whether these regulations reflect a common mechanism of adaptation to fish environment remains to be established. However, the higher expression of the genes for the capsule synthesis might favor resistance to environmental conditions and to fish immune system, as it has also been reported for S. iniae. Similarly the high hyaluronidase expression may help S. agalactiae strains to break through fish tissues and be involved in virulence to fish.
Our results show that S. agalactiae strains leading to epidemic diseases in fish farms and cold blood animals belong to at least two distinct groups that differ by their strategies of host adaptation. CC7 strains have the potency to colonize and infect multiple hosts such as fish, human and cattle. From a genomic point of view, these CC7 fish strains are not distinguished from their human counterparts by any significant genomic island. However, contrasting with this genomic relatedness, large differences in gene expression were observed and could participate to the adaptation to the fish host. Conversely, our genome analysis indicates that strains of the ST260-261 complex diverged anciently from human and cattle strains and subsequently accumulated specific adaptations leading to the emergence of sublineages.
ST260-261 strains exhibit a striking pattern of genome reduction and we took advantage of the emergence of sublineages to reconstruct the different steps involved in this process. We found that accumulation of short indels can be observed all along the evolution of the GBS species, participating to strain-specific gene disruptions. Therefore even in the core genome of human GBS strains some genes are dispensable. Nevertheless the number of inactivated genes greatly increased during specialization of ST260-261 strains to fish. These gene inactivations mainly result from the ongoing accumulation of short indels, but a tendency to eliminate inactivated genes by deleting longer sequences is more noticeable in the sublineage specific branches. In contrast with what has previously been observed in the course of genomic reduction associated to host specialization or to intracellular symbiosis, deletion events are not correlated with an amplification of insertion sequences. Neither could complex genome rearrangements be noticed, suggesting that recombination between IS is not a general mechanism for genome reduction. Indeed our results point to non-homologous recombination as an alternative mechanism of genome reductive evolution.
Further ReadingYou can view the full report and list of authors by clicking here.