To Strepto and beyond! Large-scale bacterial genomics with a focus on pathogenic streptococci
| dc.contributor.advisor | Goesmann, Alexander | |
| dc.contributor.author | Fenske, Linda | |
| dc.date.accessioned | 2026-06-19T12:52:36Z | |
| dc.date.issued | 2026-03 | |
| dc.description.abstract | Bacteria are central research organisms across many disciplines, and capturing their diversity is essential for understanding their genomic features, which in turn illuminate virulence, transmission, and the rapid evolution and spread of antimicrobial resistance. Advances in whole-genome sequencing have driven an explosion of publicly available data, with thousands of new datasets deposited daily. However, despite broad data availability, major hurdles remain in terms of accessibility and comparability due to heterogeneous processing pipelines, inconsistent quality control, and incomplete or unstructured metadata.<br> *Streptococcus agalactiae* is an opportunistic multi-host pathogen with major relevance in both human and veterinary medicine. Beyond humans and cattle, *S. agalactiae* has also been reported in endangered species such as elephants, but elephant-associated strains lack detailed genomic characterization to date. Although the species has been extensively studied, existing genomic work is often restricted to specific regions or single outbreaks, offering only fragmented views of its true diversity.<br> This thesis addresses these challenges through three linked projects, motivated by an interest in the diversity of *S. agalactiae*. Initial attempts to perform a large-scale comparative analysis of *S. agalactiae* were hampered by the lack of suitable reference datasets. To enable robust comparative genomics from uniformly processed public data, BakRep was developed: a large-scale, searchable web repository built on the assemblies from the *AllTheBacteria* project. BakRep connects consistent genome-based characterizations, such as taxonomic information, subtyping results, and annotations, with descriptive metadata and provides an integrated search interface complemented by interactive visualizations of genomic features. As a use case, BakRep was used to conduct a population-scale comparative analysis of all *S. agalactiae* genomes in the repository, confirming dominant stable lineages while exposing substantial metadata gaps that limited biological interpretation and highlighted the need for curated, structured metadata. Additionally, elephant-derived *S. agalactiae* isolates were analyzed to address the present data deficit.<br> These isolates were phylogenetically distinct from strains found in other hosts, and several lineage-specific genes suggested potential niche adaptation. Together, this work demonstrates how consistent, scalable resources support large-scale comparative studies across thousands of bacterial species to identify shared patterns of adaptation, virulence, and resistance, and to guide focused follow-up research in lesser-studied hosts and ecological niches. While comparative genomics is crucial for understanding genetic variation and host adaptation in multi-host pathogens like *S. agalactiae*, its impact is undermined when sequencing data are generated and shared without well-curated metadata. Such metadata gaps present significant obstacles to biological interpretation and clinical translation, thereby necessitating rigorous data and metadata quality control. Furthermore, this work identified opportunities to enhance the methods devised here, highlighted new research questions, and laid a foundation for future investigations. | |
| dc.description.sponsorship | Bundesministerium für Bildung und Forschung (BMBF); ROR-ID:04pz7b180 | |
| dc.identifier.uri | https://jlupub.ub.uni-giessen.de/handle/jlupub/21632 | |
| dc.identifier.uri | https://doi.org/10.22029/jlupub-20976 | |
| dc.language.iso | en | |
| dc.relation.haspart | https://doi.org/10.1099/mgen.0.001305 | |
| dc.relation.haspart | https://doi.org/10.1099/mgen.0.001489 | |
| dc.relation.haspart | https://doi.org/10.64898/2026.03.02.709001 | |
| dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | en |
| dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | |
| dc.subject | Streptococcus agalactiae | |
| dc.subject | Bioinformatics | |
| dc.subject | Comparative genomics | |
| dc.subject.ddc | ddc:570 | |
| dc.subject.ddc | ddc:004 | |
| dc.title | To Strepto and beyond! Large-scale bacterial genomics with a focus on pathogenic streptococci | |
| dc.type | doctoralThesis | |
| dcterms.dateAccepted | 2026-06-03 | |
| local.affiliation | FB 08 - Biologie und Chemie | |
| thesis.level | thesis.doctoral |