Maximizing information content of SNP arrays for genomic prediction

dc.contributor.advisor: Snowdon, Rod
dc.contributor.advisor: Frisch, Mathias
dc.contributor.author: Weber, Sven Ernst
dc.date.accessioned: 2024-06-11T12:19:55Z
dc.date.available: 2024-06-11T12:19:55Z
dc.date.issued: 2023
dc.description.abstract: Genomic prediction is a promising tool for plant breeders to improve genetic gain in various crops. SNP arrays are the preferred genotyping tool for breeders of most major crops; however, the limited, predefined number of markers on SNP arrays can restrict the prediction accuracy achievable in genomic prediction. The objective of this study was to evaluate cost-effective methods for maximizing the information content of SNP arrays. Three methods were explored, and their information content was assessed using prediction accuracies from six genomic prediction models across diverse crops and agronomic traits. Regardless of the method used to increase the information content of SNP arrays, the genomic prediction models consistently showed similar prediction accuracy within traits, making them equally suitable for genomic prediction across a variety of crops and traits. The first method involved constructing haplotype blocks with various methods and parameters and using their haplotypes for genomic prediction. Analysis of data from rapeseed, maize, wheat and soybean revealed only marginal improvements in genomic prediction accuracy across most traits. Notably, haplotype blocks were effective in compensating for poorly performing models in scenarios where prediction accuracies varied strongly across prediction models. Nevertheless, because no method or parameter for constructing haplotype blocks was consistently ideal, haplotype block construction is a hyperparameter that requires careful tuning. The second method examined failed allele calls from SNP arrays for their information content in genomic prediction of agronomic traits in maize and rapeseed. Two statistical pipelines were developed and tested to separate non-random failed allele calls from random technical errors. Surprisingly, failed allele calls, potentially originating from genome structural variants, exhibited prediction accuracies comparable to those of genome-wide SNP datasets. However, combining SNPs and failed allele calls did not enhance genomic prediction. The third method, explored as an alternative to whole-genome sequencing, was the imputation of whole-genome sequence marker data from SNP array data. While linkage disequilibrium (LD) and marker density improved considerably, no increase in prediction accuracy was observed. This can likely be attributed to erroneous haplotypes and marker calls resulting from imputation errors. A plausible hypothesis is that these errors are introduced by the high complexity and redundancy of crop plant genomes. Across all three methods, relationships emerged as an explanation for the lack of improvement in genomic prediction accuracy. Relationship estimates obtained from SNP array data correlated highly with those obtained from the methods intended to increase information content, so these methods contributed predominantly redundant information. Moreover, it can be assumed that markers on arrays generally exhibit sufficient LD with adjacent quantitative trait loci (QTL). In conclusion, SNP arrays proved to be a reliable genotyping technology, offering a representative sample of the genome for estimating relationships. Furthermore, this study reaffirms the potential of genomic prediction as a breeding tool to improve genetic gain in several crops.
dc.description.sponsorship: Bundesministerium für Bildung und Forschung (BMBF); ROR-ID:04pz7b180
dc.identifier.uri: https://jlupub.ub.uni-giessen.de/handle/jlupub/19268
dc.identifier.uri: https://doi.org/10.22029/jlupub-18629
dc.language.iso: en
dc.relation.haspart: https://doi.org/10.3389/fpls.2023.1217589
dc.relation.haspart: https://doi.org/10.3389/fpls.2023.1221750
dc.relation.haspart: https://doi.org/10.1139/gen-2023-0126
dc.rights: In Copyright
dc.rights.uri: http://rightsstatements.org/page/InC/1.0/
dc.subject: genomic prediction
dc.subject: SNP marker
dc.subject: haplotype block
dc.subject: structural variations
dc.subject: breeding
dc.subject.ddc: ddc:500
dc.subject.ddc: ddc:580
dc.title: Maximizing information content of SNP arrays for genomic prediction
dc.type: doctoralThesis
dcterms.dateAccepted: 2024-05-17
local.affiliation: FB 09 - Agrarwissenschaften, Ökotrophologie und Umweltmanagement
local.project: BreedPatH
thesis.level: thesis.doctoral

Files

Original bundle
Name: WeberSvenErnst-2024-05-17.pdf
Size: 15.06 MB
Format: Adobe Portable Document Format