Predictive Modelling with Machine Learning in Plant Breeding

dc.contributor.advisor: Frisch, Matthias
dc.contributor.advisor: Snowdon, Rod
dc.contributor.author: Heilmann, Philipp Georg
dc.date.accessioned: 2025-12-15T09:20:05Z
dc.date.available: 2025-12-15T09:20:05Z
dc.date.issued: 2025
dc.description.abstract: Genomic prediction, originally proposed as a solution to the limitations of marker-assisted selection for complex traits, has become the standard for estimating breeding values in both inbred and hybrid crops. While linear models such as GBLUP and RR-BLUP remain effective in many cases, especially when the genetic architecture is assumed to be additive, recent years have seen growing interest in applying machine learning (ML) methods to overcome some of their constraints, including their limited capacity to model non-additive effects and nonlinear interactions. This thesis explored the influence of three key aspects on the success of genomic prediction: the choice of input features, the statistical model used, and the target trait or crop.
In terms of input features, marker data were compared to minimalist parentage-based models, haplotype blocks, and features generated using autoencoders. Even simple ML models using parentage-based information were shown to rival marker-based GBLUP under certain conditions, which holds potential for small breeding programs with large amounts of historical but ungenotyped records. At the same time, dimensionality reduction techniques, especially a novel haplotype-based autoencoder developed during this thesis, were introduced to compress genomic data; they preserved prediction accuracy and accelerated model training.
Concerning the model aspect, a variety of ML algorithms were benchmarked using different approaches for hyperparameter tuning. Although no single model outperformed the others across all traits and crops, ensemble approaches typically performed better than the individual models they were based on. Support vector machines appeared relatively unstable compared to other ML-based algorithms, such as tree-based models.
Finally, results showed that the accuracy of genomic predictions differed strongly between traits, between crops with different breeding schemes, and between populations. For hybrids, ML performed well when specific combining ability (SCA) was more important than general combining ability (GCA) in determining hybrid yield. Large differences were observed among fungal diseases in wheat, whereas the methods performed similarly for the same disease.
While ML has not yet provided a significant improvement over traditional methods in many scenarios, its flexibility and potential for multi-modal data integration remain promising. The development of plant breeding-specific model architectures, such as haplotype-based autoencoders, may represent a more promising path than the general application of standard ML models.
dc.description.sponsorship: Other third-party funders
dc.identifier.uri: https://jlupub.ub.uni-giessen.de/handle/jlupub/21133
dc.identifier.uri: https://doi.org/10.22029/jlupub-20479
dc.language.iso: en
dc.relation.haspart: https://doi.org/10.3389/fpls.2023.1178902
dc.relation.haspart: https://doi.org/10.1111/pbr.13235
dc.relation.haspart: https://doi.org/10.1186/s12859-025-06323-w
dc.rights: In Copyright
dc.rights.uri: http://rightsstatements.org/page/InC/1.0/
dc.subject: Machine Learning
dc.subject: Plant Breeding
dc.subject: Genomic Prediction
dc.subject: Deep Learning
dc.subject.ddc: ddc:580
dc.subject.ddc: ddc:570
dc.title: Predictive Modelling with Machine Learning in Plant Breeding
dc.type: doctoralThesis
dcterms.dateAccepted: 2025-10-24
local.affiliation: FB 09 - Agrarwissenschaften, Ökotrophologie und Umweltmanagement
thesis.level: thesis.doctoral

Files

Original bundle

Name: HeilmannPhilipp-2025-10-24.pdf
Size: 9.6 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 7.58 KB
Format: Item-specific license agreed upon to submission