Stacked ensembles on basis of parentage information can predict hybrid performance with an accuracy comparable to marker-based GBLUP

Heilmann, Philipp Georg; Frisch, Matthias; Abbadi, Amine; Kox, Tobias; Herzog, Eva

dc.contributor.author	Heilmann, Philipp Georg
dc.contributor.author	Frisch, Matthias
dc.contributor.author	Abbadi, Amine
dc.contributor.author	Kox, Tobias
dc.contributor.author	Herzog, Eva
dc.date.accessioned	2023-09-22T06:40:15Z
dc.date.available	2023-09-22T06:40:15Z
dc.date.issued	2023
dc.description.abstract	Testcross factorials in newly established hybrid breeding programs are often highly unbalanced, incomplete, and characterized by a predominance of specific combining ability (SCA) over general combining ability (GCA). This results in a low efficiency of GCA-based selection. Machine learning algorithms might improve the prediction of hybrid performance in such testcross factorials, as they have been successfully applied to find complex underlying patterns in sparse data. Our objective was to compare the prediction accuracy of machine learning algorithms to that of GCA-based prediction and genomic best linear unbiased prediction (GBLUP) in six unbalanced incomplete factorials from hybrid breeding programs of rapeseed, wheat, and corn. We investigated a range of machine learning algorithms with three different types of predictor variables: (a) information on the parentage of hybrids, (b) additionally, the hybrid performance of crosses of the parental lines with other crossing partners, and (c) genotypic marker data. In two highly incomplete and unbalanced factorials from rapeseed, in which the SCA variance contributed considerably to the genetic variance, stacked ensembles of gradient boosting machines based on parentage information outperformed GCA prediction. Compared to GCA prediction, the stacked ensembles increased prediction accuracy from 0.39 to 0.45 and from 0.48 to 0.54. Without marker data, the stacked ensembles reached prediction accuracies comparable to those of GBLUP, which requires marker data. We conclude that hybrid prediction with stacked ensembles of gradient boosting machines based on parentage information is a promising approach that is worth further investigation with other data sets in which SCA variance is high.
dc.identifier.uri	https://jlupub.ub.uni-giessen.de//handle/jlupub/18499
dc.identifier.uri	http://dx.doi.org/10.22029/jlupub-17863
dc.language.iso	en
dc.rights	Attribution 4.0 International
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.subject	machine learning
dc.subject	stacked ensembles
dc.subject	gradient boosting
dc.subject	genomic prediction
dc.subject	general combining ability
dc.subject	specific combining ability
dc.subject	hybrid breeding
dc.subject	hybrid prediction
dc.subject.ddc	ddc:630
dc.title	Stacked ensembles on basis of parentage information can predict hybrid performance with an accuracy comparable to marker-based GBLUP
dc.type	article
local.affiliation	FB 09 - Agrarwissenschaften, Ökotrophologie und Umweltmanagement
local.source.articlenumber	1178902
local.source.epage	15
local.source.journaltitle	Frontiers in plant science
local.source.spage	1
local.source.uri	https://doi.org/10.3389/fpls.2023.1178902
local.source.volume	14

Files

Original bundle

Name: 10.3389_fpls.2023.1178902.pdf
Size: 2.58 MB
Format: Adobe Portable Document Format

Collections

Open access publications funded by the University Library