Assess the accuracy of predictions for previously unobserved genotypes (individuals) based on the available training data. Runs k-fold cross-validation for one or more traits and optionally computes prediction accuracy on a user-specified selection index. Three models are available: additive-only ("A"), additive-plus-dominance ("AD") and a directional-dominance model that incorporates a genome-wide homozygosity effect ("DirDom"). The union of all genotypes scored for all traits is split into k folds, and this split is repeated a user-specified number of times. Each train-test pair is then predicted for each trait and accuracies are computed.
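The repeated k-fold splitting described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the helper `make_repeat_folds()` and the example IDs are hypothetical, and folds are drawn by simple random assignment.

```r
# Hypothetical helper: split a vector of genotype IDs into `nfolds` folds,
# repeated `nrepeats` times. Not the package's internal code.
make_repeat_folds <- function(gids, nrepeats, nfolds, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  out <- list()
  for (r in seq_len(nrepeats)) {
    # Shuffle the IDs, then deal them into folds of near-equal size
    shuffled <- sample(gids)
    fold_id <- rep(seq_len(nfolds), length.out = length(shuffled))
    for (f in seq_len(nfolds)) {
      test <- shuffled[fold_id == f]
      out[[length(out) + 1]] <- list(
        rep = r, fold = f,
        test = test,                  # genotypes held out in this fold
        train = setdiff(gids, test)   # all remaining genotypes
      )
    }
  }
  out
}

gids <- paste0("G", 1:10)
folds <- make_repeat_folds(gids, nrepeats = 2, nfolds = 5, seed = 42)
length(folds)  # one train-test pair per repeat-fold
```

Each list element corresponds to one repeat-fold, which is the unit that would be predicted and scored.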
runCrossVal(
blups,
modelType,
selInd,
SIwts = NULL,
grms,
dosages = NULL,
nrepeats,
nfolds,
ncores = 1,
nBLASthreads = NULL,
gid = "GID",
seed = NULL,
...
)
blups: nested data.frame with list-column "TrainingData" containing BLUPs. Each element of the "TrainingData" list is a data.frame with de-regressed BLUPs, BLUPs and weights (WT) for training and test.
modelType: string, one of "A", "AD" or "DirDom".
  modelType="A": additive-only, GEBVs.
  modelType="AD": the "classic" add-dom model, GEBVs + GEDDs = GETGVs.
  modelType="DirDom": the "genotypic" add-dom model with proportion homozygous fit as a fixed effect to estimate a genome-wide inbreeding effect. Obtains add-dom effects, computes allele substitution effects (\(\alpha = a + d(q-p)\)) and incorporates them into the GEBV and GETGV. "DirDom" requires dosages.
selInd: logical, TRUE/FALSE, whether to compute selection index accuracy estimates; requires input weights via SIwts.
SIwts: required if selInd=TRUE. Named vector of selection index weights; names must match the "Trait" variable in blups.
grms: list of GRMs where each element is named either A, D, or AD. The matrices supplied must match those required by the A, AD and ADE models. For ADE, grms=list(A=A,D=D).
dosages: dosage matrix, required only for modelType=="DirDom". Assumes SNPs coded 0, 1, 2. Numeric matrix with Nind rows x Nsnp cols, with rownames and colnames giving the individual and SNP IDs.
nrepeats: number of repeats.
nfolds: number of folds.
ncores: number of cores; parallelizes across repeat-folds.
nBLASthreads: number of cores each worker uses for multi-threaded BLAS.
gid: string, name of the variable holding genotype IDs, e.g. in blups (default="GID").
seed: numeric, seed used to make the train-test folds reproducible.
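As a rough illustration of the inputs described above, the sketch below builds a toy dosage matrix coded 0/1/2 with row and column names, computes the proportion of homozygous loci per individual (the covariate the "DirDom" model is described as fitting), and evaluates \(\alpha = a + d(q-p)\) for made-up effects. All object names, trait names and values are hypothetical. Taking p as the frequency of the counted allele is an assumption, not something this page confirms.

```r
# Toy dosage matrix: 4 individuals x 5 SNPs, coded 0, 1, 2 (made-up values)
dosages <- matrix(
  c(0, 1, 2, 1, 0,
    2, 2, 1, 0, 0,
    1, 1, 1, 1, 1,
    0, 2, 0, 2, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("G", 1:4), paste0("snp", 1:5))
)

# Proportion of homozygous loci per individual (dosage 0 or 2)
f_hom <- rowMeans(dosages != 1)

# Allele frequencies: p for the counted allele (assumed), q = 1 - p
p <- colMeans(dosages) / 2
q <- 1 - p

# Made-up additive (a) and dominance (d) effects per SNP
a <- c(0.5, -0.2, 0.1, 0.0, 0.3)
d <- c(0.1, 0.0, -0.1, 0.2, 0.0)

# Allele substitution effects, alpha = a + d(q - p)
alpha <- a + d * (q - p)

# Named selection index weights; names must match the trait names in blups
SIwts <- c(yield = 1, drymatter = 0.5)
```

An individual that is heterozygous at every locus (G3 here) gets a homozygosity covariate of 0, and one that is homozygous everywhere (G4) gets 1.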
Returns a tidy tibble with the accuracy estimates for each repeat-fold stored in the list-column "accuracyEstOut".
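This page does not state how accuracy is calculated. A common choice is the Pearson correlation between predictions and the test-set values, so the sketch below uses that as an assumption, for each trait and for a selection index formed as a weighted sum of traits. The data, the trait names, the helper `si()` and the label "SELIND" are all hypothetical.

```r
# Made-up test-set values and predictions for two traits on five genotypes
obs <- data.frame(
  GID = paste0("G", 1:5),
  yield = c(1.2, 0.4, -0.3, 0.9, -1.1),
  drymatter = c(0.2, -0.5, 0.1, 0.6, -0.4)
)
pred <- data.frame(
  GID = paste0("G", 1:5),
  yield = c(0.8, 0.1, 0.0, 0.7, -0.9),
  drymatter = c(0.1, -0.2, 0.3, 0.4, -0.1)
)
SIwts <- c(yield = 1, drymatter = 0.5)

# Selection index = weighted sum of trait values, matched by trait name
si <- function(df, wts) as.vector(as.matrix(df[, names(wts)]) %*% wts)

# Per-trait accuracy and selection index accuracy as Pearson correlations
acc <- c(
  sapply(names(SIwts), function(tr) cor(pred[[tr]], obs[[tr]])),
  SELIND = cor(si(pred, SIwts), si(obs, SIwts))
)
acc
```

In a cross-validation run, a vector like `acc` would be computed once per repeat-fold.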
Other CrossVal:
runParentWiseCrossVal()