Mixed reference population in genomic evaluation for clinical mastitis in Czech Holstein cattle

The accuracy of genomic breeding values (GEBV) for clinical mastitis (CM) in Czech Holstein cattle was analysed. The single-step genomic method and mixed genomic reference populations were employed. The dataset included 92,388 Holstein cows and 160,426 lactations. CM lactation incidence was 19.05%. Cows calved between 2017 and 2022 in 119 herds. A total of 4,969 Holstein sires and 35,814 Holstein females were genotyped. Three genomic matrices were used, two of them encompassing females. The linear animal model with repeatability included fixed effects of herd-year-season and parity-age at calving. The highest average accuracy of GEBV occurred for genotyped cows. The SD of GEBV accuracy for bulls was significantly higher than that for cows. The accuracy of genotyped cows with health phenotypes ranged from 0.06 to 0.51. For the genotyped bulls born in 2021 and 2022, the maximal accuracy was 0.37; for genotyped heifers, the maximal accuracy was 0.42. The highest average GEBV accuracy occurred for the reference population with genotyped bulls and genotyped cows with phenotypes. The average accuracy for the young genotyped bulls increased by one percentage point when phenotyped cows were considered in the genomic reference population. The cows' GEBV accuracy benefited from including their genotypes in the prediction. The results confirm that expanding the genomic reference population to include a group of genotyped cows with phenotypes increased the individual accuracy of GEBV for CM.


Introduction
Genomic selection (Meuwissen et al., 2001; Schaeffer, 2006) has been implemented in dairy cattle breeding in many countries. The main reasons were the improved accuracy of breeding values, especially for low heritable traits, and the resulting higher selection success. Genomic selection accelerates genetic progress by shortening generation intervals while increasing the accuracy of selection for young animals (Schaeffer, 2006; Obšteter et al., 2019). The introduction of genomic selection has its hindrances, for example, the cost of genotyping. The accuracy of genomic prediction depends on various parameters, including the size of the reference population and its genetic structure (Lee et al., 2017). As Meuwissen et al. (2001) stated, the first step is to genotype a sufficient number of animals with progeny records or phenotypes to create a genomic reference population. This can be a problem particularly in small livestock populations, because genotyping is relatively costly and a large number of genotyped animals is necessary for successful genomic selection.
Nevertheless, Obšteter et al. (2019) and Jenko et al. (2019) highlight the importance of the increasing number of genotyped females, as those represent the genotyped animals with records. The USA was the first to include females in its reference population (Wiggans et al., 2011). Thomasen et al. (2014, 2020) concluded that genotyping cows is a quick way to increase the accuracy of genomic predictions and raise genetic gain in a small population. To select animals as parents of the next generation, the individual accuracy of the genomic breeding value is necessary. For small populations, approximate accuracies can be calculated by inverting the left-hand side of a BLUP system of equations (VanRaden, 2008). For genomic evaluation by the single-step method in a large population, approximate methods were developed by Misztal et al. (2013). Bauer et al. (2015) analysed those methods in the Czech Republic.
Mastitis is the most common disease in dairy cows. The clinical or subclinical form negatively impacts animal welfare and the economic efficiency of dairy farms. The costs associated with treatment, production losses, and reduced animal welfare are high (Jamali et al., 2018; Wolfová et al., 2006), including the increased risk of culling and the shortening of the production period of a dairy cow. It is possible to increase resistance to mastitis in dairy cattle by selection, despite clinical mastitis being a typical low heritable trait (Martin et al., 2018). In most cases, literature results indicate a heritability of mastitis between 0.01 and 0.10. Furthermore, confirming this suggestion, Heringstadt et al. (2001) showed that the longstanding process of genetic selection to reduce the incidence of clinical mastitis led to a positive genetic trend in Norwegian dairy cattle from the 1970s to the 1990s.
Genomic evaluation for Czech Holstein cattle (Přibyl et al., 2012) started with the single-step genomic method (ss-GBLUP) for production traits. The bull reference population has been increasing gradually, with the oldest bulls born in the 1970s and originating mainly from North America and European countries. Still, most of the increase in the number of genotyped animals is due to the genotyping of females. The Czech Holstein breeders' organisation recently started a long-term project, FitCow, to raise the number of genotyped cows and increase the accuracy of genomic prediction. FitCow runs alongside another project, the national cattle health monitoring system "The Diary of Diseases and Medication" (Kasna et al., 2017). The Diary, a web application, was made available to farmers in August 2018 after a one-year trial. It consists of a farmers' online health recording form and a simplified key of diagnoses based on ICAR recommendations. The goal of both projects is to enable selection for increased disease resistance by including health traits in the selection index. The outputs of those projects are utilised in genomic evaluation for the health traits of Czech Holstein cattle.
The objective of this study was to analyse the individual accuracy of genomic breeding values (GEBV) for clinical mastitis (CM) of Czech Holstein cattle when mixed reference populations were employed.

Material and methods
The edited dataset included 92,388 Holstein cows and 160,426 lactations, with a lactation incidence of clinical mastitis across all lactations of 19.05%. Cows calved between 2017 and 2022 in 119 herds. Only cows with a Holstein breed proportion of 75% or more were included in the edited dataset. The proportions of breed admixture are based on pedigree.

Phenotypes
Farmers collected CM records and registered them voluntarily in the national cattle health monitoring system called "The Diary of Diseases and Medication" (Kasna et al., 2017). CM was defined as a binary trait with 0 (no case) and 1 (at least 1 case) in a lactation up to 305 days in milk. The records collected from the first to sixth parity were used for analysis. The herds included in the analysis had to record CM regularly and exhibit a minimal CM lactation incidence rate (LIR) of 5% in the recording period. The SAS software package, version 9.4 (SAS, 2012), was used to edit data and calculate correlations and basic statistics.
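The trait definition and the herd inclusion rule above can be illustrated with a minimal Python sketch. It is not the SAS editing code used in the study; the function names, the input layout (days in milk of each recorded case, and pairs of herd identifier and trait value) and the treatment of day 0 and of exactly 5% as inclusive are assumptions made for illustration.

```python
from collections import defaultdict


def lactation_cm_trait(case_days, max_dim=305):
    """Return 1 if at least one CM case falls within 0..max_dim days in milk, else 0.

    case_days: days in milk at which CM cases were recorded in one lactation
    (hypothetical input layout).
    """
    return int(any(0 <= d <= max_dim for d in case_days))


def herds_meeting_lir(lactations, min_lir=0.05):
    """Return the set of herds whose CM lactation incidence rate is >= min_lir.

    lactations: iterable of (herd_id, cm_trait) pairs, cm_trait being 0 or 1.
    """
    totals = defaultdict(int)
    cases = defaultdict(int)
    for herd, cm in lactations:
        totals[herd] += 1
        cases[herd] += cm
    return {herd for herd in totals if cases[herd] / totals[herd] >= min_lir}
```

A lactation with cases only after day 305 is scored 0 under this rule, and a herd with no recorded cases is excluded by the LIR threshold.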

Genotypes
A total of 4,969 Holstein sires and 35,814 Holstein cows and heifers were genotyped by the Illumina BovineSNP50 BeadChip V2 (Illumina Inc., San Diego, USA). Genotypes for 54,609 SNPs were available.
Categories of genotyped animals, their numbers and years of birth are listed in Table 1.
Three compositions (GM) of the genomic relationship matrix (G) were considered, according to which categories of genotyped animals were included in the genomic matrix (Table 2 and Figure 1): (GM_1) only genotyped bulls; (GM_2) genotyped bulls and genotyped cows with a health phenotype; (GM_3) genotyped bulls and genotyped cows with and without a health phenotype. The GM_1B and GM_2B analyses complement the respective G matrix layouts (Table 2). In GM_1B and GM_2B, the genotyped females without health phenotypes were added to the pedigree without considering their genotypes, for comparison between the analyses. Genotyped cows with health phenotypes are included in the GM_1 analysis because of their phenotypes; their genotypes, however, were not considered in GM_1.

Statistical methods
A mixed linear animal model was used to predict the GEBV for the CM trait.
Model equation:

$$y_{ijklm} = \mathit{parity\_age}_i + \mathit{herd}_j + \mathit{year\_season}_k + pe_l + a_m + e_{ijklm}$$

where $y_{ijklm}$ is clinical mastitis (CM) as an all-or-none trait in a lactation; $\mathit{parity\_age}_i$ is the effect of parity and age at calving i (15 levels); $\mathit{herd}_j$ is the effect of herd j (119 levels); $\mathit{year\_season}_k$ is the effect of calving year (2017-2021) and season (January-March; April-June; July-September; October-December) k (19 levels); $pe_l$ is the permanent environmental effect of cow l; $a_m$ is the additive genetic effect of animal m; and $e_{ijklm}$ is the residual effect.

The pedigree (Table 2) included 200,529 animals (GM_1; GM_2): 8,614 bulls and 191,915 cows; or 220,398 animals (GM_1B; GM_2B; GM_3): 8,614 bulls and 211,779 cows.
The single-step genomic method (SSGBLUP) used for genomic prediction yields GEBV for each animal included in the analysis (Misztal et al., 2009). It incorporated the genomic information by combining genetic relationships from the genomic data (G matrix) and pedigree data (A matrix). The BLUPF90 family programs by Misztal et al. (2018) were employed to predict the genomic breeding values and calculate their accuracy. We ran the program Renumf90 to prepare the renumbered datasets. The program Blupf90 was used to predict GEBV, and the program ACCGS to calculate the accuracy of GEBV.
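For orientation, the single-step literature usually expresses the combination of the G and A matrices through the inverse of a joint relationship matrix H. The block below gives that commonly cited form as a sketch, not as a restatement of the exact settings used in this study; the symbol $A_{22}$ (the pedigree relationship block for the genotyped animals) and the absence of any blending or scaling of G are assumptions not specified in the text above.

```latex
H^{-1} = A^{-1} +
\begin{bmatrix}
0 & 0 \\
0 & G^{-1} - A_{22}^{-1}
\end{bmatrix}
```

In this form, genomic information enters only through the block for genotyped animals, while non-genotyped animals are affected through their pedigree links to them.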
The program ACCGS calculates the accuracy of GEBV using the second approximation procedure of Misztal et al. (2013). This method involves the diagonal elements of the inverses of the genomic relationship matrix and the pedigree relationship matrix for genotyped animals, and the accuracies are corrected for inflation.
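Because this paper reports accuracies while some of the cited studies report reliabilities, it may help to recall how the two are conventionally related. The expressions below are the textbook definitions, assumed here for illustration and ignoring inbreeding; $PEV_i$ (prediction error variance of animal i) and $\sigma_a^2$ (additive genetic variance) are symbols introduced for this sketch and do not describe the internal steps of the approximation.

```latex
r_i = \sqrt{1 - \frac{PEV_i}{\sigma_a^2}}, \qquad
\text{reliability}_i = r_i^2
```

Under this convention, an accuracy of 0.51 corresponds to a reliability of about 0.26.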

Genomic breeding values for clinical mastitis
The principal advantage of ss-GBLUP is that it works directly with performance records and whole sets of animals included in the pedigree, simultaneously utilising genomic information of the animals. The estimates of genotyped animals are primarily influenced by using genomic information in breeding value prediction (Lee et al., 2017). The estimates of non-genotyped animals are influenced less, according to the degree of kinship to genotyped animals. The changes due to the use of genomic information appear in the predicted breeding values and in their accuracies. The changes in GEBV and its accuracy depend on the genetic structure of the genomic relationship matrix and the size of the reference population (VanRaden, 2008).
Table 3 presents the average breeding values of particular groups of animals for different analyses according to the composition of the G matrix. The average values were positive, ranging from 0.18 (young females without phenotypes) to 0.29 (females with phenotypes). The differences in the average GEBV are more noteworthy between the particular groups of genotyped animals than between the analyses. The lowest mean values occurred for bulls and females without phenotypes and the highest for cows with phenotypes (from 0.18 to 0.21). Both young animal groups showed lower mean values compared to older animals, especially cows (from 0.26 to 0.28). The analyses within three groups of genotyped animals (the bulls, young bulls and cows with phenotypes) yielded very similar means and SD: the bulls (0.019-0.023), young bulls (0.18-0.021), cows with phenotypes (0.025 to 0.026). Analysis GM_3 showed the highest SD within all groups. GM_3 means were the same as those for GM_1 except for young bulls born in 2021 and 2022. The GM_2 analysis, which included the genotyped bulls and the phenotyped and genotyped cows, exhibited the highest means and average SD compared to GM_1 and GM_3, except for young bulls.
When the genotyped females without the health phenotype were considered in the analysis but only in pedigree, without their genotypes included in the genomic matrix (GM_1B and GM_2B), the resulting means were the same when using G matrices GM_1 and GM_1B. At the same time, SD in GM_2B decreased in comparison with GM_2. The means and SD of GM_1B and GM_2B were very similar for the groups of genotyped females without the health phenotypes. For GM_3, when genotypes of those females are considered, the mean of GEBV decreased, and SD increased.

Acta fytotechn zootechn, 26, 2023(1): 46-54 http://www.acta.fapz.uniag.sk
As Gao et al. (2015) stated, the variance of GEBV was inflated by including genotyped females in genomic prediction, producing more bias than predictions with proven bulls only. The conclusion is that the standard deviation of the breeding values increases not simply by adding animals to the pedigree but by adding their genotypes. The GEBV values themselves are also affected. The direction of the change in GEBV depends on how the genomic matrix specifies the relationships among animals, especially to the phenotyped animals.
The primary aim of our analysis was to determine whether using the mixed genomic reference population (including genotyped cows and bulls) instead of the bull-only genomic reference population influenced the accuracy of individual genomic breeding values, especially for clinical mastitis. Following the literature (Pryce et al., 2012; Obšteter et al., 2019), we hypothesised that it is advantageous in genomic prediction to use genomic information on cows, not only bulls, because the GEBV accuracy will improve due to the increase in the reference population size.
The results on accuracy are presented in Table 4. We found the highest average accuracy of GEBV for the genotyped cows, not for the genotyped bulls, in analyses GM_2, GM_2B and GM_3, where the cows' genotypes are considered. We assume that the known phenotype plays a vital role in the final GEBV accuracy for cows. On the other hand, the SD of GEBV accuracy for bulls is significantly higher than for cows, indicating that the range of GEBV accuracy in bulls is much wider than that for cows. The maximal accuracy for genotyped bulls was 0.91, and the correlation between accuracy and the number of a bull's daughters was around 56%. This shows that the GEBV accuracies of bulls depend partially on the number of daughters of the bull.
The accuracy of genotyped cows with health phenotypes ranged from 0.06 to 0.51.These cows showed higher GEBV accuracy, probably because they were phenotyped.
For the genotyped bulls born in 2021 and 2022, the maximal accuracy was 0.37 and the minimal 0.08, while for genotyped cows born in 2021 and 2022, the maximal accuracy was 0.42 and the minimal 0.01. We can conclude that the cows benefited from including their genotypes in the prediction.
Comparing the GM analyses (Table 4), the highest average GEBV accuracy occurred for GM_2 or GM_2B, both for the young genotyped bulls born in 2021 and 2022 and for the cows with phenotypes born in 2010 and 2020. By contrast, for the group of genotyped bulls, a decrease occurred in GM_2. Generally, the average accuracy decreased in GM_3, where all genotyped animals were included, for all groups of genotyped animals.
Figures 2 and 3 show the average accuracy per birth year (from 2017 to 2022) for genotyped bulls (Figure 2) and genotyped cows (Figure 3), respectively. The increase in accuracy in GM_2 is also observable in genotyped bulls when results are presented per birth year. In line with Table 4, the lowest accuracy for genotyped bulls occurred for GM_3. In Figure 3, the average accuracy for genotyped cows for GM_1 represents the accuracy without the cows' genotypes. Therefore, the GM_1 average is much lower than those of GM_2 and GM_3. The lowest average accuracy for GM_3 in the figures corresponds with Table 4.
If the average accuracy is presented for all bulls (Figure 4) and all cows (Figure 5), the trends found for all bulls are the same as for genotyped bulls (Figure 2). Still, the averages are lower for all bulls than for the genotyped bulls, and the maximum accuracy occurred in 2018. In the calculation of average accuracy, the non-genotyped bulls with lower accuracy represent the larger share of bulls. Figure 5 shows an interesting picture where trends in accuracy for all cows and heifers are captured. The accuracy for GM_2 and GM_3 (0.19-0.24) is lower than for the genotyped females in Figure 3 (0.27-0.33). GM_1 is again the lowest of all layouts. In Figure 4, GM_2 and GM_3 report the same values until 2019. The reason for the lower accuracy averages in 2022 for GM_2 compared with GM_3 is the higher number of genotyped females in GM_3. That increase in the number of genotyped females caused the rise of the GM_3 average accuracy in 2022, as shown in Figure 5.
Figures 3 and 4 show that the inclusion of genotyped animals in genomic evaluation influences the accuracy of GEBV for non-genotyped animals only slightly.
The positive effect on the individual GEBV accuracy mainly manifests for the genotyped animals. When cows were included in the genomic matrix, Nguyen et al. (2016) found some improvement in genomic prediction accuracy, expressed as a correlation between predictions for validation sires, for Holsteins but not for Jerseys. They attribute the small increase in validation accuracy to the low number of genotyped cows. In contrast to the analysis we present, the number of cows in their genomic reference population was lower than the number of bulls.
Their analysis focused on production traits, whereas we analysed health traits. Including cows' genotypes does not always increase reliability, as Cooper et al. (2015) found in US Holsteins when they added genotyped cows to the reference population. The reason probably was the large number of genotyped sires in the genomic reference population. Dehnavi et al. (2018) found an increase in the accuracy of genomic prediction of about 0.166 for low heritable traits due to adding cow genotypic and phenotypic information to the bulls' reference population. A slightly higher increase occurred for low heritable traits than for production traits. Jenko et al. (2017) verified that adding cows to the bulls' genomic reference population increased the accuracy of genomic prediction for production traits and calving interval in Guernsey cattle.
The increases in the correlation between the GBLUP and BLUP approaches were 0.060 ± 0.015 for milk, 0.036 ± 0.019 for fat, 0.033 ± 0.015 for protein, and 0.024 ± 0.024 for calving interval. For mastitis, Gao et al. (2015) found that the reliability of genomic prediction, expressed as the squared Pearson correlation coefficient between GEBV and deregressed proof divided by the average reliability of the deregressed proofs of the validation bulls, gained 5.1 percentage points when all genotyped cows were included in the genomic matrix. Pryce et al. (2012) point out that genotyped females may be included in the reference population of genotyped animals, but only cautiously, considering the preferential treatment of cows.
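The validation measure attributed to Gao et al. (2015) above, the squared Pearson correlation between GEBV and deregressed proofs divided by the average reliability of those proofs, can be written out as a short Python sketch. The function name and argument layout are hypothetical, and the sketch follows only the verbal definition given here, not any published code.

```python
import math


def validation_reliability(gebv, drp, drp_reliability):
    """Squared Pearson correlation of GEBV and deregressed proofs (DRP),
    divided by the mean reliability of the DRP of the validation animals.

    All three arguments are equal-length sequences of floats (assumed layout).
    """
    n = len(gebv)
    mean_g = sum(gebv) / n
    mean_d = sum(drp) / n
    # Sums of squares and cross-products around the means
    sxy = sum((g - mean_g) * (d - mean_d) for g, d in zip(gebv, drp))
    sxx = sum((g - mean_g) ** 2 for g in gebv)
    syy = sum((d - mean_d) ** 2 for d in drp)
    r = sxy / math.sqrt(sxx * syy)
    mean_rel = sum(drp_reliability) / len(drp_reliability)
    return r * r / mean_rel
```

Dividing by the mean reliability of the deregressed proofs adjusts for the fact that the proofs themselves are imperfect measures of the true breeding values, so the ratio can exceed the raw squared correlation.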
Nonetheless, the inclusion of females in the genomic reference population is beneficial; e.g., a rise of 8% in bulls' reliabilities has been found due to adding 10,000 genotyped cows to the reference population (Pryce et al., 2012). Buch et al. (2012) showed the positive impact of including cows in the genomic reference population on the accuracy of GEBV, especially for functional traits under small-scale phenotyping, which corresponds to conditions in the current population of Czech Holstein cattle and selection for health traits. For Czech Holstein cattle, where disease monitoring is limited to about a third of the population, we can, according to Buch et al. (2012), expect the reliability of the estimates to increase until the size of the phenotyped population surges.
We can explain why the accuracy of genomic breeding values decreased after the inclusion of non-phenotyped genotyped females in the prediction. These females were daughters or relatives of the phenotyped cows, but they added no extra information. The only information they provided was about their genotype and, thus, their relatedness to other animals in the evaluation. The results presented lead to the conclusion that there is no need to include in the calculation of breeding values those genotyped cows that will never again have a phenotype or be candidates for selection.

Conclusions
The results confirm that expanding the genomic reference population to include a group of genotyped cows with phenotypes increased the individual reliability of GEBV for CM. Using genotyped cows with phenotypes is one route to successful genomic selection for clinical mastitis in the population of Czech Holstein cattle. With an average accuracy for young bulls and heifers of 0.21 to 0.28, we can conclude that genomic selection is a promising approach to accelerate genetic gains for clinical mastitis resistance. A further increase in accuracy can be expected, especially for young bulls, after the monitoring period of health traits is extended and a larger pool of historical data is accumulated, as the number of monitored herds and dairy cows increases, and possibly after conversion of the presently applied single-trait linear animal model to a multi-trait model using information on somatic cells and exterior.

Figure 5
Average accuracy of genomic breeding values for all cows, genotypes of cows involved in GM_2 and GM_3

Table 2
Description of the analyses, number of genotyped animals, pedigrees
* only in pedigree, without genotype
GM_1: only genotyped bulls; GM_2: genotyped bulls and genotyped cows with a health phenotype; GM_3: genotyped bulls and genotyped cows with and without a health phenotype. The GM_1B and GM_2B analyses complement the respective G matrix layouts; the genotyped females without health phenotypes were added to the pedigree without considering their genotypes, for comparison between the analyses.
GM_3: Genotyped bulls; phenotyped and genotyped cows; genotyped and non-phenotyped females
GM_2: Genotyped bulls; phenotyped and genotyped cows
GM_1b: Genotyped and non-phenotyped females only in pedigree

Slovak University of Agriculture in Nitra, Faculty of Agrobiology and Food Resources

$$G = ZDZ' \qquad (1)$$

where D is diagonal with $D_{ii} = 1/\{m[2p_i(1 - p_i)]\}$, as in Amin et al. (2007) and Leuttenger et al. (2003). To estimate genomic relationship and inbreeding, VanRaden (2008) introduced M, the matrix that specifies which marker alleles each individual inherited. The dimensions of M are the number of individuals (n) by the number of loci (m). Equations can include marker information using the n × n matrix MM' or the m × m matrix M'M. If elements of M are set to −1, 0, and 1 for the homozygote, heterozygote, and other homozygote, respectively, diagonals of MM' count the number of homozygous loci for each individual, and off-diagonals measure the number of alleles shared by relatives. In contrast, diagonals of M'M count the number of homozygous individuals for each locus, and off-diagonals measure the number of times alleles at different loci were inherited by the same individual. Let the frequency of the second allele at locus i be $p_i$, and let P contain allele frequencies expressed as a difference from 0.5 and multiplied by 2, such that column i of P is $2(p_i − 0.5)$. Subtracting P from M gives the incidence matrix Z, which sets the mean values of the allele effects to 0.
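The construction described above (M coded −1, 0, 1; P built from allele frequencies; Z = M − P; a diagonal weighting D) can be sketched in plain Python. This is an illustration only: it assumes the weighted form G = ZDZ' with D_ii = 1/(m[2p_i(1 − p_i)]), the function name is hypothetical, and it says nothing about how the BLUPF90 programs build G internally.

```python
def genomic_relationship(M, p):
    """Build G = Z D Z' from a marker matrix and allele frequencies.

    M: n x m list of lists coded -1, 0, 1 (homozygote, heterozygote,
       other homozygote).
    p: list of m frequencies of the second allele; each must satisfy
       0 < p_i < 1, otherwise the weight is undefined (assumption: such
       monomorphic loci are removed beforehand).
    """
    n, m = len(M), len(p)
    # Z = M - P, where column i of P is 2 * (p_i - 0.5)
    Z = [[M[a][i] - 2.0 * (p[i] - 0.5) for i in range(m)] for a in range(n)]
    # Diagonal of D: 1 / (m * 2 * p_i * (1 - p_i))
    d = [1.0 / (m * 2.0 * p[i] * (1.0 - p[i])) for i in range(m)]
    # G[a][b] = sum over loci of Z[a][i] * D_ii * Z[b][i]
    return [
        [sum(Z[a][i] * d[i] * Z[b][i] for i in range(m)) for b in range(n)]
        for a in range(n)
    ]
```

With this weighting each locus is scaled by its own expected variance, so rare alleles shared between two animals contribute more to their relationship than common ones.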

Table 3
The means of genomic breeding values for the particular groups of animals

Table 4
Average accuracy of genomic breeding values for the groups of animals
* only in pedigree, without genotype
GM_1: only genotyped bulls; GM_2: genotyped bulls and genotyped cows with a health phenotype; GM_3: genotyped bulls and genotyped cows with and without a health phenotype. The GM_1B and GM_2B analyses complement the respective G matrix layouts; the genotyped females without health phenotypes were added to the pedigree without considering their genotypes, for comparison between the analyses.