Breeding values prediction for clinical mastitis in Czech Holstein cattle

This study aims to genetically evaluate clinical mastitis (CM) in Holstein cattle using a two-trait repeatability animal model with the average lactation somatic cell score (LSCS) as an indicator trait of mastitis. The data set included 21,786 Holsteins with 29,110 lactations in 59 herds and with a calving date between 2015 and 2019. CM was considered as an all-or-none trait (values 0 or 1) in the period from calving to 305 days in milk, and the LSCS was obtained by logarithmic transformation of the average of the individual test-day records for somatic cell count over lactation. Heritability of CM was estimated using a single-trait repeatability animal model, whereas the genetic correlation between CM and LSCS was assessed through a two-trait repeatability animal model. Fixed effects included in the analyses were parity-age and herd-year-season, and the random effects were the permanent environment and the animal. The estimated (co)variance matrices were used to estimate breeding values with both the single-trait (only CM) and the two-trait model (CM and LSCS), including genomic prediction. Only genotyped sires formed the reference population for the single-step genomic evaluation. The heritability for CM was 0.04 in the single-trait and 0.05 in the two-trait analysis. Genetic correlation between CM and LSCS was 0.80. Use of the two-trait model had a strong influence on reliability. The reliability increased for cows with records as well as for the genotyped sires. This study indicates that the two-trait analysis of CM and LSCS is feasible and improves the reliability of the estimated breeding values.


Introduction
Clinical mastitis (CM) in dairy cows causes noticeable deterioration in the production and reproduction of animals (Oltenacu and Broom, 2010) as well as considerable worsening of cow welfare. CM is also often the reason for culling. According to Kvapilík et al. (2016), udder diseases are the most common reasons for involuntary culling of dairy cows in the Czech Republic. Rilanto et al. (2020) reported that udder disease is the second most common reason for culling, after foot and claw disorders. Improved cow treatment and herd management can decrease CM incidence in dairy herds. Still, selection of animals based on genetic evaluation should be used to strengthen heritable resistance to udder diseases. Breeding for a reduced rate of CM in dairy herds can be achieved by direct selection based on CM records combined with indirect selection using genetically correlated traits. Nevertheless, CM generally shows low heritability (up to 10%) (Martin et al., 2018). Neuenschwander et al. (2012) claimed that because of the low heritability of CM, only slow genetic improvement can be expected unless CM is strongly weighted in the selection index. Heringstad et al. (2001) have confirmed the positive results of direct genetic selection on CM traits.
The somatic cell count (SCC) and its logarithmic expression, the somatic cell score (SCS), are widely used as indicator traits for CM and subclinical mastitis because of their strong phenotypic and genetic relationships with udder health and their higher heritability compared with CM. Rupp and Boichard (2003) estimated a heritability around 0.15, with a range from 0.10 to 0.18, for the average lactation somatic cell score (LSCS) obtained by averaging the individual test-day records over lactation. A positive genetic correlation of 0.70 between CM and LSCS was published by Carlén et al. (2004). Pérez-Cabal and Charfeddine (2013) reported genetic correlations between LSCS and CM from 0.76 to 0.85. Heringstad et al. (2006) stated that, given the high genetic relationship, LSCS is a valid indicator of CM, but it remains only an indirect selection trait.
The greatest response to selection against CM could be expected if CM, as the direct trait, were evaluated jointly with an indirect trait such as LSCS. The multi-trait model approach is preferable to the single-trait model because of the higher accuracy of the breeding values (Schaeffer, 1984) and better model prediction performance (Negussie et al., 2005).
Heritability estimates for CM in Czech Holstein cattle were 0.11 (Wolf et al., 2010), 0.07 (Zavadilová et al., 2015), and 0.09 to 0.10 (Zavadilová et al., 2017). Kašná et al. (2018) provided further estimates of genetic parameters and breeding values based on the available data and obtained heritability from 0.08 to 0.11 for CM. Zavadilová et al. (2015) estimated genetic correlations from 0.79 to 0.83 between CM and LSCS. The same authors estimated an LSCS heritability of 0.23.
The aim of this study was to perform a genetic evaluation of CM in Czech Holstein cattle using the two-trait linear animal model with LSCS as an indicator trait of CM.

Materials
Records of CM were collected by farmers and registered voluntarily in the national cattle health monitoring system, called "The Diary of Diseases and Medication" (The Diary; Šlosárková et al., 2016). In the Czech Republic, this recording system was implemented in August 2018 after a one-year trial period. It consists of an on-line health recording form for farmers and a simplified key of diagnoses based on the recommendations of the International Committee for Animal Recording (ICAR). The data for cows' lactation traits, such as the date of calving, SCC, length of lactation, and parity, were extracted from the official database of the Holstein Cattle Breeders Association of the Czech Republic provided by the Czech Moravian Breeders' Corporation, Inc. A minimum of 5 daughters per sire, 50 cows per herd, and 10 contemporaries per herd-year-season was required. Only cows with a lactation length of at least 240 days were included in the analysis. The requirement that each cow have a first-lactation record was not applied because of the relatively short period of data collection. SCC was transformed to LSCS according to the following formula (Ali and Shook, 1980): LSCS = 3 + log2(SCC/100). After editing, the dataset included 21,786 Holstein cows with 29,110 lactations in 59 herds and with a calving date between 2015 and 2019. The basic statistics of the data are shown in Table 1.
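The transformation above can be illustrated with a short Python sketch. It is not the authors' SAS code. It assumes that SCC is expressed in thousands of cells per mL (the unit under which the divisor of 100 is normally used) and that the test-day SCC records are averaged before the logarithm is taken, as the abstract describes. The function name lactation_scs is hypothetical.

```python
import math


def lactation_scs(test_day_scc):
    """Return LSCS = 3 + log2(mean SCC / 100) for one lactation.

    test_day_scc: test-day somatic cell counts, assumed to be in
    thousands of cells per mL (an assumption, not stated in the text).
    """
    if not test_day_scc:
        raise ValueError("at least one test-day record is required")
    # Average the test-day records first, then transform the mean.
    mean_scc = sum(test_day_scc) / len(test_day_scc)
    return 3 + math.log2(mean_scc / 100)


print(lactation_scs([100, 100, 100]))  # mean SCC 100 -> 3.0
print(lactation_scs([150, 250]))       # mean SCC 200 -> 4.0
```

Each doubling of the mean SCC raises the score by one unit, which is why the score is closer to a normal distribution than the raw count.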

Statistical methods
The following repeatability linear animal model was employed to estimate heritability for CM (single-trait analysis) or genetic correlation between CM and LSCS (two-trait analysis): $y_{ijkl} = \mathrm{HYS}_i + \mathrm{Pa\_age}_j + \mathrm{pe}_k + a_l + e_{ijkl}$, where $y_{ijkl}$ is CM considered as an all-or-none trait (0 = no CM case; 1 = at least 1 CM case) in the period from calving to 305 days in milk, and/or LSCS; $\mathrm{HYS}_i$ is the fixed combined effect of the herd (59 levels), year of calving (5 levels, 2015 to 2019), and the season of calving (4 levels: January, February, March; April, May, June; July, August, September; October, November, December), for a total of 546 levels; $\mathrm{Pa\_age}_j$ is the fixed effect of parity-age class (15 levels; first, second, third, fourth, and fifth + sixth parity, and 3 age classes per parity); $\mathrm{pe}_k$ is the random permanent environmental effect of the cow (21,786 levels); $a_l$ is the random additive genetic effect of the animal (84,205 levels); and $e_{ijkl}$ is the random residual effect.
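For a repeatability model with these three random terms, heritability and repeatability are conventionally obtained as ratios of the variance components. The Python sketch below shows these standard ratios; the function name and the input values are illustrative assumptions and are not the estimates from this study.

```python
def variance_ratios(var_a, var_pe, var_e):
    """Return (heritability, repeatability) from variance components.

    var_a:  additive genetic variance
    var_pe: permanent environmental variance
    var_e:  residual variance
    """
    total = var_a + var_pe + var_e
    if total <= 0:
        raise ValueError("total variance must be positive")
    heritability = var_a / total
    repeatability = (var_a + var_pe) / total
    return heritability, repeatability


# Toy values chosen only to make the arithmetic easy to follow.
h2, rep = variance_ratios(var_a=1.0, var_pe=1.0, var_e=8.0)
print(h2, rep)  # 0.1 0.2
```

With a fixed total variance, any shift of variance from the permanent environmental term to the additive genetic term raises heritability but leaves repeatability unchanged.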
Estimated breeding values (EBV) were predicted through the same single-trait (only CM) and two-trait models (CM and LSCS) used to estimate genetic parameters. The pedigree file contained 84,205 animals (4 generations were traced back). Sires (n = 4,568) were genotyped using the Illumina BovineSNP50 Bead chip (Illumina, San Diego, CA, USA).
A single-step procedure was applied (Aguilar et al., 2010; Christensen and Lund, 2010) for genomic breeding value estimation. A genomic relationship matrix (G) was calculated according to deviations from the averages of observed allele frequencies and standardised by dividing by the average value of the diagonal of G, so that the average of the diagonal elements was 1 (Forni et al., 2011). The elements of the additive pedigree relationship matrix for genotyped animals (A22) and the elements of G have the same average (Vitezica et al., 2011). The total number of effective SNPs used in the calculation of the G matrix was 38,883, that of effective animals was 4,380, and the total number of parent-progeny evaluations was 2,992.
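The construction of G described above can be sketched in plain Python for a small example. This is a simplified illustration, not the BLUPF90 implementation: it assumes genotypes coded as 0, 1, or 2 copies of one allele, centres each SNP by twice its observed allele frequency, and scales the cross-product by its mean diagonal. The subsequent tuning of G to A22 and any quality control are omitted, and the function name is hypothetical.

```python
def genomic_relationship(genotypes):
    """Build a genomic relationship matrix scaled to a mean diagonal of 1.

    genotypes: list of rows (one per animal), each a list of SNP
    genotypes coded 0/1/2 (assumed coding).
    """
    n = len(genotypes)
    m = len(genotypes[0])
    # Observed allele frequency of each SNP.
    freqs = [sum(row[j] for row in genotypes) / (2 * n) for j in range(m)]
    # Deviations of genotypes from twice the allele frequency.
    z = [[row[j] - 2 * freqs[j] for j in range(m)] for row in genotypes]
    # Cross-product Z Z'.
    zz = [[sum(z[a][j] * z[b][j] for j in range(m)) for b in range(n)]
          for a in range(n)]
    mean_diag = sum(zz[i][i] for i in range(n)) / n
    if mean_diag == 0:
        raise ValueError("no genotype variation; G is undefined")
    # Standardise by the average diagonal element.
    return [[value / mean_diag for value in row] for row in zz]


example = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [1, 2, 1]]
g = genomic_relationship(example)
print(round(sum(g[i][i] for i in range(4)) / 4, 6))  # 1.0
```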
The relative EBV (REBV, in %) were standardised separately for each analysis so that the average of the predicted breeding values equalled 100% and the SD equalled 12%. Higher relative breeding values mean a favourable value for CM or LSCS, i.e. higher resistance to disease.
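A minimal Python sketch of such a standardisation is given below. The text does not give the exact formula, so two points are assumptions: the EBV are scaled by their own SD within the analysis (here the population SD), and the sign is reversed so that a lower EBV for CM or LSCS maps to a higher REBV. The function name is hypothetical.

```python
import statistics


def relative_ebv(ebvs, mean_rebv=100.0, sd_rebv=12.0):
    """Rescale EBV to a mean of 100 and an SD of 12, with reversed sign."""
    mean_ebv = statistics.fmean(ebvs)
    sd_ebv = statistics.pstdev(ebvs)  # population SD (assumed choice)
    if sd_ebv == 0:
        raise ValueError("EBV have no variation")
    # Minus sign: lower EBV (less disease) gives a higher, favourable REBV.
    return [mean_rebv - sd_rebv * (ebv - mean_ebv) / sd_ebv for ebv in ebvs]


# Hypothetical EBV on the observed CM scale.
rebv = relative_ebv([-0.02, 0.0, 0.01, 0.03])
print([round(value, 1) for value in rebv])
```

The animal with the lowest EBV for CM receives the highest REBV, which matches the stated interpretation that higher values indicate higher resistance.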
Variance and covariance components, EBV, and their reliability (REL) were estimated with the BLUPF90 family of programs: RENUMF90 for the renumbering of effects, pedigree file, and incorporation of the genomic matrix; REMLF90 for variance components estimation; BLUPF90 for EBV and genomic breeding value estimation; and ACCf90 for reliability calculation (Misztal et al., 2018). The SAS software package, version 9.4 (SAS Institute Inc., Cary, NC, USA) was used for data editing and calculation of basic statistics.
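For orientation, the textbook definition of reliability relates the prediction error variance (PEV) of an EBV to the additive genetic variance. The Python sketch below shows only this definition, ignoring inbreeding; it does not reproduce the approximation used by the software, and the function name and values are illustrative assumptions.

```python
def reliability(pev, var_a):
    """Return REL = 1 - PEV / additive genetic variance."""
    if var_a <= 0:
        raise ValueError("additive genetic variance must be positive")
    return 1.0 - pev / var_a


# Toy values: a PEV equal to 75% of the genetic variance gives REL = 0.25.
print(reliability(pev=0.75, var_a=1.0))  # 0.25
```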

Results and discussion
The lactation incidence rate for CM across the lactations was 23%, and it increased with parity, from 16% for primiparous cows to 33% for animals of the fifth parity (Table 2). The increase in lactation incidence rate with parity agrees with Mrode et al. (2012), although the incidence rates themselves were higher in our study. The variances, heritabilities, and genetic correlations estimated for CM and LSCS are given in Table 3. The additive genetic and residual variances for CM were higher, and the permanent environmental variance was lower, in the two-trait than in the single-trait model. The total variance was the same in both models. Consequently, heritability was higher in the two-trait than in the single-trait model. Genetic correlation between CM and LSCS was 0.80, in agreement with Carlén et al. (2004) and Ødegård et al. (2004). Similarly, Pérez-Cabal and Charfeddine (2013) reported a genetic correlation of 0.85 between SCS305 and CM as a binary trait. Multi-trait models require accurate estimates of genetic parameters. Schaeffer (1984) pointed out that incorrect estimation of the genetic correlation between traits would lead to higher prediction error variance, mainly in traits with low heritability. The estimated (co)variance matrices were used for prediction of breeding values for CM and their REL. The resulting trends of REBV and REL averaged by year of birth of cows and sires are in Figure 1 and Figure 2, respectively. The REBV for different traits and models are mutually comparable because they are based on the average of EBV for each model prediction. Trends of REBV are very similar for the single-trait and the two-trait model as well as for CM and LSCS. The averages are very close to 100%. The trends for cows and sires for CM are decreasing. However, the genetic trends for LSCS are slightly increasing, especially in sires, where they eventually exceed 100%.
This pattern is likely a consequence of the selection for udder health based on SCS that has been employed in the Czech Holstein population for the last three decades. Despite this, the trends for CM did not change noticeably when the two-trait model was used instead of the single-trait model for prediction. The change of model had a strong influence on REL. The average REL by birth year increased in cows as well as in sires, and the values from the two-trait model were higher than those from the single-trait model.
Similar results for REL of CM EBV are presented in Table 4, which gives the average reliabilities of CM EBV from the single-trait and the two-trait models. REL is markedly higher when the two-trait model is used instead of the single-trait model, regardless of the sex of the animals or the use of genomic prediction. For all animals and for cows with records, the average REL from the two-trait model was twice that from the single-trait model. The differences due to genomic evaluation were minimal; in cows in particular, no change in REL was found. In contrast, the average REL increased for genotyped sires as a consequence of using either the two-trait model or genomic evaluation. The increase was the same (9 percentage points) for the single-trait and the two-trait model or for the conventional and the genomic evaluation. The different effects of genomic prediction across groups of animals can be explained by the selection of genotyped animals for the genomic reference population and by the relationship of other animals to the genotyped animals. In the present study, only genotyped sires constituted the reference population. Therefore, those sires were the most influenced by genomic evaluation. The positive effect of the two-trait model was found for every animal but mostly for cows with records. As Buch et al. (2011) stated, the phenotype of the animal for one trait helps predict the Mendelian sampling term for the second trait in the model.

Correlations between EBV or REL for CM predicted by different models are presented in Figure 3. The lower the correlation, the greater the difference between the breeding value predictions or REL estimates caused by the change of model. The correlations are compared for three groups: all animals in the analysis, genotyped sires, and cows with records. Regarding EBV, the correlations between EBV predicted by the single-trait model and EBV predicted by the two-trait model were very similar, irrespective of conventional or genomic prediction. When both predictions came from the same model (both single-trait or both two-trait), the correlations were higher than in the previous comparison. The lowest correlations occurred between EBV when the employed models differed in both characteristics, the number of traits and genomic prediction. The correlations were similar for all animals and cows with records. We can conclude that the changes in EBV were more substantial when the two-trait model was used instead of the single-trait model and genomic prediction instead of conventional prediction. For the genotyped sires, the results are different because of the greater effect of genomic prediction on their EBV. Therefore, the correlations are substantially lower between EBV predicted by the conventional model and EBV predicted by the genomic model. The impact of model change on REL was different in each analysed group. The lowest correlations, i.e. the largest changes in REL, occurred for genotyped sires when the two-trait model was used instead of the single-trait model. In agreement with Table 4, the REL of genotyped sires was the most influenced by genomic prediction, but the REL of the other groups, i.e. all animals and cows with records, was affected less by genomics and more by the number of traits in the model.
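The comparisons above rest on correlations between two sets of predictions for the same animals. The text does not state which correlation coefficient was used; the Python sketch below assumes the Pearson correlation and uses hypothetical EBV values.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long lists of values."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with >= 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("one of the inputs has no variation")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical EBV for the same five animals from two models.
ebv_single_trait = [0.010, -0.020, 0.030, 0.000, -0.010]
ebv_two_trait = [0.015, -0.010, 0.025, -0.005, -0.020]
print(round(pearson(ebv_single_trait, ebv_two_trait), 3))
```

A correlation close to 1 indicates that the two models rank the animals almost identically, while lower values indicate re-ranking caused by the change of model.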

Conclusions
This study indicates that the two-trait analysis of CM and LSCS using the repeatability linear animal model based on whole lactations is feasible and enables an increase in the REL of EBV. The genomic prediction improves the resulting REL in the genotyped animals and their close relatives. The obtained genetic parameters were comparable with other studies.