Genome-wide characterisation of regions under intense selection based on runs of homozygosity in Charolais cattle

In this study, 68 genotyped purebred Charolais cows and bulls were used to determine runs of homozygosity (ROH) and to evaluate the effect of selection on genome structure. The ROH analysis was performed on 41,153 SNPs, and a homozygous segment was required to contain at least 15 consecutive SNPs. The aim was to determine whether genomic regions with a high frequency of SNPs in ROH carry signatures of selection. The most significant outlier SNPs were found on BTA2, 5, 7 and 19 (11 regions), with a total ROH length of 15.85 Mb. These regions contained genes involved in various biological processes associated with immune system function, growth, reproduction and metabolism. Various quantitative trait loci (QTLs) affecting meat production and reproduction traits have been identified in these regions. Overall, the results suggest that the Charolais genome carries selection signatures reflecting the improvement of meat production and reproduction in accordance with the breeding objectives.


Introduction
Charolais cattle are a popular beef breed worldwide and, in terms of animal numbers, one of the most numerous beef breeds in Slovakia, where 10,225 cows and 283 bulls were recorded in 2019. The breed originated in France and is characterised by a robust and muscular body, an excellent feed conversion rate and calving ease. The main breeding goal is to create a population of modern animals that combine excellent meat production with good adaptability to the natural environment, excellent maternal characteristics, good grazing ability and hornlessness (Jahuey-Martínet et al., 2019; The Breeding Services of the Slovak Republic, s.e., 2019; Pomichal, 2009).
Selection leaves footprints in the genome, which are characterised by high genetic differentiation between breeds or by an evident decrease in genetic diversity in genomic regions associated with traits that have undergone directional selection. Identifying the selection signals involved in phenotypic variation is important for understanding the evolutionary processes and mechanisms that underlie traits subjected to natural or artificial selection (Mastrangelo et al., 2020).
The purpose of analysing selection signals is to identify genomic regions or loci that show deviations from neutrality. Genomic regions affected by intensive selection show reduced genetic diversity and high homozygosity, and such regions can be called runs of homozygosity (ROH) islands. Current genomic tools enable the genotyping of thousands of polymorphisms and thus make it possible to identify long stretches of homozygous genotypes in the genome. Homozygous segments in the genome, i.e. ROH, arise when an individual inherits two identical haplotypes from its parents (Onzima et al., 2018; Szmatoła et al., 2020; Shi et al., 2020; Mastrangelo et al., 2019; Purfield et al., 2017).
Genome-wide analysis of ROH may be performed to understand population history, calculate genomic inbreeding, decipher the genetic architecture of complex traits and diseases, as well as to identify genes associated with economically important traits (Dixit et al., 2020).
The purpose of this study was to characterise unique regions potentially under selection and to identify genes and metabolic pathways associated with traits of economic interest in Charolais cattle.

Material and methods
A total of 45 purebred cows and 23 bulls of Charolais cattle were genotyped using the International Dairy and Beef chip. Quality control of the genotype data was performed in PLINK v1.9 (Chang et al., 2015) following the settings of Moravčíková et al. (2018): call rate for SNPs and for individuals > 90 %, MAF > 0.01.
Runs of homozygosity were computed using the detectRUNS package in the R software environment (Biscarini et al., 2019). The criteria used to define ROH were as follows: minimum number of homozygous SNPs in a window = 15, maximum number of heterozygous SNPs = 0, maximum number of missing genotypes = 1. The frequency of SNPs in ROH longer than 4 Mb was calculated. Outlier SNPs, taken to denote genomic regions under intense selection, were determined from the upper quartile of the boxplot, and the threshold was set to 8 % of SNPs in ROH. For each region affected by selection, a list of genes was created using Genome Data Viewer with the ARS-UCD1.2 assembly, and metabolic pathways were identified with the web-based toolkit WebGestalt (http://www.webgestalt.org/).
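The quality-control thresholds above can be illustrated with a minimal Python sketch. It is not the PLINK implementation used in the study: the genotype coding (0/1/2 allele counts, `None` for a missing call), the function names and the order of filtering (animals first, then SNPs) are assumptions made for illustration only.

```python
from typing import List, Optional, Tuple

# Assumed coding: 0, 1 or 2 copies of one allele; None marks a missing call.
Genotype = Optional[int]


def call_rate(calls: List[Genotype]) -> float:
    """Fraction of non-missing genotype calls."""
    return sum(g is not None for g in calls) / len(calls)


def minor_allele_frequency(column: List[Genotype]) -> float:
    """MAF of one SNP, computed from non-missing calls only."""
    called = [g for g in column if g is not None]
    if not called:
        return 0.0
    freq = sum(called) / (2 * len(called))
    return min(freq, 1.0 - freq)


def quality_control(
    genotypes: List[List[Genotype]],
    min_call_rate: float = 0.90,
    min_maf: float = 0.01,
) -> Tuple[List[int], List[int]]:
    """genotypes[i][j] is the call for animal i at SNP j.

    Returns the indices of animals and SNPs that pass the filters.
    Filtering animals before SNPs is an assumption of this sketch.
    """
    kept_animals = [
        i for i, row in enumerate(genotypes) if call_rate(row) > min_call_rate
    ]
    kept_snps = []
    for j in range(len(genotypes[0])):
        column = [genotypes[i][j] for i in kept_animals]
        if (
            column
            and call_rate(column) > min_call_rate
            and minor_allele_frequency(column) > min_maf
        ):
            kept_snps.append(j)
    return kept_animals, kept_snps
```

With the study's thresholds as defaults, a monomorphic SNP (MAF = 0) or an animal with more than 10 % missing calls is dropped.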
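As a rough illustration of these criteria, the following Python sketch scans one animal's genotypes along one chromosome for consecutive homozygous runs, then computes the percentage of animals in which each SNP falls inside an ROH. It does not reproduce detectRUNS: the function names, the genotype coding (1 = heterozygous, `None` = missing) and the handling of missing calls are assumptions. The 8 % threshold is taken as given and is not derived from a boxplot here.

```python
from typing import List, Optional, Tuple

Run = Tuple[int, int]  # inclusive (first SNP index, last SNP index)


def find_roh(
    genotypes: List[Optional[int]],
    positions: List[int],
    min_snps: int = 15,
    max_missing: int = 1,
    min_length_bp: int = 4_000_000,
) -> List[Run]:
    """Consecutive-run scan; no heterozygous call is allowed inside a run."""
    runs: List[Run] = []
    start: Optional[int] = None
    missing = 0
    # A trailing heterozygous sentinel closes any run still open at the end.
    for i, g in enumerate(list(genotypes) + [1]):
        # A run is broken by a heterozygous call, or by a missing call that
        # either precedes any run or exceeds the allowed number of missing calls.
        breaks = g == 1 or (
            g is None and (start is None or missing >= max_missing)
        )
        if breaks:
            if start is not None:
                end = i - 1
                long_enough = positions[end] - positions[start] > min_length_bp
                if end - start + 1 >= min_snps and long_enough:
                    runs.append((start, end))
            start, missing = None, 0
        else:
            if start is None:
                start = i
            if g is None:
                missing += 1
    return runs


def snp_in_roh_percent(roh_per_animal: List[List[Run]], n_snps: int) -> List[float]:
    """Percentage of animals in which each SNP lies inside an ROH."""
    counts = [0] * n_snps
    for runs in roh_per_animal:
        for first, last in runs:
            for j in range(first, last + 1):
                counts[j] += 1
    return [100.0 * c / len(roh_per_animal) for c in counts]


def outlier_snps(percent: List[float], threshold: float = 8.0) -> List[int]:
    """Indices of SNPs whose ROH frequency exceeds the threshold (in %)."""
    return [j for j, p in enumerate(percent) if p > threshold]
```

In this sketch the 4 Mb length filter is applied when a run is closed, so only ROH longer than 4 Mb contribute to the per-SNP frequency.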

Results and discussion
After quality control, the final dataset included 41,153 SNPs covering an overall genome length of 2,503 Mb, with an average distance between adjacent loci of 60.87 ± 63.45 kb. The minimum distance between SNPs was 0.028 kb, and the maximum distance was 2,141.34 kb. Selection signals were identified on all autosomes, and four chromosomes on which selection focused on traits of economic importance were chosen for the subsequent analysis (Fig. 1). The analysis of homozygous segments within the Charolais genome made it possible to determine genomic regions with a high frequency of SNPs in ROH. The average length of ROH was 5.34 ± 4.43 Mb. The most significant outlier SNPs were located in 11 regions on four chromosomes (BTA2, 5, 7 and 19), with an overall ROH length of 15.85 Mb (Table 1).
Detailed examination of the identified signals showed that the affected genomic regions have been subject to intensive selection for meat production, reproduction and milk components. The strongest selection signal was observed on BTA5, in a region containing 74 protein-coding genes and various QTLs affecting mainly reproduction and milk composition. The genes HOXC4, HOXC5, HOXC6 and HOXC8 are involved in skeletal system development (Gaudet et al., 2011). On BTA2 (6.22-6.23 Mb), the MSTN gene was found; it is known as a regulator of muscle growth (Trukhachev et al., 2015). The FAF2 gene was identified on BTA7 (39.41-39.47 Mb). According to Patel et al. (2017), this gene is associated with reproductive function in cattle. The NLRP3 gene, which is involved in the innate immune system, was also observed on BTA7 (42.18-42.23 Mb) (Mallikarjunappa et al., 2018). The CA10 gene, which is exclusively expressed in the brain and highly conserved across animal species, was found on BTA19 (0.60-1.50 Mb) (Aspatwar et al., 2014).
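One way to go from individual outlier SNPs to regions and a summed length is sketched below in Python. The region definition used here (a maximal stretch of adjacent SNPs on the map that all exceed the threshold) and the function names are assumptions for illustration. The source does not state how the 11 regions were delimited.

```python
from typing import List, Tuple

Region = Tuple[str, int, int]  # (chromosome, start bp, end bp)


def outlier_regions(
    chrom: str,
    positions: List[int],
    percent: List[float],
    threshold: float = 8.0,
) -> List[Region]:
    """Merge adjacent above-threshold SNPs on one chromosome into regions.

    positions must be sorted in map order and aligned with percent,
    the per-SNP percentage of animals carrying the SNP inside an ROH.
    """
    regions: List[Region] = []
    start = None
    # Iterate one step past the end so that a region reaching the last SNP is closed.
    for i in range(len(percent) + 1):
        above = i < len(percent) and percent[i] > threshold
        if above and start is None:
            start = i
        elif not above and start is not None:
            regions.append((chrom, positions[start], positions[i - 1]))
            start = None
    return regions


def total_length_mb(regions: List[Region]) -> float:
    """Sum of region lengths in Mb."""
    return sum(end - begin for _, begin, end in regions) / 1e6
```

Applied per chromosome and concatenated, the region list gives the number of regions, and `total_length_mb` gives their summed length.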

Conclusions
This study provided information about potential candidate genes and QTL regions affected by directional selection in Charolais cattle. The results indicate several genomic regions affected by intensive selection. The formation of these regions may reflect adaptation to environmental conditions, involving immune system function, disease resistance and response to stimulus. These regions also reflect efforts to improve meat production and reproduction traits such as calving ease. The study provides evidence that some areas of the Charolais genome have been affected by intensive long-term selection for meat-related traits and for other traits of interest in beef cattle.