Identification of Single Nucleotide Polymorphisms on Cattle Breeds in Indonesia Using Bovine 50k

Single nucleotide polymorphisms (SNPs), which are abundant in the bovine genome, contribute to genetic variation in biological mechanisms. This study aimed to identify SNPs in Indonesian cattle breeds and to analyze their genetic diversity using the Bovine 50K SNP chip. Twenty-eight "Ongole Grade" (OG) beef cattle and 20 "Holstein Friesian" (HF) dairy cattle were genotyped with the Infinium II assay. The assay comprised amplification of genomic DNA, fragmentation, precipitation, resuspension, hybridization, processing of the bead chip for single-base extension, and imaging on the iScan. Data and clusters were analyzed using GenomeStudio software. The Bovine 50K SNP chip contains 54,609 SNPs spanning all chromosomes of the bovine genome. Genotyping of the total SNPs was successful based on Call Rate, GenCall and GenTrain scores. Most SNP markers had alleles that were shared among individuals or breeds, or had specific alleles at distinctive frequencies. Minor allele frequency (MAF) was evenly distributed over the interval 0-0.5. The OG and HF breeds tended to separate into different clusters, regardless of their genetic history or twin/normal status. This result suggests that most individuals within the same breed are closely related to one another. Some genes identified on chromosomes 3, 4, 5, 7, 13, 17 and 18 were located in loci/regions containing SNPs with alleles specific to either the HF or the OG breed. These SNPs were more powerful for differentiating beef cattle from dairy cattle than for differentiating individuals within the same breed. These SNP variations and the genetic relatedness among individuals and breeds provide basic information for cattle breeding in Indonesia.


INTRODUCTION
Of the categories of cattle kept worldwide, beef cattle and dairy cattle are the most common on farms in Indonesia. Beef cattle are of great importance in Indonesia because of their economic and sociocultural value. Dairy cattle farms also have a potential role in increasing farmers' income and in livestock development. In addition to meat and milk, both types of cattle produce organic fertilizer and increase the use of agricultural waste biomass (Thohari 2000; Aryogi and Romjali 2009). Thus, cattle development has strategic value for achieving food security in parallel with the growing population of Indonesia.
Diverse geographical areas with different climates, environmental conditions and local sociocultures give rise to a high diversity of cattle germplasm with distinctive morpho-physiological characteristics. This diversity is useful for farm development, since genetic materials are needed to develop new breeds with high productivity and other characters of interest (Diwyanto 2005).
"Ongole Grade" (OG) cattle, which belong to the species Bos indicus in the subfamily Bovinae, are commonly found in Java and other regions of Indonesia. OG cattle are relatively pure genetically but have adapted to Indonesian conditions. The breed is highly desired by farmers because it is profitable and is easy and inexpensive to maintain. "Holstein Friesian" (HF), a Bos taurus dairy breed that originated in Holland, has high milk productivity. HF could be used as genetic material in dairy cattle breeding programs to increase milk production (Aryogi and Romjali 2009; Prahanisa et al. 2011). The considerable potential of the cattle population motivates the investigation of relatedness and genetic diversity by molecular characterization, as basic information for future breeding programs.
Molecular markers in cattle have been applied to several target characters. These characters relate not only to growth and to meat and milk productivity, but also to disease resistance, fertility and environmental stress tolerance. On the basis of detection technique, molecular markers used in cattle genetic research are categorized into hybridization-based and PCR-based markers. PCR-based markers are divided into sequence-targeted PCR assays and arbitrary PCR assays. The former include cleaved amplified polymorphic sequences, allele-specific PCR, PCR amplification of specific alleles, simple sequence length polymorphisms, and sequence-tagged microsatellite sites. Arbitrary PCR assays include RAPD and microsatellite-primed PCR. Microsatellite/simple sequence repeat (SSR) markers are popular for genetic characterization of cattle because they are easy to apply and highly variable (Sunnucks 2001; Deb et al. 2013).
Single nucleotide polymorphisms (SNPs), a bi-allelic type of marker, have become popular because of their many advantages. Compared with other forms of genomic variation, SNPs are the most abundant known so far in animals. SNPs are potential genetic markers and have attracted growing interest because of their stability and suitability for high-throughput automated analysis (Fries et al. 1990; Heaton et al. 2002). To complement molecular markers developed on the basis of a single locus or a few loci, high-throughput genotyping via next generation sequencing (NGS) in the form of array or chip-based markers is more useful. Such markers can be used for a variety of purposes, including genome-wide association studies, population studies, bulk segregant analyses, quantitative trait loci (QTL) interval mapping, whole genome profiling and background screening (Kim et al. 2006; Wenzl et al. 2007; Gupta et al. 2008).
In cattle, genomic evaluation has been available for some years. The first-generation low-density bead chip, the Bovine 3K, was introduced in 2010 to increase the adoption of genomic testing (Illumina Inc. 2011a; Wiggans et al. 2011). In addition, a high-density bead chip from Illumina, the Bovine 50K SNP chip, was commercialized. Unlike the Bovine 3K, the Bovine 50K SNP chip (Infinium) provides more than 50,000 informative SNPs that uniformly span the entire bovine genome. The rapid adoption of the Bovine 50K chip was evidenced by the number of new individuals tested (Illumina Inc. 2011b; Wiggans et al. 2011).
The Bovine 50K chip has been used in many studies and assists selection in cattle breeding programs in other countries. However, no study has so far reported bovine genomic evaluation of Indonesian cattle populations or breeds using this high-throughput technology. This study aimed to identify SNPs in cattle breeds in Indonesia and to analyze their genetic diversity using the Bovine 50K SNP chip with the iScan.

Individual Materials
A total of 48 individuals, comprising 28 beef cattle (Ongole Grade/OG) and 20 dairy cattle (Holstein Friesian/HF), were used in this study. The OG and HF cattle were obtained from the collections of the Beef Cattle Research Station (BCRS) and the Indonesian Research Institute for Animal Production (IRIAP), respectively. Both research institutes are under the Indonesian Agency for Agricultural Research and Development (IAARD). Most individuals were female, accounting for 97.9% of the total. Most of the cattle were classified as twinning on the basis of their history rather than genetic evidence, and only 13 individuals were normal cattle included for comparison. The cattle varied in age and included calves, heifers and mother cows. All individuals were fed and maintained following standard management recommendations. The list of all individuals, along with detailed information, is presented in Table 1.

Isolation and Concentration of DNA
For DNA isolation, cattle blood was collected using sterile needles and syringes and then placed in a dedicated 10 ml tube. The blood was kept in ethanol and stored in a freezer (-80°C) until use. DNA was isolated with the QIAamp DNA Blood Mini Kit (Qiagen) following the manufacturer's protocol. The DNA was eluted with TE buffer and checked by electrophoresis on a 0.8% agarose gel. DNA concentration and purity were estimated by measuring the absorbance ratios at 260/280 and 260/230 using a NanoDrop 1000. The DNA was concentrated with a SpeedVac (Thermo Scientific) to 50 ng µl⁻¹, as recommended for iScan analysis. At least 15 µl of pure genomic DNA was prepared and stored to meet the requirement of the Infinium II assay.

SNP Genotyping Using Illumina Bovine 50K SNP Chip
All cattle were genotyped genome-wide with the Infinium II assay using the Bovine 50K SNP chip (Illumina Inc., San Diego), which comprises SNPs covering the bovine genome (Matukumalli et al. 2009; VanRaden et al. 2009). Approximately 200 ng of genomic DNA from each individual was used for the assay, and samples were processed according to the Illumina Infinium II assay manual. Briefly, each sample was whole-genome amplified, fragmented, precipitated and resuspended in an appropriate hybridization buffer. Denatured samples were hybridized on the prepared BovineSNP50 chip for a minimum of 16 hours at 48°C. Finally, the bead chips were processed for the single-base extension reaction, stained and imaged on an Illumina iScan. Normalized bead intensity data for each sample were loaded into the GenomeStudio V2009.1 software provided by Illumina, which converted fluorescence intensities into SNP genotypes. SNP clusters for genotype calling were examined for all SNPs. SNPs were identified based on the following criteria: (1) the number of genotype groups, i.e. one or none (e.g. only the AA genotype and no AB or BB), (2) the minor allele frequency (MAF), and (3) the proportion of genotyped individuals based on Call Rate, with a GenTrain score cutoff of 0.25 and the 50% GenCall score (GC50) applied to the whole dataset. The overall genotyping reliability for the total SNPs was thus assessed by counting the SNPs above conventionally used thresholds and by the average values of the Call Rate, GC50 and GenTrain scores. These measures provide general information about the quality and performance of the SNPs (Illumina Inc. 2011a; Grattapaglia et al. 2011). The clustering heat map and related SNP analyses were performed with GenomeStudio. The heat map was generated based on Euclidean distance, with the measured variables analyzed automatically for clustering.
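The Euclidean-distance clustering mentioned above can be illustrated with a minimal sketch. This is not the GenomeStudio procedure, whose internals are not described here; it assumes genotypes coded as B-allele counts (AA = 0, AB = 1, BB = 2) and uses average linkage as a stand-in merging rule. The function names and the toy sample labels are hypothetical and are not data from this study.

```python
from itertools import combinations
from math import sqrt


def euclidean(u, v):
    """Euclidean distance between two genotype vectors coded 0/1/2."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def average_linkage(samples, n_clusters):
    """Merge the two closest clusters (mean pairwise distance)
    until only n_clusters remain. Assumed rule, for illustration only."""
    names = list(samples)
    clusters = [[name] for name in names]
    dist = {frozenset((a, b)): euclidean(samples[a], samples[b])
            for a, b in combinations(names, 2)}

    def linkage(c1, c2):
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: linkage(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]  # i < j, so index j is still valid
        del clusters[j]
    return [sorted(c) for c in clusters]


# Hypothetical toy genotypes at five SNPs (not data from this study)
toy = {
    "HF_1": [0, 0, 1, 2, 2],
    "HF_2": [0, 1, 1, 2, 2],
    "OG_1": [2, 2, 1, 0, 0],
    "OG_2": [2, 2, 0, 0, 1],
}
clades = average_linkage(toy, n_clusters=2)
```

With these toy vectors the two dairy-like samples and the two beef-like samples end up in separate clusters, which mirrors the kind of breed-level separation a distance-based heat map is meant to reveal.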

Performance and Quality of SNP
Genome-wide genotyping of the 54,609 SNPs on the Bovine 50K array yielded data output generated with the Illumina GenomeStudio software at a no-call threshold of 0.25. The performance of the SNPs in terms of Call Rate, GenCall (GC50) and GenTrain scores is presented in Figure 1. Call Rate is defined as the fraction of called SNPs per sample over the total number of SNPs in the dataset, with a standard quality threshold of 95%. The Call Rate indicated high quality of the identified SNPs: the proportion of SNPs with a Call Rate > 95% was 81.25%. The proportion of SNPs with 50% GenCall (GC50) scores > 0.40 was around 98.5% (Fig. 1A), with an average of 0.818. The lowest GenTrain score, which represents cluster separation, was 0.35, still higher than the recommended threshold (Illumina Inc. 2011a; 2011b; Hoffman et al. 2012) (Fig. 1B). As supported by a previous study, a GenTrain score as low as 0.3 can still be used successfully to determine the degree of cluster separation (Yan et al. 2010). More than 50% of the SNPs screened in this study possessed GC50 and GenTrain scores near one and can be considered of sufficient quality to be scored correctly by the Illumina GenomeStudio genotyping software without manual intervention. The overall parameters for the 54,609 SNPs demonstrated the genotyping reliability for all cattle observed in this study.

Table 1 note. Description of sample code: Sex (X = female, Y = male); Cattle type according to age (A = calf, I = mother cow, D = heifer, U = unidentified as mother/calf/heifer); Cattle type according to heredity twinning (KB0 = historical twinning, KB1 = genetical twinning, KT2 = normal cow); Cattle type according to breed (PR = dairy cattle, Holstein Friesian/HF; PT = beef cattle, Ongole Grade/OG).
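The Call Rate definition given in this section can be sketched as follows. This is an illustrative calculation only, assuming that a genotype whose score falls below the no-call threshold of 0.25 is treated as a no-call; the function names and the toy scores are hypothetical and do not reproduce GenomeStudio output.

```python
NO_CALL_THRESHOLD = 0.25  # no-call threshold stated in the text
CALL_RATE_CUTOFF = 0.95   # standard quality threshold stated in the text


def call_rate(scores, threshold=NO_CALL_THRESHOLD):
    """Fraction of SNPs in one sample whose genotype score reaches the
    threshold. A missing score (None) counts as a no-call (assumption)."""
    called = sum(1 for s in scores if s is not None and s >= threshold)
    return called / len(scores)


def passing_fraction(score_table, cutoff=CALL_RATE_CUTOFF):
    """Per-sample call rates and the fraction of samples above the cutoff."""
    rates = {sample: call_rate(scores) for sample, scores in score_table.items()}
    passed = [sample for sample, rate in rates.items() if rate > cutoff]
    return rates, len(passed) / len(rates)


# Hypothetical genotype scores for two samples at four SNPs
toy_scores = {
    "sample_1": [0.90, 0.80, 0.70, 0.95],  # all four SNPs called
    "sample_2": [0.90, 0.10, 0.70, 0.85],  # one SNP below the threshold
}
rates, fraction_passing = passing_fraction(toy_scores)
```

In this toy case the first sample has a call rate of 1.0 and the second 0.75, so half of the samples pass the 95% cutoff.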

SNP Distribution and Allele Frequency
The SNPs on the Bovine 50K chip developed by Illumina were evenly distributed across the 60 chromosomes of the bovine genome, i.e. 29 pairs of autosomes and one pair of sex chromosomes (X and Y). The number of SNPs per chromosome ranged from one (on chromosome Y) to almost 3,500 (on chromosome 1) (Fig. 2A). As in other mammals, males have an X and a Y chromosome and females have two X chromosomes (Tyler-Smith 2008); thus, only the autosomes were used in this study. The SNPs were found to have homology with several regions of the bovine genome, such as BTA, BTB, ARS-BFGL-NGS, UA-IFASA, and Hapmap-SCAFFOLD.
An even distribution of MAF was observed (Table 2) across five continuous classes from 0 to 0.5. Relatively similar numbers of SNPs were found in the MAF classes 0.3-0.399 (15.54%) and 0.4-0.5 (13.92%). The largest group, SNPs with a MAF below 0.199, accounted for 31.56% (17,236/54,609). SNP markers with high MAF in this study could be highly useful for genetic diversity analysis, given their great differentiating power, in good agreement with a previous study (Yan et al. 2010). The difference in allele frequencies may be attributable to divergence of the cattle breeds (Matukumalli et al. 2009; Dadi et al. 2012). In addition, information on the allelic frequencies of these SNPs should help determine the usefulness of these markers for the analysis of other cattle breeds in Indonesia.
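How a MAF value is obtained from genotype calls and assigned to one of the five classes can be sketched as below. The sketch assumes genotype calls stored as the strings "AA", "AB" and "BB" (with None for a no-call) and class boundaries at 0.1 intervals; the function names are hypothetical. The worked example uses the genotype counts reported in this paper for Hapmap 27796-BTA-21954 (28 AA, 14 AB, 6 BB).

```python
BIN_LABELS = ["0-0.099", "0.1-0.199", "0.2-0.299", "0.3-0.399", "0.4-0.5"]


def minor_allele_frequency(genotypes):
    """MAF of one bi-allelic SNP from calls 'AA', 'AB', 'BB' (None = no call)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    b_count = sum(g.count("B") for g in called)  # AB adds 1, BB adds 2
    freq_b = b_count / (2 * len(called))
    return min(freq_b, 1 - freq_b)


def maf_class(maf):
    """Assign a MAF value to one of five classes; 0.5 falls in the last class."""
    return BIN_LABELS[min(int(maf * 10), 4)]


# Genotype counts reported in the text for Hapmap 27796-BTA-21954
example = ["AA"] * 28 + ["AB"] * 14 + ["BB"] * 6
maf = minor_allele_frequency(example)  # 26 B alleles out of 96, about 0.271
label = maf_class(maf)
```

For this SNP the B allele is the minor allele, and the marker falls in the 0.2-0.299 class.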
Particular emphasis was placed on SNP polymorphism, in which one SNP (for example G/C) can produce two alleles (G and C). For 54,609 SNPs (with just one SNP per locus), a maximum of 109,218 alleles can be detected. Among the SNPs surveyed, A/G alleles (22,579/54,609 or 41%) predominated in the population, followed by T/C (34%). A/C and T/G alleles had relatively similar proportions, accounting for 10% and 9%, respectively (Fig. 2B), while A/T and T/A were identified as minor alleles among the 48 individuals. Clearly, most SNP markers had alleles that were shared among individuals and/or breeds, or had specific alleles at distinctive frequencies. The major alleles produced by some markers could be specific to Indonesian cattle, leading to allelic deviation in the breeds. Major alleles are also essentially equivalent to minor allele frequency (MAF) in information content for differentiating animals (Kruglyak 1997; Hasegawa et al. 2014). All SNPs common to both breeds probably arose before the divergence of the breeds. Importantly, rare and minor alleles could influence economically important traits in livestock species (Freking et al. 2002; Smit et al. 2003).
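The two figures quoted in this paragraph follow from simple arithmetic, written out here as a check on the stated values (two alleles per bi-allelic SNP, and the share of A/G SNPs among all SNPs on the chip):

```latex
\begin{align*}
N_{\text{alleles, max}} &= 2 \times 54{,}609 = 109{,}218 \\
\frac{22{,}579}{54{,}609} &\approx 0.413 \approx 41\%
\end{align*}
```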

Analysis of Cluster and Genetic Diversity of Cattle Breeds
Scoring of SNPs among individuals using GenomeStudio generally produced three clusters denoting the AA homozygote, the BB homozygote and the AB heterozygote, but some data points appeared ambiguously between the clusters in the genoplot, as depicted in Figure 3.
For example, the SNP ARS-BFGL-NGS-18937 revealed the AA genotype for all 48 individuals (Fig. 3A), whereas ARS-BFGL-NGS-10077 showed mostly the BB genotype (Fig. 3B). Hapmap 27796-BTA-21954 produced three clusters, representing the AA (28 individuals), AB (14 individuals) and BB (6 individuals) genotypes (Fig. 3C). Homozygous genotypes predominated across the individuals observed, with BB at a frequency of 0.458 and AA at 0.314, whereas heterozygotes accounted for a proportion of 0.228. The cluster separation denoted by the GenTrain score could explain the separation of the three classes (AA, AB and BB). In addition to representing SNP quality, the reliable class patterns of the cattle breeds reflected their genetic nature based on the stringent SNPs on the Bovine 50K array. The power of these SNPs is in good agreement with previous studies using the BovineSNP50 BeadChip to genotype various breeds and species in the tribe Bovini (Bae et al. 2010; Michelizzi et al. 2011; Dadi et al. 2012). Genetic variation within or among breeds is usually explained in terms of allele frequencies. Figure 4 depicts the heat map of the 48 individuals according to the Bovine 50K SNPs. Two main clades were generated and showed almost clear separation of the breeds: 20 individuals, mostly HF (with the exception of three OG, namely X_A_KBO_PT_09/38, X_U_KT2_PT_09931 and X_I_KBO_PT_9702), in clade I and 28 individuals, mostly OG, in clade II. A few HF individuals, i.e. X_D_KBO_PR_B757, X_D_KT2_PR_1215 and X_D_KBO_PR_A751, were grouped with most OG (clade II), demonstrating a closer relationship to OG than that of the other dairy cattle. The only genetic twinning individuals of OG (X_I_KB1_PT_7415) and HF (X_I_KB1_PR_IK3) grouped in different clades, reflecting the large genetic distance between the two.
As another interesting example, these SNPs were able to place the genetic twinning mother cow (X_I_KB1_PR_IK3) and her triplet calves (IK3_1, IK3_2 and IK3_3) in the same clade (clade I).
Parent-child heritability frequency would confirm the parent-child relationship (Bae et al. 2008). Thus, OG beef cattle and HF dairy cattle generally tended to separate clearly into different clusters, regardless of their genetic history, sex, or historical twin or normal status. This result indicated that most individuals within the same breed were closely related to one another. Accordingly, no clear differentiation of individuals was found within either OG or HF, indicating that the SNPs developed from the dairy cattle genome (Bovine 50K SNP) were mainly useful for differentiating cattle according to breed genetic background rather than individuals within a breed. This is consistent with the preliminary analysis in a previous report (Lestari and Tasma 2012). Moreover, these results demonstrated that inbreeding and selection had little effect on reducing genetic diversity and on differentiation within both the HF and OG breeds in Indonesia at a genome-wide level, similar to the case of the HF breed in Australia (Zenger et al. 2007). These SNP markers could be useful for association analysis with phenotypic characters of cattle such as meat productivity, beef quality and milk quality. In line with a previous report (Bae et al. 2010), further research could examine the genetic effects of the SNPs on various economic characters of cattle.
When the chromosomal locations of copy number variation in Bos taurus (Bae et al. 2010) were overlapped with the regions/loci containing SNPs in our study, some genes were identified on chromosomes 3, 4, 5, 7, 13, 17 and 18 at positions within the loci observed in this study (Fig. 5). For example, chromosome 3 contains such a locus at position 36,163,190..36,338,393 bp.

[Figure 5. Regions labelled in the figure include 36,163,190..36,338,393 bp, 10,009,287..10,665,698 bp, 102,164,053..102,261,488 bp and 4,650,135..5…]

Notably, a total of 59 selected SNPs in our study that correspond to genes identified previously (Bae et al. 2010) revealed specific alleles in OG and HF (Table 3). These reference and alternate alleles were detected on the selected chromosomes (3, 4, 5, 7, 13, 17 and 18) in dairy and beef cattle, respectively. The point mutations present in the cattle breeds in Indonesia varied among the bi-allelic types T/C, A/C, A/G, T/A, T/G and C/G. These base substitutions, which may affect phenotypic variation in different breeds of cattle, need further investigation and could provide insight into phenotypic impact through genomic resources (Gan et al. 2008; Liu et al. 2008). Information on the genetic variation of cattle breeds in Indonesia based on the bovine genome could complement and enrich previous studies. Research using genome-wide SNP genotyping has progressed steadily, including cost-effective dairy cattle breeding programs (Hayes et al. 2009), useful information on the genetic variation of the Korean Hanwoo breed (Dadi et al. 2012) and of indicine and African cattle breeds (Matukumalli et al. 2009), and genome-wide association for milk production in Danish Jersey cattle (May et al. 2010). Specific SNPs located in genes have also been shown to be associated with targeted traits in dairy and beef cattle (Liu et al. 2011; Lu et al. 2011; Deb et al. 2014).
Thus, our study is consistent with previous studies using bovine SNP arrays conducted in many countries, and offers useful knowledge and promise for improving targeted traits in cattle.
Indeed, this result suggests that the SNP chip is a powerful and functional tool for genetic diversity analysis. Several SNPs within and close to genes may provide an excellent solution to the disadvantages of the SNP markers that have been used in diversity analyses (Zimin et al. 2009; Snelling et al. 2010). The SNP data represent a vast and largely untapped resource to assist genetic studies in cattle and are also useful for cattle genetic improvement programs. The patterns of allele frequency variability observed among the breeds signal the genetic imprint of past and presumably ongoing episodes of selection (Hayes et al. 2009; Dadi et al. 2012).

CONCLUSION
SNPs in the bovine genome were successfully identified across all chromosomes of the Ongole Grade (OG) and Holstein Friesian (HF) cattle breeds. SNP markers with high MAF (> 0.2) accounted for approximately 69% of the total and could be useful in genetic diversity analyses, given their great differentiating power. Several SNPs within and close to genes may provide an excellent solution to the disadvantages of the SNP markers that have been used in diversity analyses. Dairy and beef cattle possessed specific alleles corresponding to known genes, which may contribute to the genetic characters of each breed. The Bovine 50K SNP chip described in this study was more useful for differentiating between breeds than between individuals within the same breed.

ACKNOWLEDGEMENT
This work was supported by a grant from the Indonesian Agency for Agricultural Research and Development, Ministry of Agriculture, Republic of Indonesia. The authors thank Eryck Andreas and the laboratory technicians for their kind help with blood sample preparation.