Filtered reads from all individuals were aligned to the Wuzhishan reference genome using the Burrows-Wheeler Aligner (BWA)36. To obtain high-quality SNPs, both SOAPsnp37 and the Genome Analysis Toolkit (GATK)38 were used for SNP calling in each sample. SNPs with a read depth of less than 6 or greater than 70, or located less than 5 bp from a neighboring SNP, were removed. SOAPsnp generated consensus genotypes for all genomic loci, even in regions with poor coverage. To remove low-quality genotypes called by SOAPsnp, we further excluded all loci that had a SOAPsnp quality score of less than 20 in any one of the 69 pigs sequenced. To avoid potential bias between our data and the publicly available data, we called SNPs by comparing the genome sequence of each individual to the Wuzhishan reference genome and then merged the called SNPs to form a common set of SNP data for the 111 individuals.

Loss-of-function variants. We defined variants as potential loss-of-function mutations if they corresponded to one of the following: (i) a SNP within a coding region resulting in a premature stop codon; (ii) a small indel within a coding region causing a frameshift of the ORF; (iii) a SNP or small indel within 2 bp of a splice site; and (iv) a structural variation overlapping a coding region. All loss-of-function variants were called using Perl scripts.

First, tagging sequences containing SNPs from the Illumina chip were mapped to the reference genome using BWA36. All mapped SNPs from the chip were further filtered by the same criteria used for sequence-based SNP calling; those with a quality score of less than 20 were removed.

LSBL statistics43 were calculated for each polymorphic site with MAF > 0.01 in the 69-genome data set on the basis of the fixation index (FST) values between three contrasting groups.
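The depth and spacing filters described for sequence-based SNP calling can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Snp` record and the `filter_snps` function are hypothetical names, and the sketch assumes that both members of a pair of SNPs lying less than 5 bp apart are discarded and that neighbors are assessed on the unfiltered call set, details the text does not state.

```python
from typing import NamedTuple


class Snp(NamedTuple):
    """Hypothetical minimal SNP record used only for this illustration."""
    chrom: str
    pos: int
    depth: int


def filter_snps(snps, min_depth=6, max_depth=70, min_gap=5):
    """Keep SNPs with min_depth <= depth <= max_depth that have no
    neighboring SNP closer than min_gap bp on the same chromosome.

    Assumption: neighbors are looked up in the unfiltered input, and
    both members of a too-close pair are removed.
    """
    by_chrom = {}
    for snp in snps:
        by_chrom.setdefault(snp.chrom, []).append(snp)

    kept = []
    for chrom in sorted(by_chrom):
        ordered = sorted(by_chrom[chrom], key=lambda s: s.pos)
        for i, snp in enumerate(ordered):
            # Depth filter: remove depth < 6 or depth > 70.
            if not (min_depth <= snp.depth <= max_depth):
                continue
            # Spacing filter: remove SNPs < 5 bp from a neighboring SNP.
            near_prev = i > 0 and snp.pos - ordered[i - 1].pos < min_gap
            near_next = (i + 1 < len(ordered)
                         and ordered[i + 1].pos - snp.pos < min_gap)
            if near_prev or near_next:
                continue
            kept.append(snp)
    return kept
```

With these thresholds, a SNP at depth exactly 6 or 70 is retained, and two SNPs exactly 5 bp apart are both retained, matching the "less than" and "greater than" wording of the criteria.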
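The text does not spell out how the LSBL values were derived from the pairwise FST values. As a hedged illustration, the sketch below uses the commonly cited definition of locus-specific branch length for three groups A, B and C, in which the branch leading to A is (FST_AB + FST_AC - FST_BC) / 2. The function name `lsbl` and its argument names are assumptions made for this example.

```python
def lsbl(fst_ab, fst_ac, fst_bc):
    """Return locus-specific branch lengths (A, B, C) for one site.

    Inputs are the three pairwise FST values at that site. This follows
    the commonly used three-population definition; it is an assumption
    that the study applied exactly this form.
    """
    branch_a = (fst_ab + fst_ac - fst_bc) / 2.0
    branch_b = (fst_ab + fst_bc - fst_ac) / 2.0
    branch_c = (fst_ac + fst_bc - fst_ab) / 2.0
    return branch_a, branch_b, branch_c
```

A site with a large branch length for one group and small values for the other two shows allele-frequency change concentrated in that group, which is why the statistic needs three contrasting groups rather than two.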