Common data set
The data set consists of 5,865 individuals from seven generations. There are 6,000 loci evenly distributed over six chromosomes (1,000 markers per chromosome), with 0.1 cM between adjacent markers. The animals from the first four generations have both pedigree and phenotype information available. The animals from generations five to seven have no phenotype but complete marker information, so their breeding values can be predicted using genomic selection.
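The marker layout described above can be expressed as a simple map from marker number to chromosome and position. The following is a minimal sketch, assuming that markers are numbered 1 to 6,000 consecutively along chromosomes 1 to 6 and that the first marker on each chromosome sits at 0 cM. Neither assumption is stated in the description, and the name marker_location is hypothetical.

```python
MARKERS_PER_CHROMOSOME = 1000
N_CHROMOSOMES = 6
SPACING_CM = 0.1  # distance between adjacent markers, in cM


def marker_location(marker_number):
    """Return (chromosome, position in cM) for a 1-based marker number.

    Assumes consecutive numbering along chromosomes and a first marker
    at 0 cM on each chromosome (assumptions, not given in the description).
    """
    total = MARKERS_PER_CHROMOSOME * N_CHROMOSOMES
    if not 1 <= marker_number <= total:
        raise ValueError(f"marker_number must be between 1 and {total}")
    index = marker_number - 1
    chromosome = index // MARKERS_PER_CHROMOSOME + 1
    position_cm = (index % MARKERS_PER_CHROMOSOME) * SPACING_CM
    return chromosome, position_cm
```

Under these assumptions the 1,000 markers on a chromosome cover 999 intervals of 0.1 cM, that is, about 99.9 cM per chromosome.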
The dataset is divided into two files.
A. The PHENOTYPE FILE: phenotype.txt, which contains six columns corresponding to:
Animal_ID, ID_sire, ID_dam, Sex (male=1, female=2), Generation, Trait_value
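The six columns listed above could be read into records along the following lines. This is only a sketch: it assumes whitespace-separated fields, no header line, and that a missing or non-numeric trait value (as expected for generations five to seven) should be stored as None. None of these details are given in the description, and the names PhenotypeRecord and parse_phenotype_lines are hypothetical.

```python
from typing import Iterable, List, NamedTuple, Optional


class PhenotypeRecord(NamedTuple):
    animal_id: str
    sire_id: str
    dam_id: str
    sex: int          # 1 = male, 2 = female
    generation: int
    trait_value: Optional[float]  # None when no phenotype is given


def parse_phenotype_lines(lines: Iterable[str]) -> List[PhenotypeRecord]:
    """Parse phenotype lines, assuming whitespace-separated fields."""
    records = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) not in (5, 6):
            raise ValueError(f"expected 5 or 6 fields, got {len(fields)}")
        trait = None
        if len(fields) == 6:
            try:
                trait = float(fields[5])
            except ValueError:
                trait = None  # assumed missing-value code, e.g. "NA"
        records.append(
            PhenotypeRecord(fields[0], fields[1], fields[2],
                            int(fields[3]), int(fields[4]), trait)
        )
    return records
```

With a real file, the function could be called as parse_phenotype_lines(open("phenotype.txt")), after checking the actual delimiter and missing-value code.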
B. The GENOTYPE FILE: genotype_cor.txt.zip, which contains the genotype of each animal in the pedigree described in the phenotype file. The genotype file contains one line for each individual in the pedigree and 12,001 columns corresponding to:
Animal_ID, M1_1, M1_2, M2_1, M2_2, ... ,M6000_1, M6000_2
(M1_1 = marker_1 allele 1, M1_2 = marker_1 allele 2).
The data is haplotyped: for each marker, the left allele (suffix _1) comes from the father and the right allele (suffix _2) from the mother.
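Because the alleles alternate between paternal and maternal origin, one genotype line can be separated into two haplotypes by taking every second column. The sketch below assumes whitespace-separated fields and keeps alleles as strings, since the allele coding is not given in the description. The function name split_haplotypes is hypothetical.

```python
def split_haplotypes(line, n_markers=6000):
    """Split one genotype line into (animal_id, paternal, maternal).

    Expects 1 + 2 * n_markers whitespace-separated fields:
    Animal_ID, M1_1, M1_2, ..., M<n>_1, M<n>_2.
    """
    fields = line.split()
    expected = 1 + 2 * n_markers
    if len(fields) != expected:
        raise ValueError(f"expected {expected} fields, got {len(fields)}")
    animal_id = fields[0]
    alleles = fields[1:]
    paternal = alleles[0::2]  # first allele of each marker (from the father)
    maternal = alleles[1::2]  # second allele of each marker (from the mother)
    return animal_id, paternal, maternal
```

The default n_markers=6000 matches the 12,001 columns described above; a smaller value is convenient for checking the function on a short test line.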
Reporting
The dataset is simulated to allow the first four generations to be used for QTL detection (by association, linkage or combinations thereof). No phenotype is given for the last three generations so that these can be used for genomic selection.
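The split between phenotyped generations and selection candidates can be sketched as follows. The function takes a mapping from animal ID to generation number, which is assumed to have been read from the phenotype file beforehand. The name split_by_generation and its parameter names are hypothetical.

```python
def split_by_generation(generation_by_animal, last_phenotyped_generation=4):
    """Return (training_ids, candidate_ids) based on generation number.

    Animals up to last_phenotyped_generation have phenotypes and can be
    used for QTL detection or model training; later generations are the
    candidates for genomic selection.
    """
    training, candidates = [], []
    for animal_id, generation in generation_by_animal.items():
        if generation <= last_phenotyped_generation:
            training.append(animal_id)
        else:
            candidates.append(animal_id)
    return training, candidates
```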
The simulated genetic effects
You can now download a table describing the simulated QTL effects, a list of the true breeding values and a Genotype-Phenotype map for the epistatic pair for comparisons to your obtained results.
If you have any questions concerning the data please contact: .