This page contains data resources relating to the population of 2000 heterogeneous stock (HS) mice phenotyped for over 100 traits, and used in the papers:
In addition we provide gene expression data from hippocampus, liver and lung measured on subsets of these mice and published in
These data were previously available from the defunct web site mus.well.ox.ac.uk/GSCAN.
- The dataset we recommend for analysis comprises 10168 SNP genotypes mapped to build 37 of the mouse genome. The data are in this directory and are arranged by chromosome.
- Each chromosome is represented by two files formatted for use by the R HAPPY package.
- Files are named as chrN.Build37.data, chrN.Build37.alleles where N is the chromosome number (1..19, X).
- .data files are in ped format and contain the SNP genotypes.
- .alleles files are in HAPPY alleles format and contain the genotypes of the eight founder strains of the HS at these SNPs.
- Missing data are coded as NA.
- The file mapfile.txt contains the bp coordinates of these SNPs relative to build 37.
- The raw phenotypes, together with relevant covariates, are in this directory as a collection of tab-delimited text files. Each file, for example Glucose.txt, contains a group of related phenotypes (in this case, measurements from the Glucose Tolerance Test) together with the covariates relevant to those phenotypes (i.e. significantly associated at P<0.05). All phenotypes are combined in the file CombinedPhenotypes.txt.
- Residual phenotypes corrected for relevant covariates are in this directory. We recommend using these for analysis. Each trait is in a separate file with extension .resid.
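
As a rough illustration of the file layout described above, the following Python sketch lists the genotype file names expected for each chromosome and reads a tab-delimited phenotype table. It is only a sketch under stated assumptions: the function names are hypothetical, the column names in the inline example (SUBJECT.NAME, GENDER, Glucose.0) are invented for illustration, and treating NA as the missing-value code in phenotype files is an assumption carried over from the genotype files.

```python
import csv
import io


def genotype_file_names():
    """Return (data, alleles) file-name pairs for chromosomes 1..19 and X."""
    chromosomes = [str(n) for n in range(1, 20)] + ["X"]
    return [
        (f"chr{c}.Build37.data", f"chr{c}.Build37.alleles")
        for c in chromosomes
    ]


def read_phenotypes(handle, missing="NA"):
    """Read a tab-delimited phenotype table with a header row.

    Values equal to `missing` become None. Assumption: phenotype files
    use the same NA code as the genotype files.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        {key: (None if value == missing else value) for key, value in row.items()}
        for row in reader
    ]


# Hypothetical file content; the real column names may differ.
example = io.StringIO(
    "SUBJECT.NAME\tGENDER\tGlucose.0\n"
    "A001\tF\t5.2\n"
    "A002\tM\tNA\n"
)

pairs = genotype_file_names()
rows = read_phenotypes(example)
```

To read a real file such as Glucose.txt, pass an open file handle, for example `open("Glucose.txt", newline="")`, in place of the in-memory example.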
Gene Expression Data
This directory contains transformed gene expression data for hippocampus, liver and lung. Each is held as an RData matrix. Rows are mice and columns are expression traits. Note that each expression trait has been transformed by a Box-Cox transformation.
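
For readers unfamiliar with the Box-Cox transformation, a minimal Python sketch of its standard one-parameter form follows. The value of the parameter (called `lam` here) used for each expression trait is not stated on this page, so the values in the example are arbitrary, and the function names are hypothetical.

```python
import math


def box_cox(values, lam):
    """Apply the one-parameter Box-Cox transformation to positive values.

    y = (x**lam - 1) / lam   if lam != 0
    y = log(x)               if lam == 0
    """
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]


def inverse_box_cox(values, lam):
    """Map transformed values back to the original scale."""
    if lam == 0:
        return [math.exp(v) for v in values]
    return [(lam * v + 1.0) ** (1.0 / lam) for v in values]


# Arbitrary illustrative values, not taken from the expression data.
raw = [1.0, 4.0, 9.0]
transformed = box_cox(raw, 0.5)
recovered = inverse_box_cox(transformed, 0.5)
```

Because the data in this directory are already transformed, a function like `box_cox` would not need to be applied again; the inverse is only useful if the per-trait parameter is known.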