This code simulates inbred individuals with a particular FIS, or a first-generation admixed individual. It generates the four simulated individuals used in the homework exercise on F statistics.
The code needs the file combined.YRICEU.out, which is available here. It contains genotype frequencies for ~10,000 SNPs (slightly fewer, because some monomorphic sites were removed) in the CEU Europeans and the YRI Africans. Once again, these data come from the Phase 2 HapMap, and the genotypes were processed into genotype counts using PLINK's HWE option.
geno<-read.table(file="combined.YRICEU_with_freq.out")

## make an individual with a particular FIS.
make.ind.FIS<-function(freqs,FIS){
  sample.ind<-sapply(freqs,function(prob){
    random<-runif(1)
    if(random < FIS){
      ## with probability FIS the two alleles are identical by descent:
      ## draw a single allele and double it (genotype is 0 or 2)
      my.allele<-(1-rbinom(1,1,prob))
      my.geno<- 2*my.allele
    } else{
      ## otherwise draw the two alleles independently (genotype is 0, 1 or 2)
      my.geno<-(2-rbinom(1,2,prob))
    }
    return(my.geno)
  })
  return(sample.ind)
}

## make a 1st generation admixed individual
make.ind.admix<-function(freqs.1,freqs.2){
  sample.ind<-apply(cbind(freqs.1,freqs.2),1,function(probs){
    ## one allele is drawn from each population's frequency
    my.allele.1<-(1-rbinom(1,1,probs[1]))
    my.allele.2<-(1-rbinom(1,1,probs[2]))
    my.geno<- my.allele.1+my.allele.2
    return(my.geno)
  })
  return(sample.ind)
}

ind.1<-make.ind.FIS(geno$A.freqYRI,0.0)
ind.2<-make.ind.FIS(geno$A.freqCEU,0.0)
ind.3<-make.ind.FIS(geno$A.freqCEU,0.1)
ind.4<-make.ind.admix(geno$A.freqCEU,geno$A.freqYRI) ## 1st generation admixed individual

individuals<-cbind(ind.1,ind.2,ind.3,ind.4)
write.table(file="made_up_individuals.out",individuals)
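As a rough sanity check on the simulation, the FIS of a simulated individual can be estimated back from its genotypes by comparing the observed fraction of heterozygous sites to the fraction expected under Hardy-Weinberg. The sketch below is only an illustration and not part of the original script: the allele frequencies are made up with runif() so that it runs without the data file, make.ind.FIS is restated in compact form, and estimate.FIS is a hypothetical helper name.

```r
## Made-up allele frequencies standing in for geno$A.freqCEU (assumption:
## the real data file is not read here, so that this sketch is self-contained).
set.seed(1)
freqs <- runif(10000, min = 0.05, max = 0.95)

## Same sampling scheme as make.ind.FIS in the script, restated compactly.
make.ind.FIS <- function(freqs, FIS) {
  sapply(freqs, function(prob) {
    if (runif(1) < FIS) {
      2 * (1 - rbinom(1, 1, prob))   # alleles identical by descent: 0 or 2
    } else {
      2 - rbinom(1, 2, prob)         # alleles drawn independently: 0, 1 or 2
    }
  })
}

## Hypothetical helper: moment estimate of FIS for one individual,
## 1 - (observed heterozygosity) / (expected heterozygosity 2pq averaged over sites).
estimate.FIS <- function(genos, freqs) {
  obs.het <- mean(genos == 1)
  exp.het <- mean(2 * freqs * (1 - freqs))
  1 - obs.het / exp.het
}

est.outbred <- estimate.FIS(make.ind.FIS(freqs, 0.0), freqs)
est.inbred  <- estimate.FIS(make.ind.FIS(freqs, 0.1), freqs)
print(c(outbred = est.outbred, inbred = est.inbred))
```

With about 10,000 sites the two estimates should land close to the values passed in (0 and 0.1), although each one is noisy because it is based on a single individual.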
If you do use these scripts and figures, please acknowledge that fact (mainly so that others can find this resource). Also, if you do use them, it would be great if you could add a comment to the post, so I can see how widely they are used and get a sense of how worthwhile this is. If you find a bug or make an improved version, do let me know.
This work is licensed under a Creative Commons Attribution 3.0 Unported License.