Continuing the discussion of DAFs (derived allele frequencies), we now look at chr6, specifically the PRIM2 gene region, which is thought to be under balancing selection.
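chr <- "chr6"
jpeg(paste("DAF.", chr, ".jpeg", sep=""), width=1420)
par(mfrow=c(2,1))  # two stacked panels: mean DAF on top, structural variant counts below
# Windowed mean DAF in BED layout: V2 = window start, V3 = window end, V4 = mean DAF
M <- read.table(file=paste("h.", chr, ".mean.bed", sep=""), header=FALSE, stringsAsFactors=FALSE)
plot(as.numeric(M$V2), as.numeric(M$V4), xlab="Position along chromosome", ylab="Mean derived allele frequency", main=chr)
# Centromere
lines(c(58830166, 61830166), c(0.2, 0.2), col="red", lwd=5)
text(63830166, 0.25, labels="Centromere", col="red")
# Telomeres at both chromosome ends
lines(c(171105067, 171115067), c(0.3, 0.3), col="blue", lwd=5)
text(171105067, 0.35, labels="Telomere", col="blue")
lines(c(0, 10000), c(0.3, 0.3), col="blue", lwd=5)
text(0, 0.35, labels="Telomere", col="blue")
# PRIM2 gene span
lines(c(57182422, 57513376), c(0.25, 0.25), col="brown", lwd=5)
text(57182422, 0.3, labels="PRIM2 gene", col="brown")
# Windows lying entirely inside PRIM2, highlighted with crossed circles (pch=13)
N <- M[M$V2 > 57182422 & M$V3 < 57513376, ]
points(N$V2, N$V4, pch=13, col="blue")
# Count of known structural variants per window (V4)
C <- read.table(file=paste("h.", chr, ".countdgv.bed", sep=""), header=FALSE)
plot(C$V2, C$V4, xlab="Position along chromosome", ylab="Count of known structural variants", main=chr)
# Spearman rank correlation; assumes M and C list the same windows in the same order
cor.test(as.numeric(M$V4), C$V4, method="spearman")
dev.off()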
The Spearman correlation coefficient of 0.2559672 between the mean derived allele frequency and the count of known structural variants (CNVs) is in line with the results from chr2. The PRIM2 gene, which has been shown to have high values of diversity and Tajima's D, is located near the centromere and has high mean DAF values. The windows within the PRIM2 gene are marked by blue crossed circles.
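The cor.test call above pairs the two files row by row, so it only makes sense if both list exactly the same windows in the same order. A minimal sketch of a safer variant is to match windows on their coordinates first and correlate only the shared ones. The toy data frames below are invented stand-ins for M and C (they are not values from the real chr6 files), and the ".daf"/".sv" suffixes are names chosen here for illustration.

```r
# Toy stand-ins for M (mean DAF) and C (structural variant counts).
# All values are invented for illustration; they are NOT from the real files.
M <- data.frame(V1 = "chr6",
                V2 = c(0, 10000, 20000, 30000, 40000),
                V3 = c(10000, 20000, 30000, 40000, 50000),
                V4 = c(0.10, 0.25, 0.40, 0.30, 0.20))
# C is missing the window starting at 20000, so the rows do not line up.
C <- data.frame(V1 = "chr6",
                V2 = c(0, 10000, 30000, 40000),
                V3 = c(10000, 20000, 40000, 50000),
                V4 = c(1, 4, 3, 2))

# Keep only windows present in both tables, matched on chromosome, start and end.
MC <- merge(M, C, by = c("V1", "V2", "V3"), suffixes = c(".daf", ".sv"))

# Spearman rank correlation on the matched windows only.
res <- cor.test(MC$V4.daf, MC$V4.sv, method = "spearman", exact = FALSE)
print(nrow(MC))      # number of windows shared by both tables
print(res$estimate)  # Spearman's rho
```

With the real files, M and C would come from the read.table calls above. If nrow(MC) is smaller than nrow(M), the two files did not cover identical windows and the row-by-row correlation would have been pairing mismatched windows.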