Different programming languages are good at different things. R has many powerful statistical functions, while Perl is good at data handling.
Drawing N random numbers from a given range without re-sampling is easy in R with the "sample" function. To do the same thing in Perl, looping (or at least some form of iteration) has to be used, along with storing the results and checking them to avoid re-sampling.
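# usage: Rscript sampleit.r <number of lines> <run number>
args <- commandArgs(TRUE)
totalsnps <- as.integer(args[1])  # number of lines in the file to be randomised
runumber <- as.integer(args[2])   # run number, used as suffix of the output file
# random permutation of 1..totalsnps (sampling without replacement)
N <- sample(seq_len(totalsnps), totalsnps, replace=FALSE)
write.table(N, file=paste("rands.out", runumber, sep="."), col.names=FALSE, row.names=FALSE, quote=FALSE)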
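This R script can be run with the line below, where $linecount is the number of lines in the file to be randomised and $iterationcount is a run number that becomes the suffix of the output file (rands.out.$iterationcount):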
Rscript sampleit.r $linecount $iterationcount
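For comparison, the Perl-side loop mentioned at the start can be sketched as below. This is only a minimal sketch and not part of the original pipeline: the subroutine name sample_without_replacement is hypothetical, and it uses a partial Fisher-Yates shuffle instead of a store-and-check loop, so no earlier draws have to be looked up. It assumes 0 <= $n <= $total.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original scripts): return $n distinct
# integers drawn at random from 1..$total, i.e. sampling without replacement.
sub sample_without_replacement {
    my ($total, $n) = @_;
    my @pool = (1 .. $total);
    # Partial Fisher-Yates shuffle: after step $i, positions 0..$i hold the sample.
    for my $i (0 .. $n - 1) {
        my $j = $i + int(rand($total - $i));   # random index in $i..$total-1
        @pool[$i, $j] = @pool[$j, $i];
    }
    return @pool[0 .. $n - 1];
}

# Same call shape as sample(1:totalsnps, totalsnps, replace=F) in the R script
my $totalsnps = 10;
my @N = sample_without_replacement($totalsnps, $totalsnps);
print "$_\n" for @N;
```

If only the full permutation is needed, the shuffle function from the core List::Util module does the same job in one call.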
Once the file with the new order of lines has been generated, it can be used by the Perl script to write the file in the new order. The first two columns of the file are kept unchanged and only the remaining columns are randomised.
#!/usr/bin/perl
# arguments: <file to be randomised> <rands.out file from the R script>
use strict;
use warnings;
#read in the file created by the R script in previous step
open(my $rands, '<', $ARGV[1]) or die $!;
my %rhash = ();
my $count = 1;
while (my $lines = <$rands>) {
    chomp $lines;
    $rhash{$count} = $lines;    # old line number => new line number
    $count++;
}
close $rands;
#read the file that needs to be randomised and store it in hash with new order
open(my $stats, '<', $ARGV[0]) or die $!;
my %hash = ();
my $mycount = 1;
while (my $line = <$stats>) {
    chomp $line;
    # the first two columns stay on their line, the rest moves to the new line
    my ($chr, $pos, $rest) = split(/[ \t]+/, $line, 3);
    $hash{$mycount}{CHR} = $chr;
    $hash{$mycount}{POS} = $pos;
    $hash{$rhash{$mycount}}{RESTATS} = $rest;
    $mycount++;
}#end of file while loop
close $stats;
#check if number of lines match in old and new file
if ($mycount != $count) { warn "mismatch in counts\n"; }
#print the file out in new order
foreach my $contigs (sort { $a <=> $b } keys %hash) {
    print "$hash{$contigs}{CHR}\t$hash{$contigs}{POS}\t$hash{$contigs}{RESTATS}\n";
}
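This Perl script simply reads in the input file, stores it in a hash under the new line order and then prints it out. It takes the file to be randomised as its first argument and the rands.out file written by the R script as its second.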
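To make the reordering concrete, here is a small self-contained sketch that applies the same hash logic to data held in memory. The subroutine name reorder_rest and the toy rows are made up for illustration, and the fixed order (3, 1, 2) stands in for the contents of a rands.out file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: keep columns 1 and 2 of every row in place and move
# the remainder of row i to row $order->[i-1].
sub reorder_rest {
    my ($rows, $order) = @_;
    my %hash;
    my $mycount = 1;
    for my $line (@$rows) {
        my ($chr, $pos, $rest) = split(/[ \t]+/, $line, 3);
        $hash{$mycount}{CHR} = $chr;
        $hash{$mycount}{POS} = $pos;
        $hash{ $order->[$mycount - 1] }{RESTATS} = $rest;
        $mycount++;
    }
    return map { "$hash{$_}{CHR}\t$hash{$_}{POS}\t$hash{$_}{RESTATS}" }
           sort { $a <=> $b } keys %hash;
}

# Toy input: three tab-separated rows (made-up values)
my @rows  = ("chr1\t100\tA", "chr1\t200\tB", "chr2\t300\tC");
my @order = (3, 1, 2);    # stand-in for the lines of rands.out
print "$_\n" for reorder_rest(\@rows, \@order);
```

With this order the rest of row 1 ("A") moves to row 3, so the printed rows are chr1 100 B, chr1 200 C and chr2 300 A, while the first two columns stay where they were.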