[Shootout-list] Perl k-nucleotide benchmark
Joel Hoffman
hoffmanj@pacifier.com
Thu, 24 Mar 2005 20:48:31 -0800
This is a multi-part message in MIME format.
--------------060401040800090603010802
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Sorry, I meant to write from this address that is actually subscribed to
the list. Please ignore my other one (from jahoffm@pdx.edu). Here it is
again:
I'd like to contribute a simple Perl implementation of the k-nucleotide
benchmark. It could probably be substantially improved, but it seems to
work!
Joel
--------------060401040800090603010802
Content-Type: application/x-perl;
name="nucleotides.pl"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;
filename="nucleotides.pl"
#!/usr/bin/perl
# The Great Computer Language Shootout
# http://shootout.alioth.debian.org/
# k-nucleotide count benchmark
# contributed by Joel Hoffman, 2005-03-24
use strict;
my @keys;
my $sequence = "";
$/ = "\n>"; # Input Record Separator
while (defined(my $l = <>)) {
if (uc substr($l,0,5) eq 'THREE') {
chomp $l;
my @l = split /\n+/, $l;
$sequence = uc join("",grep $_ !~ /^;/, @l[1..$#l]);
last;
}
}
for my $n (1,2,3,4,6,12,18) {
for my $i (0..(length $sequence) - $n) {
$keys[$n]{substr($sequence,$i,$n)}++;
}
}
for my $i (1,2) {
printf "%s %4.2f\n",$_,(100 * $keys[$i]{$_})/(length $sequence)
for sort { $keys[$i]{$b} <=> $keys[$i]{$a} or $a cmp $b }
keys %{$keys[$i]};
print "\n";
}
printf "%d\t%s\n", $keys[length $_]{$_}, $_
for qw<GGT GGTA GGTATT GGTATTTTAATT GGTATTTTAATTTATAGT>;
--------------060401040800090603010802--