Pronunciation defects

TL;DR

I’m probably not using regular expressions in Raku the way they’re designed to work better.

Although I like my strong Perl accent while writing Raku code, it’s probably time to correct a couple of… pronunciation defects.

It all started with Advent of Code 2018 puzzle 5. Taking it as an excuse to do some more Raku exercising, I coded a solution for the first half of the day’s puzzle:

sub part1 ($inputs is copy) {
   my $changed = -1;
   while ($changed && $inputs.chars) {
      $changed = 0;
      my $current = $inputs.substr(0, 1);
      my $i = 0;
      while $i < $inputs.chars - 1 {
         my $succ = $inputs.substr($i + 1, 1);
         if ($current ne $succ && lc($current) eq lc($succ)) {
            ++$changed;
            $inputs.substr-rw($i, 2) = '';
            $current = substr($i, 1) if $i < $inputs.chars;
         }
         else {
            $current = $succ;
            ++$i;
         }
      }
   }
   return $inputs.chars;
}

This is probably a rather boring implementation that might be idiomized a lot. But with my current skills… I think the best I can do is to idiotize it, so it’s working and I call it a day.

Or do I? Certainly not!

I wondered about using a regular expression and substitution to get the job done, so I proceeded to over-engineer a solution:

sub part1_matcher () {
   my $allpairs =('a' .. 'z').map({ .lc ~ .uc, .uc ~ .lc }).flat.join('|');
   return rx{<$allpairs>};
}

sub part1_rx ($inputs is copy) {
   state $matcher = part1_matcher();
   Nil while $inputs ~~ s:g/$matcher//;
   return $inputs.chars;
}

I know, I know… it’s a one-off script, what’s my problem with computing the regular expression once and put it in a state variable? I’m a romantic.

So there I am all happy waiting for a solid performance boost, and I get this:

$ time RX=1 raku 05.raku 05.input
...
real	0m54.025s
user	0m54.008s
sys	0m0.220s

$ time RX=0 raku 05.raku 05.input
...
real	0m9.601s
user	0m9.652s
sys	0m0.192s

You’re guessing it right: the version with the regular expression takes about 6x times than the boring one!

At this point I was intrigued and wondered if it had to do with the approach, so of course I re-implemented the whole thing in Perl. Here’s the boring translation:

sub part1 ($inputs) {
   my $changed = -1;
   while ($changed && length$inputs) {
      $changed = 0;
      my $current = substr $inputs, 0, 1;
      my $i = 0;
      while ($i < length($inputs) - 1) {
         my $succ = substr $inputs, $i + 1, 1;
         if ($current ne $succ && lc($current) eq lc($succ)) {
            ++$changed;
            substr $inputs, $i, 2, '';
            $current = substr($i, 1) if $i < length $inputs;
         }
         else {
            $current = $succ;
            ++$i;
         }
      }
   }
   return length $inputs;
}

and here’s the regular-expressions based version translation:

sub part1_matcher () {
   my $allpairs = join '|',
      map { (lc($_) . uc($_), uc($_) . lc($_)) } 'a' .. 'z';
   return qr{$allpairs};
}

sub part1_rx ($inputs) {
   state $matcher = part1_matcher();
   1 while $inputs =~ s/$matcher//g;
   return length $inputs;
}

This time this is what I got back:

$ time RX=1 perl 05.pl 05.input
...
real	0m0.137s
user	0m0.108s
sys	0m0.008s

$ time RX=0 perl 05.pl 05.input
...
real	0m1.385s
user	0m1.340s
sys	0m0.024s

Now this is what I was expecting!

My (transitory?) take away is that one or more of the following apply:

  • Raku still has some way to go as long as performance is concerned (this is fair enough);
  • I can definitely improve my Raku to leverage on its strengths, instead of writing code with my strong Perl accent.

Sometimes, having a strong accent just means that it will take you much more time to be understood…

Thanks in anticipation to anybody that can help understanding what I’m doing wrong and where I can improve!

Until next time… stay safe and have -Ofun!


Comments? Octodon, , GitHub, Reddit, or drop me a line!