ETOOBUSY 🚀 minimal blogging for the impatient
PWC110 - Valid Phone Numbers
TL;DR
Here we are with TASK #1 from the Perl Weekly Challenge #110. Enjoy!
The challenge
You are given a text file. Write a script to display all valid phone numbers in the given text file.
Acceptable Phone Number Formats:
+nn  nnnnnnnnnn
(nn) nnnnnnnnnn
nnnn nnnnnnnnnn
Example, input file:
0044 1148820341
 +44 1148820341
  44-11-4882-0341
(44) 1148820341
  00 1148820341
Example, output:
0044 1148820341
 +44 1148820341
(44) 1148820341
The questions
There’s some… induction required in this challenge, especially for what the input is supposed to look like with respect to spaces:
- do we tolerate leading/trailing spaces? From the templates it seems not, although the examples seem to imply a different story (the +44 row is a pass);
- do we insist on the exact spacing between the first and the second part? I mean, the +nn template seems to require two spaces before the rest, but the passing example with +44 has only one (having moved the other one before the +44);
- should we stick to plain spaces, or does any spacing do?
We’ll take the examples into account… and consider any spacing acceptable.
The solution
The task is about checking a file, so there are two halves.
Going top-down, we first have to make sure to go through all the lines in the file. Is it a real file? Something different? We will accept anything that can act as a file:
sub valid_phone_numbers ($f) {
   $f = ref($f)      ? $f
      : ($f eq '-')  ? \*STDIN
      :                do { open my $h, '<', $f or die "$!\n"; $h };
   is_phone_number_acceptable(s{\A\s+|\s+\z}{}rgmxs) && print while <$f>;
}
The input can be a filename (interpreting - as "take standard input", as often happens) or a filehandle. Whatever the case, we need a filehandle, so we make sure that $f holds one eventually.
Then we iterate through the file, trimming each line before passing it to the check is_phone_number_acceptable and printing the line if it complies.
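One detail that is easy to miss in the one-liner: the /r modifier makes the substitution return a trimmed copy and leaves $_ alone, so the check sees the trimmed line while print emits the original one, spacing and newline included. Here is a minimal sketch of that behavior; the trimmed helper is a hypothetical name introduced only for illustration, it is not part of the solution:

```perl
use 5.024;
use warnings;

# Hypothetical helper: same substitution as in the solution,
# wrapped in a sub so that it is easy to play with.
sub trimmed {
   my ($line) = @_;
   return $line =~ s{\A\s+|\s+\z}{}rgmxs;
}

my $line = " +44 1148820341\n";
my $copy = trimmed($line);

say "checked: [$copy]";    # leading space and newline are gone
say 'original length: ', length($line),
   ', trimmed length: ', length($copy);
```

Without /r the substitution would modify the line in place, and the output would lose the alignment shown in the challenge example.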
The fact that we also accept filehandles makes it easy to code a default case where we feed the challenge example as input:
my $f = shift // do {
   my $input = <<'END';
0044 1148820341
+44 1148820341
44-11-4882-0341
(44) 1148820341
00 1148820341
END
   open my $fh, '<', \$input;
   $fh;
};
valid_phone_numbers($f);
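If you want to play with the "anything that can act as a file" idea on its own, the following sketch isolates that step. The name as_filehandle and the sample text are assumptions of mine for the sake of the example; the logic mirrors the chained ternary in valid_phone_numbers:

```perl
use 5.024;
use warnings;

# Hypothetical helper isolating the "make sure we have a filehandle" step.
sub as_filehandle {
   my ($f) = @_;
   return $f      if ref $f;       # already a handle (or a glob reference)
   return \*STDIN if $f eq '-';    # conventional alias for standard input
   open my $h, '<', $f or die "$f: $!\n";
   return $h;
}

# An in-memory "file": open() accepts a reference to a scalar.
my $text = "first\nsecond\n";
open my $mem, '<', \$text or die "$!\n";

my $fh    = as_filehandle($mem);
my @lines = <$fh>;
say scalar(@lines), ' lines read';
```

The in-memory handle is what lets the default case work without touching the filesystem.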
OK, let’s move on to the phone number check function:
sub is_phone_number_acceptable ($n) {
   scalar(
      $n =~ m{
         \A
         (?:
              \+\d\d     # +nn
            | \(\d\d\)   # (nn)
            | \d{4}      # nnnn
         )
         \s+
         \d{10}          # nnnnnnnnnn
         \z
      }mxs
   );
}
This overly-verbose regular expression takes advantage of Perl’s /x modifier, which allows organizing complex expressions with comments.
The check itself demands that there are no leading or trailing spaces; it just seemed better to have a more precise test and remove them before calling the function.
There is a first non-capturing group that addresses the first part; here we have three possible alternatives for the prefix:
- one plus sign, followed by two digits,
- or one opening round parenthesis, two digits, one closing round parenthesis,
- or exactly four digits.
Then, after one or more spaces, we have exactly ten digits.
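To see the three alternatives at work, here is a small sketch that runs the same pattern over the rows from the challenge, plus two extra rows of my own (one with a tab, one left untrimmed). The variable names are just for illustration:

```perl
use 5.024;
use warnings;

# Same pattern as in the check function, precompiled with qr for reuse.
my $valid = qr{
   \A
   (?:
        \+\d\d     # +nn
      | \(\d\d\)   # (nn)
      | \d{4}      # nnnn
   )
   \s+
   \d{10}          # nnnnnnnnnn
   \z
}mxs;

# Candidates are already trimmed, as the check expects.
my @candidates = (
   '0044 1148820341',
   '+44 1148820341',
   '44-11-4882-0341',
   '(44) 1148820341',
   '00 1148820341',
   "+44\t1148820341",    # extra: a tab counts as spacing too
   ' +44 1148820341',    # extra: untrimmed input
);

my @verdicts = map { ($_ =~ $valid) ? 'pass' : 'fail' } @candidates;
printf "%-4s [%s]\n", $verdicts[$_], $candidates[$_] for 0 .. $#candidates;
```

The three rows from the expected output pass, the tab-separated one passes as well because \s matches any whitespace, and the untrimmed one fails, which is why the trimming happens before the check.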
Using a non-capturing group here is a small performance improvement, but also a hint to the next programmer that we’re not really interested in capturing anything, just in making sure that the alternatives are grouped together.
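The difference is easy to observe. In this sketch (compact one-line patterns and variable names are mine), the capturing variant hands back the prefix, while the non-capturing one only reports success:

```perl
use 5.024;
use warnings;

my $number = '(44) 1148820341';

# Capturing group: in list context the match returns the captured prefix.
my ($prefix) = $number =~ m{\A (\+\d\d | \(\d\d\) | \d{4}) \s+ \d{10} \z}x;

# Non-capturing group: same match, nothing captured,
# so in list context we just get the list (1) back.
my @got = $number =~ m{\A (?: \+\d\d | \(\d\d\) | \d{4}) \s+ \d{10} \z}x;

say "captured prefix: $prefix";
say "non-capturing result: @got";
```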
I’m not sure why I felt the urge to wrap the whole thing with scalar; my default here was probably to make sure that this function behaves like a boolean (i.e. scalar) test, whatever the way it is called. Call me paranoid.
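For what it’s worth, the paranoia does buy something. A match without capture groups returns an empty list when it fails in list context, so a bare match as the last statement of a sub can make a value disappear from the caller's list. A minimal sketch, with two hypothetical subs and a simplified pattern:

```perl
use 5.024;
use warnings;

# Two hypothetical variants of a check, differing only in scalar().
sub with_scalar    { my ($n) = @_; scalar($n =~ m{\A\d+\z}) }
sub without_scalar { my ($n) = @_; $n =~ m{\A\d+\z} }

# Called in list context on a failing input:
my @forced = with_scalar('nope');       # one (false) element
my @plain  = without_scalar('nope');    # no elements at all

say 'with scalar:    ', scalar(@forced), ' element(s)';
say 'without scalar: ', scalar(@plain),  ' element(s)';
```

This matters, for example, when the result is used directly inside a hash or argument list, where a missing element would shift everything after it.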
You might be interested in the whole program, so here it is:
#!/usr/bin/env perl
use 5.024;
use warnings;
use experimental qw< postderef signatures >;
no warnings qw< experimental::postderef experimental::signatures >;

sub is_phone_number_acceptable ($n) {
   scalar(
      $n =~ m{
         \A
         (?:
              \+\d\d     # +nn
            | \(\d\d\)   # (nn)
            | \d{4}      # nnnn
         )
         \s+
         \d{10}          # nnnnnnnnnn
         \z
      }mxs
   );
}

sub valid_phone_numbers ($f) {
   $f = ref($f)      ? $f
      : ($f eq '-')  ? \*STDIN
      :                do { open my $h, '<', $f or die "$!\n"; $h };
   is_phone_number_acceptable(s{\A\s+|\s+\z}{}rgmxs) && print while <$f>;
}

my $f = shift // do {
   my $input = <<'END';
0044 1148820341
+44 1148820341
44-11-4882-0341
(44) 1148820341
00 1148820341
END
   open my $fh, '<', \$input;
   $fh;
};
valid_phone_numbers($f);
Have fun and… stay safe!