TL;DR

Here we are with TASK #1 from the Perl Weekly Challenge #110. Enjoy!

# The challenge

You are given a text file. Write a script to display all valid phone numbers in the given text file.

Acceptable Phone Number Formats:

+nn  nnnnnnnnnn
(nn) nnnnnnnnnn
nnnn nnnnnnnnnn


Example, input file:

0044 1148820341
+44 1148820341
44-11-4882-0341
(44) 1148820341
00 1148820341


Example, output:

0044 1148820341
+44 1148820341
(44) 1148820341


# The questions

There’s some… induction required in this challenge, especially for what the input is supposed to look like with respect to spaces:

• do we tolerate leading/trailing spaces? From the templates it seems not, although the examples seem to imply a different story (the +44 row is a pass);
• do we insist on the exact spacing between the first and the second part? I mean, the +nn template seems to require two spaces before the rest, but the passing example with +44 has only one (having moved the other one before the +44);
• should we stick to plain spaces, or does any spacing do?

We’ll take the examples into account… and consider any spacing acceptable.
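
To make that choice concrete, here is a minimal sketch of what "any spacing" means in practice. The helper names `trim` and `parts` and the sample lines are assumptions made up for illustration, not part of the solution below. Once a line is trimmed and split on `\s+`, single spaces, tabs and extra padding all collapse to the same two parts:

```perl
use strict;
use warnings;

# Hypothetical helper: strip any leading/trailing whitespace.
# The /r modifier (Perl 5.14+) returns the modified copy.
sub trim { return $_[0] =~ s{\A\s+|\s+\z}{}gr }

# Hypothetical helper: split a trimmed line on runs of whitespace.
sub parts { return split /\s+/, trim($_[0]) }

# Made-up sample lines: leading spaces, a tab, doubled/trailing spaces.
for my $line ("  +44 1148820341\n", "+44\t1148820341", "+44  1148820341 ") {
   print join('|', parts($line)), "\n";    # prefix and number, '|'-separated
}
```

All three lines end up as the same prefix/number pair, which is the behavior the rest of the post relies on.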

# The solution

The task is about checking a file, so there are two halves: going through the lines of the file, and checking each line.

Going top-down, we first have to make sure to go through all the lines in the file. Is it a real file? Something different? We will accept anything that can act as a file:

sub valid_phone_numbers ($f) {
   $f = ref($f)     ? $f
      : ($f eq '-') ? \*STDIN
      :               do { open my $h, '<', $f or die "$!\n"; $h };
   is_phone_number_acceptable(s{\A\s+|\s+\z}{}rgmxs) && print while <$f>;
}


The input can be a filename (interpreting - as "take standard input", as often happens) or a filehandle. Whatever the case, we need a filehandle, so we make sure that $f holds one eventually. Then we iterate through the file, trimming each line before passing it to is_phone_number_acceptable and printing it if it complies. Thanks to the /r modifier the trimming works on a copy, so the line is printed exactly as it was read.

The fact that we also accept filehandles makes it easy to code a default case where we feed the challenge example as input:

my $f = shift // do {
   my $input = <<'END';
0044 1148820341
+44 1148820341
44-11-4882-0341
(44) 1148820341
00 1148820341
END
   open my $fh, '<', \$input;
   $fh;
};

valid_phone_numbers($f);

OK, let’s move on to the phone number check function:

sub is_phone_number_acceptable ($n) {
   scalar(
      $n =~ m{
         \A
         (?:
               \+\d\d    # +nn
            |  \(\d\d\)  # (nn)
            |  \d{4}     # nnnn
         )
         \s+
         \d{10}          # nnnnnnnnnn
         \z
      }mxs
   );
}

This overly verbose regular expression takes advantage of Perl’s /x modifier, which allows organizing complex expressions with whitespace and comments. The check itself demands that there are no leading or trailing spaces; it just seemed better to have a more precise test, and remove them before calling the function.

There is a first non-capturing group that addresses the first part; here we have three possible alternatives for the prefix:

• one plus sign, followed by two digits,
• or one opening round parenthesis, two digits, one closing round parenthesis,
• or exactly four digits.

Then, after one or more whitespace characters, we have exactly ten digits.

Using a non-capturing group here is a small performance improvement, but also a hint to the next programmer that we’re not really interested in capturing anything, just in making sure that the alternatives are grouped together.

I’m not sure why I felt the urge to wrap the whole thing with scalar; again, my default here was probably to make sure that this function behaves like a boolean (i.e. scalar) test, whatever the way it is called. Call me paranoid.

You might be interested in the whole program, so here it is:

#!/usr/bin/env perl
use 5.024;
use warnings;
use experimental qw< postderef signatures >;
no warnings qw< experimental::postderef experimental::signatures >;

sub is_phone_number_acceptable ($n) {
   scalar(
      $n =~ m{
         \A
         (?:
               \+\d\d    # +nn
            |  \(\d\d\)  # (nn)
            |  \d{4}     # nnnn
         )
         \s+
         \d{10}          # nnnnnnnnnn
         \z
      }mxs
   );
}

sub valid_phone_numbers ($f) {
   $f = ref($f)     ? $f
      : ($f eq '-') ? \*STDIN
      :               do { open my $h, '<', $f or die "$!\n"; $h };
   is_phone_number_acceptable(s{\A\s+|\s+\z}{}rgmxs) && print while <$f>;
}

my $f = shift // do {
   my $input = <<'END';
0044 1148820341
+44 1148820341
44-11-4882-0341
(44) 1148820341
00 1148820341
END
   open my $fh, '<', \$input;
   $fh;
};

valid_phone_numbers($f);
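
One way to gain some confidence in both halves is to run them against the challenge example. What follows is only a sketch under a few assumptions: it restates the regular expression without signatures, it introduces a hypothetical `valid_phone_numbers_in` variant that returns the trimmed matching lines instead of printing them (so that the result can be compared), and it deliberately puts a leading space before the +44 row to exercise the trimming. The input comes from an in-memory filehandle, opened on a reference to a scalar.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Test::More;

# Same check as in the post, restated so that this sketch is
# self-contained.
sub is_phone_number_acceptable {
   my ($n) = @_;
   return scalar(
      $n =~ m{
         \A
         (?: \+\d\d | \(\d\d\) | \d{4} )   # +nn, (nn) or nnnn
         \s+
         \d{10}                            # nnnnnnnnnn
         \z
      }mxs
   );
}

# Hypothetical variant (not in the original program): collect the
# trimmed, acceptable lines from a filehandle instead of printing them.
sub valid_phone_numbers_in {
   my ($fh) = @_;
   my @valid;
   while (my $line = <$fh>) {
      my $trimmed = $line =~ s{\A\s+|\s+\z}{}gr;
      push @valid, $trimmed if is_phone_number_acceptable($trimmed);
   }
   return @valid;
}

# Challenge example; the leading space on the +44 row is an assumption
# added here on purpose.
my $input = join "\n",
   '0044 1148820341', ' +44 1148820341', '44-11-4882-0341',
   '(44) 1148820341', '00 1148820341', '';
open my $fh, '<', \$input or die "$!\n";

is_deeply(
   [valid_phone_numbers_in($fh)],
   ['0044 1148820341', '+44 1148820341', '(44) 1148820341'],
   'challenge example yields the three expected numbers',
);
ok(!is_phone_number_acceptable(' +44 1148820341'),
   'the check itself rejects untrimmed input');

done_testing();
```

Returning a list rather than printing is just a convenience for testing; the printing version in the program above behaves the same way with respect to which lines pass.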


Have fun and… stay safe!