# ETOOBUSY ðŸš€ minimal blogging for the impatient

# PWC097 - Binary Substrings

**TL;DR**

On with TASK #2 from the Perl Weekly Challenge #097. Enjoy!

# The challenge

You are given a binary string

`$B`

and an integer`$S`

. Write a script to split the binary string`$B`

of size`$S`

and then find the minimum number of flips required to make it all the same.

# The questions

I guess the challenge and the examples are quite clear, although thereâ€™s
a lot of input validation that can go wrong! Iâ€™m assuming that raising
an exception (read: `die`

) is fine if inputs are not good.

# The solution

I donâ€™t usually include validation of inputs but this challenge was giving me too many itches:

```
sub binary_substrings ($B, $S) {
die "invalid input string <$B>" unless $B =~ m{\A [01]* \z}mxs;
my $len = length $B or return 0;
die "$S is not a factor of the length of <$B>\n" if $len % $S;
my @parts = map { substr $B, $_ * $S, $S } 0 .. ($len / $S) - 1;
```

We check that `$B`

is *indeed* composed of only `0`

and `1`

characters,
and also that `$S`

evenly divides `$B`

so that all substrings are of the
same length. After the checks, we can divide the input string into parts
and move on.

To address the actual challenge, we can reason like this. Each substring
might be at the *center*, i.e. be the right candidate to be the target
to which every other substring should be changed to.

Hence, our task is to find which of these substrings is actually the best one, i.e. the one that minimizes the amount of bit flipping for all other substrings.

This means that:

- we have to evaluate the
*distance*(i.e. bit flipping) between any two substrings. This works in both ways, of course, because the bit flipping between two strings only is just the number of positions where they have a different character; - we have to count the total amount of bit flipping considering each
substring as the candidate to be the
*center*; - last, we have to get the minimum amount of bit flipping from each candidate.

Piece of cake!

```
my @total_distances;
for my $i (0 .. $#parts) {
for my $j ($i + 1 .. $#parts) {
my $d = 0;
for my $k (0 .. $S - 1) {
$d++ if substr($parts[$i], $k, 1) ne substr($parts[$j], $k, 1);
}
($total_distances[$_] //= 0) += $d for ($i, $j);
}
}
return min @total_distances;
```

Array `@total_distances`

records the amount of bit-flipping when
considering each item in `@parts`

as the right candidate to be at the
*center*. As such, it will eventually hold an integer for each
sub-sequence, indexed by the same index of the sub-sequence inside
`@parts`

.

Distances are symmetric, i.e. the distance between sub-sequence `1`

and
sub-sequence `2`

is the same if we invert the two. Hence, we donâ€™t have
to do the whole `$S x $S`

calculation, but we can just calculate the
distances for different pairs once. This accounts for the two nested
loops with index `$i`

ranging from `0`

to `$#parts - 1`

, and `$j`

ranging from `$i + 1`

to `$#parts`

. Note that the distance of a
sub-sequence from itself isâ€¦ `0`

, so we donâ€™t need to consider it in
our calculation.

Computing the distance is another loop inside the two above: we iterate
over all characters in the two strings, comparing them and increasing
the counter variable `$d`

as we find differences.

The output of this loop gives us the distance between sub-sequence at
index `$i`

and sub-sequence at index `$j`

, so we proceed to increase
both entries inside `@total_distances`

.

As anticipated, the answer to our problem is the minimum value across
all those that are accumulated in `@total_distances`

, so we leverage the
`min`

function from List::Util.

As usualâ€¦ the whole code, should you be interested into it:

```
#!/usr/bin/env perl
use 5.024;
use warnings;
use experimental qw< postderef signatures >;
no warnings qw< experimental::postderef experimental::signatures >;
use List::Util 'min';
sub binary_substrings ($B, $S) {
die "invalid input string <$B>" unless $B =~ m{\A [01]* \z}mxs;
my $len = length $B or return 0;
die "$S is not a factor of the length of <$B>\n" if $len % $S;
my @parts = map { substr $B, $_ * $S, $S } 0 .. ($len / $S) - 1;
my @total_distances;
for my $i (0 .. $#parts - 1) {
for my $j ($i + 1 .. $#parts) {
my $d = 0;
for my $k (0 .. $S - 1) {
$d++ if substr($parts[$i], $k, 1) ne substr($parts[$j], $k, 1);
}
($total_distances[$_] //= 0) += $d for ($i, $j);
}
}
return min @total_distances;
}
my $binary_string = shift // '101100101';
my $substring_length = shift // 3;
say binary_substrings($binary_string, $substring_length);
```

Stay safe!!!

*Comments? Octodon, Twitter, GitHub, Reddit, or drop me a line!*