Add a base URL to xlinx

TL;DR

Adding a base URL to relative URLs in xlinx.

Some time ago I posted about a script to Extract links/images from files or URLs and closed with these words:

[…] it does not pre-pend a base URL in case of relative URLs.

Of course, I needed the script and of course I needed absolute URLs.

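The robust thing would be to look at what LWP does and replicate it. To be honest, I’m not really in the mood, so I adopted a different approach instead: when the input is a URL, the base is the URL of the request itself; when the input is a file, the base comes from the XLINX_BASE environment variable, if it is set.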

#!/usr/bin/env perl
use strict;
use warnings;
use feature 'say';
use Mojo::DOM;
use Mojo::File;
use Mojo::URL;
use Mojo::UserAgent;
my $ua = Mojo::UserAgent->new->max_redirects(10);
for my $input (@ARGV) {
    my ($dom, $base);
    if ($input =~ m{\A https?:// }imxs) {
        my $tx = $ua->get($input);
        $base = $tx->req->url;
        $dom = $tx->result->dom;
    }
    else {
        $dom = Mojo::DOM->new(Mojo::File->new($input)->slurp);
        # For local files, the base (if any) comes from the environment
        $base = Mojo::URL->new($ENV{XLINX_BASE}) if defined $ENV{XLINX_BASE};
    }
    $dom->find('a[href],img[src]')->each(
        sub {
            my $l = $_[0]->attr(lc($_[0]->tag) eq 'a' ? 'href' : 'src');
            say $base ? Mojo::URL->new($l)->to_abs($base)->to_string : $l;
        }
    );
}
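
To see what the to_abs call in the script does with different kinds of links, here is a minimal sketch. The helper name absolutize and the example.com/example.org URLs are made up for illustration and are not part of xlinx; the only thing borrowed from the script is the Mojo::URL->new(...)->to_abs(...) idiom.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use feature 'say';
use Mojo::URL;

# Hypothetical helper (not part of xlinx): resolve $link against $base,
# or return it untouched when no base is available.
sub absolutize {
    my ($link, $base) = @_;
    return $link unless defined $base;
    return Mojo::URL->new($link)->to_abs(Mojo::URL->new($base))->to_string;
}

# Made-up base URL, for illustration only
my $base = 'https://example.com/blog/2021/post.html';

say absolutize($_, $base)
    for 'other.html', '/about', '../img/pic.png', 'https://example.org/x';

# https://example.com/blog/2021/other.html
# https://example.com/about
# https://example.com/blog/img/pic.png
# https://example.org/x
```

Links that are already absolute pass through unchanged, so it should be safe to apply the same treatment to every link whenever a base is known.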

Enough for today, cheers and stay safe!


Comments? Octodon, GitHub, Reddit, or drop me a line!