TL;DR
Adding a base to relative URLs in xlinx.
Some time ago I posted about a script to Extract links/images from
files or URLs and left with these words:
[…] it does not pre-pend a base URL in case of relative URLs.
Of course, I needed the script and of course I needed absolute URLs.
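The robust thing would be to look at what LWP does and replicate
it. To be honest, I’m not really in the mood, so I adopted a different
approach instead: the script leans on Mojo::URL to resolve each link.
For a remote page the base is the URL of the request; for a local file
it comes from the XLINX_BASE environment variable, if set.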
#!/usr/bin/env perl
use strict;
use warnings;
use feature 'say';

use Mojo::DOM;
use Mojo::File;
use Mojo::URL;
use Mojo::UserAgent;

my $ua = Mojo::UserAgent->new->max_redirects(10);
for my $input (@ARGV) {
   my ($dom, $base);
   if ($input =~ m{\A https?:// }imxs) {
      # Remote input: fetch it and use the request URL as the base
      my $tx = $ua->get($input);
      $base = $tx->req->url;
      $dom  = $tx->result->dom;
   }
   else {
      # Local file: the base, if any, comes from the environment
      $dom  = Mojo::DOM->new(Mojo::File->new($input)->slurp);
      $base = $ENV{XLINX_BASE} // undef;
      $base = Mojo::URL->new($base) if defined $base;
   }

   # Print every link/image target, made absolute when a base is available
   $dom->find('a[href],img[src]')->each(
      sub {
         my $l = $_[0]->attr(lc($_[0]->tag) eq 'a' ? 'href' : 'src');
         say $base ? Mojo::URL->new($l)->to_abs($base)->to_string : $l;
      }
   );
}
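To see what the resolution step does without touching the network, here is a minimal sketch that runs the same find/to_abs combination on an inline snippet. The base URL and the HTML are made up for illustration; only the Mojo::DOM and Mojo::URL calls mirror the script above.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use feature 'say';

use Mojo::DOM;
use Mojo::URL;

# Hypothetical base and HTML, for illustration only
my $base = Mojo::URL->new('https://example.com/blog/post.html');
my $html = <<'END';
<a href="images/logo.png">relative to the page</a>
<a href="/about">relative to the site root</a>
<img src="https://other.example/pic.jpg">
END

# Collect href/src values and resolve each one against the base
my @links = Mojo::DOM->new($html)->find('a[href],img[src]')->map(
   sub {
      my $attr = lc($_->tag) eq 'a' ? 'href' : 'src';
      Mojo::URL->new($_->attr($attr))->to_abs($base)->to_string;
   }
)->each;

say for @links;
# https://example.com/blog/images/logo.png
# https://example.com/about
# https://other.example/pic.jpg
```

Links that are already absolute pass through unchanged, so it is safe to apply to_abs to everything. With the script itself, and assuming it is installed as xlinx, the equivalent for a local file would be something like XLINX_BASE=https://example.com/blog/post.html xlinx page.html.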
Enough for today, cheers and stay safe!