# ETOOBUSY đźš€ minimal blogging for the impatient

# PWC207 - H-Index

**TL;DR**

On with TASK #2 from The Weekly Challenge #207. Enjoy!

# The challenge

You are given an array of integers containing citations a researcher has received for each paper.

Write a script to compute the researcherâ€™s

`H-Index`

. For more information please checkout the wikipedia page.The H-Index is the largest number h such that h articles have at least h citations each. For example, if an author has five publications, with 9, 7, 6, 2, and 1 citations (ordered from greatest to least), then the authorâ€™s h-index is 3, because the author has three publications with 3 or more citations. However, the author does not have four publications with 4 or more citations.

Example 1`Input: @citations = (10,8,5,4,3) Output: 4 Because the 4th publication has 4 citations and the 5th has only 3.`

Example 2`Input: @citations = (25,8,5,3,3) Output: 3 The H-Index is 3 because the fourth paper has only 3 citations.`

# The questions

No questions asked, as the domain makes it pretty clear that weâ€™re talking â€śhumanâ€ť numbers, both in terms of size of the input array, as well as each individual count of citations.

# The solution

The text/definition steals us the joy of coming up with a solution, because
it hints about having the array sorteded in *descending* order.

In fact, the key is that we have to compare a *count* of elements in a
subset with the values in the subsets. If we start with an array sorted like
above, as we move on we progressively include articles with *less*
citations, while at the same time increasing the size of the subset. At this
point, itâ€™s just a matter of meeting in the middle.

â€¦ and, of course, a viable solution will exist to include all top-cited articlesâ€¦

Letâ€™s go Perl first:

```
#!/usr/bin/env perl
use v5.24;
use warnings;
use experimental 'signatures';
say h_index(@ARGV);
sub h_index (@citations) {
@citations = reverse sort { $a <=> $b } grep { $_ } @citations;
$_ < $citations[$_] || return $_ for 0 .. $#citations;
return scalar(@citations);
}
```

Weâ€™re removing articles with *no* citations at all, because theyâ€™re pretty
much useless to calculate the H-Index. Then we sort and reverse, to get our
descending list of counts.

Then we iterate through the whole array. If we find a crossing point, then
we can return; otherwise, *every* article is part of the H-Index, so we just
return the size of the subset of articles with at least one citation.

Raku goes pretty much the same way, making the first preparatory part a bit more readable for many westerners (at least those who are used to reading left-to-right):

```
#!/usr/bin/env raku
use v6;
sub MAIN (*@args) { put h-index(@args) }
sub h-index (@citations) {
@citations = @citationsÂ».Int.grep({.so}).sort.reverse;
$_ < @citations[$_] || return $_ for ^@citations;
return @citations.elems;
}
```

Stay safe folks!

*Comments? Octodon, Twitter, GitHub, Reddit, or drop me a line!*