Wikipedia:Wikipedia Signpost/2023-02-04/From the editor - Wikipedia


New for The Signpost: Author pages, tag pages, and a decent article search function


Just two short weeks ago, you were promised the world by a heady editor with a gleam in his eye. Well okay, you were promised "something". I'm pleased to announce a couple of somethings (this issue going out late is not one of them). The chief developments we've gotten out of the last couple of weeks have been a usable article search function, individual byline pages, and tag series pages, made possible by Module:Signpost and Wegweiser (with some bug fixes on the latter made possible by the heroism of Mr. Stradivarius).

Byline pages

Everyone else has these, and now so do we. For example, Wikipedia:Wikipedia Signpost/Author/Michael Snow will bring you an automatically generated index of every article from our first editor-in-chief, Wikipedia:Wikipedia Signpost/Author/Smallbones will bring you a wealth of hard investigative reporting, and Wikipedia:Wikipedia Signpost/Author/JPxG will bring you a bunch of articles written by JPxG. An index of all individual byline pages is here, and a sortable table of all authors in the history of the Signpost can be found here.

While there are far too many individual authors to reasonably make byline pages for everyone (which would be nearly a thousand pages), limiting it to authors with at least ten article credits brought it to a more manageable hundred-and-some. There was intense effort involved in getting WegweiserBot to parse all of the author data, and then going through it to eliminate the weird errors from the initial run. Since we've gone 18 years without a systematic effort to clean up the metadata, there were plenty of author fields with weird stuff in them like "3 July 2006", "03 July 2006", or "{{{2}}}". Also, we had stuff like "brassratgirl" versus "Brassratgirl", or "Andreas Kolbe" versus "Andreas Kolbe (leads" versus "Andreas Kolbe 1 April 2016 19:58 (UTC)" versus "Andreas Kolbe 19 March 2016 21:12 (UTC)".
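To give a feel for that kind of cleanup, here is a minimal Python sketch. It is not WegweiserBot's actual code: the function name `clean_byline` and the three patterns are assumptions built only from the malformed examples quoted above, and merging "brassratgirl" into "Brassratgirl" assumes both spellings are the same account (MediaWiki capitalizes the first letter of usernames).

```python
import re

# Assumed patterns, derived only from the malformed examples quoted above.
TIMESTAMP = re.compile(r"\s*\d{1,2} \w+ \d{4} \d{2}:\d{2} \(UTC\)\s*$")
DATE_ONLY = re.compile(r"^\d{1,2} \w+ \d{4}$")
TEMPLATE_PARAM = re.compile(r"^\{\{\{.*\}\}\}$")
UNCLOSED_PAREN = re.compile(r"\s*\([^)]*$")


def clean_byline(raw):
    """Return a cleaned author name, or None if the field holds no name."""
    name = raw.strip()
    # Fields that are only a date or an unfilled template parameter.
    if DATE_ONLY.match(name) or TEMPLATE_PARAM.match(name):
        return None
    # Drop a trailing signature timestamp such as "1 April 2016 19:58 (UTC)".
    name = TIMESTAMP.sub("", name)
    # Drop an unclosed trailing parenthetical such as "(leads".
    name = UNCLOSED_PAREN.sub("", name)
    if not name:
        return None
    # Capitalize the first letter only, leaving the rest untouched.
    return name[0].upper() + name[1:]
```

Run over every author field, a function like this would collapse the four "Andreas Kolbe" variants into one name and flag the date-only fields for manual review.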

After spelunking into the depths and cleaning up all that garbage, my conclusion is that there have been 926 distinct authors in the Signpost's history. Of these, all 926 (duh) have written at least one article, but only 415 have written two or more, and only 122 have written ten or more. The distribution looks something like this:

This many people have written at least this many articles:

Authors   Articles (at least)
0         500
1         450
1         400
2         350
3         300
3         250
4         200
7         150
18        100
34        50
41        40
54        30
72        20
90        15
122       10
192       5
223       4
289       3
415       2
926       1

Man, wouldn't that be a neat graph? Too bad I don't feel like making one.
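For anyone who does feel like making one, a rough text-only version takes a few lines of Python. This is just a sketch: the numbers are copied from the table above, while the function name `render_chart` and the 50-character bar width are arbitrary choices made for illustration.

```python
# (minimum articles, number of authors) pairs copied from the table above.
DISTRIBUTION = [
    (1, 926), (2, 415), (3, 289), (4, 223), (5, 192), (10, 122),
    (15, 90), (20, 72), (30, 54), (40, 41), (50, 34), (100, 18),
    (150, 7), (200, 4), (250, 3), (300, 3), (350, 2), (400, 1),
    (450, 1), (500, 0),
]


def render_chart(rows, width=50):
    """Return one text bar per row, scaled so the largest count fills `width`."""
    peak = max(count for _, count in rows)
    lines = []
    for threshold, count in rows:
        bar = "#" * round(count * width / peak)
        lines.append(f">= {threshold:>3} articles | {bar} {count}")
    return lines


if __name__ == "__main__":
    print("\n".join(render_chart(DISTRIBUTION)))
```

Because the counts fall off so quickly, most of the bars past ten articles are only a few characters long; a logarithmic scale would probably show the long tail better.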

Tag series pages

This issue is a bit more complicated to deal with, and as a result its solution is a bit less complete. But if you go to Wikipedia:Wikipedia Signpost/Series you can see a number of distinct series of articles. While these lists have been around for a while, they existed as embedded sidebar templates with inconsistent naming schemes, scattered randomly through the PrefixIndex of more than 80,000 pages, with a variety of formatting styles that made updating impractical. Now they use Module:Signpost data to automatically fetch articles from the indices that have been tagged with SignpostTagger, and display differently depending on where they're used (they still give a sidebar if transcluded in a Signpost article, and now present a readable list of articles if you go directly to Wikipedia:Wikipedia Signpost/Series/Paid editing).

In addition to the existing pages for each existing series, I came up with a few templates to auto-generate pages for a few dozen of the most-used based on Wegweiser's comprehensive tag analysis. These are on the series page as well. For example, if you want a list of every arbitration report, you can go to Wikipedia:Wikipedia Signpost/Tag/arbitrationreport.

Like with the authors, the tags follow a certain distribution: while there are 519 tags in the module indices (props to Chris Troutman for much good work in the archives), only 357 of them appear on more than one article, and just 109 are used on more than 20. Currently there are 64 auto-generated tag pages, covering every tag with more than 39 uses; but we can always make more later – who knows.

Searching

There is a new template at Template:Signpost/Search that allows you to easily search Signpost articles. Previously, finding stuff in back issues was difficult; even if a search was restricted with a prefix, it would return stuff from anywhere beginning with Wikipedia:Wikipedia Signpost, including newsroom archives, drafts, documentation pages, and old submissions. With a little magic (and a little hideous regex), though, it is now possible to search only in actual articles. Give it a shot.
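The template's actual regex is not reproduced here, but the general idea can be sketched in Python. The pattern below is an assumption: it treats a page as an article only if its title has the form "Wikipedia:Wikipedia Signpost/YYYY-MM-DD/Title", which is how this piece itself is titled, and the names `ARTICLE_TITLE` and `only_articles` are made up for the example.

```python
import re

# Assumed title shape for published articles: prefix, issue date, one title segment.
ARTICLE_TITLE = re.compile(
    r"^Wikipedia:Wikipedia Signpost/\d{4}-\d{2}-\d{2}/[^/]+$"
)


def only_articles(titles):
    """Keep the titles that look like published Signpost articles."""
    return [title for title in titles if ARTICLE_TITLE.match(title)]


if __name__ == "__main__":
    sample = [
        "Wikipedia:Wikipedia Signpost/2023-02-04/From the editor",
        "Wikipedia:Wikipedia Signpost/Author/Smallbones",
        "Wikipedia:Wikipedia Signpost/Series/Paid editing",
        "Wikipedia:Wikipedia Signpost/Tag/arbitrationreport",
    ]
    # Only the first title has a date segment followed by a single title segment.
    print(only_articles(sample))
```

A filter this simple would wrongly drop any article whose title contains a slash, so the real thing presumably has to be more careful (hence, perhaps, "hideous").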

Miscellaneous spelunking

Various other random back-office issues have been addressed, as documented in the annals of Wikipedia talk:Wikipedia_Signpost/Technical. These include a heroic effort by Jonesey95 to fix several thousand linter errors and missing </div>s, and the unearthing of yet more bizarre stuff. For instance, there are a number of phantom articles, like this one, which was written in 2011, never published, and never actually linked to from a Signpost issue until January 2023. And two redlinked articles in the indices for 2006 and 2013 were retroactively created in 2016 as a prank, then deleted in 2017. Huh?

Anyway, if you are interested in monotonous tasks very few people care about – which you are, let's be real, we are Wikipedia editors – there are about five hundred articles from 2015 to 2020 that need to be given tags with SPT (SignpostTagger), and any help is appreciated.
