Wikidata:Property proposal/man page - Wikidata
Originally proposed at Wikidata:Property proposal/Computing
The majority of notable command-line tools are documented in man pages.
?item wdt:instance of (P31) wd:standard UNIX utility or command (Q18343316) currently yields 175 results. All of these are documented in man pages.
And we also have 216 instances of console application nearly all of which are also documented in man pages.
And we also have a couple of items for configuration files, e.g. hosts file (Q1149007), fstab (Q14717) and /etc/passwd (Q307510) which are also documented in man pages in section 5 (hosts.5, fstab.5, passwd.5).
So I think it would be great to be able to link these items to their man pages :)
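The instance count quoted above can be checked with a query along the following lines. This is only a sketch: the helper names `build_count_query` and `build_request_url` are illustrative and not part of the proposal, the code only builds the request URL for the public Wikidata Query Service endpoint without sending it, and the number returned today will likely differ from the figure given at the time of the proposal.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_count_query(class_qid: str) -> str:
    """Return a SPARQL query counting direct instances (P31) of a class."""
    return (
        "SELECT (COUNT(?item) AS ?count) WHERE { "
        f"?item wdt:P31 wd:{class_qid} . "
        "}"
    )


def build_request_url(query: str) -> str:
    """Return a GET URL that asks the endpoint for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


# Q18343316 is "standard UNIX utility or command", as cited above.
query = build_count_query("Q18343316")
url = build_request_url(query)
print(query)
print(url)
```

Pasting the printed query into the query service, or fetching the printed URL, should return a single row with the current count.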
I suggest the following constraints:
- property constraint (P2302): format constraint (Q21502404), with format as a regular expression (P1793): [^/]+\.[^/]+
- property constraint (P2302): property scope constraint (Q53869507), with property scope (P5314): as main value (Q54828448)
- property constraint (P2302): single-best-value constraint (Q52060874)
Note that the regular expression enforces the specification of the section number which is important because many man pages exist in different sections e.g. man.1 is different from man.7.
Also note that I am intentionally restricting the property scope to as main value (Q54828448) to disallow using the property for references because man pages can differ significantly between Linux distributions, making just the man page name unsuitable to be used as a reference since you wouldn't know which man page version is referenced exactly. So I think for references you should just continue to use reference URL (P854), which does not have this ambiguity problem.
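To illustrate what the format constraint accepts and rejects, here is a minimal Python sketch. The function names are hypothetical, and splitting the identifier at the last dot is an assumption of this sketch; the proposal itself only specifies the regular expression. Note that the pattern only requires a dot with something on each side and no slash, so it does not check that the part after the dot is a recognized section.

```python
import re

# Format constraint from the proposal: at least one character, a dot,
# at least one more character, and no slash anywhere.
MAN_PAGE_ID = re.compile(r"[^/]+\.[^/]+")


def is_valid_man_page_id(value: str) -> bool:
    """Check a candidate value against the proposed format constraint."""
    return MAN_PAGE_ID.fullmatch(value) is not None


def split_man_page_id(value: str) -> tuple[str, str]:
    """Split an identifier into (name, section) at the last dot.

    Splitting at the last dot is an assumption of this sketch.
    """
    if not is_valid_man_page_id(value):
        raise ValueError(f"not a valid man page identifier: {value!r}")
    name, _, section = value.rpartition(".")
    return name, section


for candidate in ["man.1", "man.7", "passwd.5", "man", "man1/ls.1"]:
    print(candidate, is_valid_man_page_id(candidate))

print(split_man_page_id("git-add.1"))
```

Here `man.1` and `man.7` both pass and remain distinct values, `man` is rejected because it lacks a section, and `man1/ls.1` is rejected because it contains a slash.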
Suggested aliases:
- described by man page
--Push-f (talk) 17:44, 29 November 2022 (UTC)[reply]
- WikiProject Informatics has more than 50 participants and couldn't be pinged. Please post on the WikiProject's talk page instead. Notified participants of WikiProject Websites. --Push-f (talk) 17:46, 29 November 2022 (UTC)[reply]
- Support external identifier. However, I have the following comments:
- It's important to create a Wikidata item for Manned.org.
- After going through their about section, I see that their approach is a bit different from other sites of man pages. Manned.org is aiming to index the man pages of different distributions and even different Linux variants. I find this approach quite useful. I think that you need to describe a bit more about Manned.org in the property proposal. John Samuel (talk) 07:53, 30 November 2022 (UTC)[reply]
- Comment Is this identifier intended to be tied to the manned.org website only, as opposed to individual man pages found elsewhere? If so, it wouldn't hurt to be more explicit about that early on in the description, maybe even in the property label, to invalidate any identifier not actually recognized by manned.org. You should also probably relax the syntax constraint a bit, as manned.org does provide a few additional sections besides the classical 1-8, such as 1posix (even mentioned among your examples), 3tcl and 9. Good point about keeping it as main value only, and you may even suggest some other qualifiers to be attached to it, such as language of work or name (P407), publication date (P577), platform (P400) or software version identifier (P348) (along with retrieved (P813) to see how old that metadata is). --SM5POR (talk) 08:02, 30 November 2022 (UTC)[reply]
- Make that additional section tag list 9, kde3 and n; I misread the constraint. Related properties: user manual URL (P2078), documentation files at (P10527). -- SM5POR (talk) 08:08, 30 November 2022 (UTC)[reply]
- Comment I second John Samuel and also I would support it if you rename it to "man page identifier for manned.org" or something similar.--So9q (talk) 10:04, 30 November 2022 (UTC)[reply]
- Thanks I relaxed the regex to [^/]+\.[^/]+. --Push-f (talk) 11:48, 30 November 2022 (UTC)[reply]
- No this is not meant to be tied to the manned.org website (note how my regex explicitly prohibits that these identifiers contain any slashes, to prevent these identifiers from becoming manned.org specific). These identifiers are not made up by manned.org, they are decided by the people who create the packages for the Linux distributions supported by manned.org. For the majority of man pages these identifiers are the same across distributions because they are in one of the standard sections ... but even in cases where they're in one of the non-standard sections that identifier is still not decided by manned.org. There very much could be multiple formatter URL (P1630) values for this property because there are different online websites for man pages. I just think manned.org should have the preferred rank since it seems to have the largest coverage of different distributions. --Push-f (talk) 11:48, 30 November 2022 (UTC)[reply]
- Question Why is the data type external identifier? Laftp0 (talk) 12:36, 30 November 2022 (UTC)[reply]
- So that the Wikidata interface and data consumers can create links for these identifiers via formatter URL (P1630). --Push-f (talk) 04:16, 1 December 2022 (UTC)[reply]
- I see the properties with string type and formatter URL, such as Unicode character (P487). Laftp0 (talk) 06:36, 1 December 2022 (UTC)[reply]
- It's true that you can have string properties with a formatter URL (P1630), but those can be annoying to deal with, e.g. the Lua function mw.wikibase.formatValue does not create hyperlinks for string properties. Based on my reading of Wikidata:Identifier migration, there seems to be rough consensus among Wikidata editors that external identifiers should have a single-value constraint (Q19474404) and the distinct-values constraint (Q21502410), yet about 11% and 7% lack single-value and distinct-values constraints, respectively. I believe that the proposed property will have very few "single value" and "unique value" violations, so the datatype external-id is justified. I tried to think of exceptions, but manned.org even distinguishes between GNU tar and bsdtar. --Dexxor (talk) 12:51, 1 December 2022 (UTC)[reply]
- Support I see. Thank you Laftp0 (talk) 13:16, 1 December 2022 (UTC)[reply]
- Oh interesting, I didn't know that the Wikibase UI supported formatter URL (P1630) for the string datatype. --Push-f (talk) 16:45, 1 December 2022 (UTC)[reply]
- I just realized that many programs have multiple man pages. Would you really say that githooks.5 and git-add.1 are identifiers for Git (Q186055)?
- They were clearly not meant to be identifiers at the very least. I'm conflicted. Dexxor (talk) 10:11, 2 December 2022 (UTC)[reply]
- External identifiers are just identifiers in an external system ... in the case of man pages that system is the filesystem (the man command searches for man pages along the manpath). These strings identify the man pages, so I think datatype External identifier makes sense. I think it's quite common for one entity to have multiple identifiers of the same kind in some external system, e.g. I recently linked four hackage package (P11246) identifiers for Cabal (Q4035708). However, yes, in the case of man pages I think it makes sense to have a single-best-value constraint (Q52060874), so that data consumers can identify which man page best describes the subject. Depending on notability and structural need we might end up modeling the git subcommands as individual data items; however, I think to properly express the function of the individual subcommands we're still lacking several properties. --Push-f (talk) 15:40, 2 December 2022 (UTC)[reply]
- Support Dexxor (talk) 12:54, 1 December 2022 (UTC)[reply]
- Support Good pick with manned.org for the formatter URL. In the tldr-pages project we also chose it as the default manpages site. --Waldyrious (talk) 20:20, 6 December 2022 (UTC)[reply]
- @Push-f, Jsamwrites, Laftp0, Dexxor, Waldyrious: Done, see man page (P11292) --DannyS712 (talk) 21:19, 11 December 2022 (UTC)[reply]
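For readers unfamiliar with formatter URL (P1630), which the discussion above relies on: a formatter URL is a template in which the placeholder $1 is replaced by the identifier value. The sketch below shows how a data consumer might expand one identifier against several templates. The function name and both example hosts are hypothetical placeholders (they are not the templates actually set on the property), and percent-encoding the identifier is a choice made for this sketch rather than documented Wikidata behaviour.

```python
from urllib.parse import quote


def expand_formatter_url(template: str, identifier: str) -> str:
    """Substitute an identifier for the $1 placeholder of a formatter URL."""
    if "$1" not in template:
        raise ValueError("formatter URL must contain the $1 placeholder")
    # Percent-encode the identifier so it cannot alter the URL structure.
    return template.replace("$1", quote(identifier, safe=""))


# Hypothetical templates standing in for different man page websites.
FORMATTER_URLS = [
    "https://man-a.example/$1",
    "https://man-b.example/view?page=$1",
]

for template in FORMATTER_URLS:
    print(expand_formatter_url(template, "git-add.1"))
```

Because the identifier carries no site-specific path, the same value can be expanded against each template, which is the reason given in the discussion for disallowing slashes.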