Wikidata:Project chat - Wikidata



On this page, old discussions are archived after 7 days. An overview of all archives can be found at this page's archive index. The current archive is located at 2024/10.

This website, launched and run by the creator of the "Sweet Baby Inc detected" Steam curator, would fall under the definition of a self-published source on Wikipedia. The Steam curator has been linked to the harassment campaign against Sweet Baby Inc. by reputable sources such as PC Gamer, The Verge, and others.

Wikidata has a page for the website, and the website has been linked via the described at URL property by User:Kirilloparma on more than one if not every occasion. Even within the scope of that source, the additions are done in a very targeted way: the website seems to be added to Wikidata items only when the game is recommended against at deidetected.com (e.g. The First Descendant, Abathor, and Valfaris: Mecha Therion, which deidetected recommends as "DEI FREE", do not have the property set). Based on that, its goal of harassment or POV pushing appears evident.

Does Wikidata have any guidelines that would explicitly allow or disallow this behavior or the coverage of deidetected.com at all? Daisy Blue (talk) 09:45, 14 September 2024 (UTC)Reply

There is no policy on WD for blacklisting websites other than in malicious cases such as spam or malware. Trade (talk) 11:59, 14 September 2024 (UTC)Reply
Now, having read the property description for described at URL on its talk page, which explains that it's for "reliable external resources", I'm convinced the website has no place on Wikidata, as it's not a reliable source (at least not per the guidelines of Wikipedia (WP:RSSELF)). What is the best place to initiate its removal without starting a potential edit war? A bot would also do a more efficient job of removing it from all the pages. Daisy Blue (talk) 12:03, 14 September 2024 (UTC)Reply
You might have more luck if you stopped bringing up Wikipedia guidelines and used the Wikidata ones instead Trade (talk) 00:09, 15 September 2024 (UTC)Reply
Wikidata itself cites the Wikipedia guidelines on self-published sources (and on original research). Daisy Blue (talk) 05:04, 15 September 2024 (UTC)Reply
English Wikipedia policy is in many cases useful for deciding what should be done on Wikidata (e.g. which sources are reliable), but it should never be considered normative and has no more authority than the policies of any other project. GZWDer (talk) 06:37, 15 September 2024 (UTC)Reply

This could be used to mass-undo 18 of the edits that introduced the links, but it's not progressing when I try it. Daisy Blue (talk) 11:14, 15 September 2024 (UTC)Reply

Seems like a low-quality, private website that doesn't add anything of value to our items. There are countless websites out there, but we generally don't add every single site via described at URL (P973) just for existing. IIRC, there were various cases in the past where users added unreliable websites to lots of items, which were then considered spam and deleted accordingly. And if the site's primary purpose is indeed purely malicious and causing harassment, there's really no point in keeping it. Best to simply put it on the spam blacklist and keep the whole culture-war nonsense out of serious projects like Wikidata. Additionally, DEIDetected (Q126365310) currently has zero sources, indicating a clear lack of notability. --2A02:810B:5C0:1F84:45A2:7410:158A:615B 13:50, 15 September 2024 (UTC)Reply

I've already nominated that item and Sweet Baby Inc detected for deletion, citing the same reason. For the curator specifically, one could stretch point 2 of Wikidata:Notability to argue against deletion, but I'm not sure what value it would bring to the project apart from enabling harassment and justifying other related additions. Daisy Blue (talk) 16:06, 15 September 2024 (UTC)Reply
Just add this website to the spam blacklist, no one will be able to add links to this website on Wikimedia projects anymore. Midleading (talk) 17:18, 16 September 2024 (UTC)Reply
What's the proper venue for proposing that? Also, seeing how you have a bot, could you suggest a quick way to mass remove the remaining instances from Wikidata? I've already undone a number by hand but it's not the greatest experience. Having the knowledge may also help in the future. Daisy Blue (talk) 18:24, 16 September 2024 (UTC)Reply
On the home page of Meta-Wiki, click Spam blacklist, and follow instructions there.
To clean up links to this website, I recommend External links search. A WDQS search is likely to time out. I also recommend reviewing each case manually, sometimes the item should be nominated for deletion, but tools can't do that. Midleading (talk) 01:27, 17 September 2024 (UTC)Reply
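As an aside on tooling: before removing anything in bulk, a small script can at least list the offending claims for manual review. A minimal sketch, assuming wbgetclaims-style JSON has already been fetched for each item (the sample payload and claim IDs below are invented for illustration):

```python
# Minimal sketch: find "described at URL" (P973) claims pointing at a given
# domain in wbgetclaims-style JSON, so each hit can be reviewed by hand.
# The sample payload below is hypothetical, not real item data.
from urllib.parse import urlparse

def claims_matching_domain(claims_json, prop, domain):
    """Return the claim IDs under `prop` whose URL value is on `domain`."""
    hits = []
    for claim in claims_json.get("claims", {}).get(prop, []):
        snak = claim.get("mainsnak", {})
        if snak.get("snaktype") != "value":
            continue
        url = snak.get("datavalue", {}).get("value", "")
        if isinstance(url, str) and urlparse(url).hostname == domain:
            hits.append(claim["id"])
    return hits

sample = {
    "claims": {
        "P973": [
            {"id": "Q1$abc", "mainsnak": {"snaktype": "value",
             "datavalue": {"value": "https://deidetected.com/some-game"}}},
            {"id": "Q1$def", "mainsnak": {"snaktype": "value",
             "datavalue": {"value": "https://www.wired.com/article"}}},
        ]
    }
}
print(claims_matching_domain(sample, "P973", "deidetected.com"))  # ['Q1$abc']
```

The actual removals would still go through the normal editing interface or API, so each one can be checked individually as suggested above.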
Thanks. I'll remove the rest by hand then. As for the Wikimedia spam blacklist, it says that "Spam that only affects a single project should go to that project's local blacklist". I'm not sure if there have been any attempts to cite deidetected on Wikipedia or elsewhere. We can search for live references (there are none), but not, I think, through potentially reverted edits. Daisy Blue (talk) 07:33, 17 September 2024 (UTC)Reply
Well, you may request this website be banned on Wikipedia first, then you may find some users who agree with you. Midleading (talk) 08:45, 18 September 2024 (UTC)Reply
I believe Wikipedia has the same policy in that if it hasn't been abused (and I wouldn't know if it has been specifically on Wikipedia), then there is no reason to block it. On Wikidata, as it stands now, the additions come from one user, Kirilloparma, who pushed back on my removals here but hasn't reverted. Unless it becomes a sustained effort by multiple users, it will come down to whether Kirilloparma concedes that described at URL is for reliable sources and the website is not a reliable source. Daisy Blue (talk) 12:14, 18 September 2024 (UTC)Reply
For some reason Kirilloparma keeps making points on the subject on the Requests for deletions page rather than here (despite having been informed), now arguing that the short property description takes precedence over the property documentation on the talk page, which is dismissed as "outdated". Daisy Blue (talk) 09:29, 20 September 2024 (UTC)Reply
  • Wikidata has items for many websites even if those websites are worthy of criticism. Knowing that "Sweet Baby Inc detected" is linked to "DeiDetected" is useful information even if both of those sources would be completely unreliable.
I don't see any use of links to deidetected.com within Wikidata where it's used for the purpose of harassment, which would justify putting it on a blacklist. ChristianKl 13:09, 26 September 2024 (UTC)Reply
The whole purpose of that website is to incite harassment, so intentionally linking to it within Wikidata directly contributes to that problem. --2A02:810B:5C0:1F84:2836:F2FD:EE77:CF71 19:38, 28 September 2024 (UTC)Reply
@ChristianKl: Quite frankly, your comment is insensitive and I agree with the IP. Note that the OP did say that the only edits adding them have been to "recommended against" games' items, so your point does not stand I'm afraid. Other than information on the sites themselves, we really should not provide "described at" claims linking them to people. Such is arguably a gross violation of Wikidata:Living people.--Jasper Deng (talk) 19:41, 28 September 2024 (UTC)Reply
What part of Wikidata:Living people do you believe is violated here and by which edits?
Instead of focusing on what the OP said, why don't you look yourself to get an impression of what we talk about?
The OP asked for the item to be deleted. Currently DEIDetected (Q126365310) does link to Sweet Baby Inc detected (Q124830722). The described at URL (P973) claims on Sweet Baby Inc detected (Q124830722) seem to me to go to relatively neutral sources like Wired, which say things like "Although early efforts began on sites like notorious harassment hub Kiwi Farms last year, much of the misinformation about Sweet Baby has coalesced around Sweet Baby Inc Detected, a Steam curation group that bills itself as “a tracker for games involved with” the company." ChristianKl 13:40, 2 October 2024 (UTC)Reply
I don't oppose the existence of these items and the existing claims you quoted. It is when these claims are added to particular games' items that it begins to create problems for the game's developers by inviting harassment targeted around their alleged ties to Sweet Baby and other organizations.--Jasper Deng (talk) 18:18, 2 October 2024 (UTC)Reply
If deidetected doxxed individual employees, that would be a privacy violation. Saying that a particular company consulted another company while developing a product is not a violation of the privacy of individual people. A boycott of commercial products like games on political grounds is not something the living people policy is intended to prevent. Using the word "harassment" for it does not make it the same as actions against individuals.
That said, we do have discussions about adding dedicated properties for external sources, which would require consensus before a website is linked from all relevant entries; using described at URL (P973) as a workaround because there's no property for an individual website is generally a bad idea. ChristianKl 23:38, 3 October 2024 (UTC)Reply
@Kirilloparma: Please do not reintroduce any of these links in the future. Doing so is a violation of Wikidata:Living people on the grounds of privacy.--Jasper Deng (talk) 19:47, 28 September 2024 (UTC)Reply

I have boldly block-listed the domain on Wikidata. In accordance with the Wikimedia Foundation DEI principles, linking a low-quality harassment site in a way that causes LP violations is not appropriate. Exceptions, such as for items on articles covering the site, can be handled using edit requests. I request that the blacklisting stand unless an explicit consensus rises against it.--Jasper Deng (talk) 20:05, 28 September 2024 (UTC)Reply

The Campaigns Product team at the Wikimedia Foundation is proposing to enable the CampaignEvents extension on Wikidata by the second week of October.

This extension is designed to make it easier for organizers to manage community events and projects on the wikis, and it makes it easier for all contributors to discover and join events and projects on the wikis. Once it's enabled on Wikidata, you will have access to features that will help with planning, organizing, and promoting events/projects on Wikidata.

These features include:

  • Event Registration: A tool that helps organizers and participants manage event registration directly on the wiki.
  • Event List: A simple event calendar that shows all events happening on the wiki, particularly those using the Event namespace. It will also be expanded soon to have an additional tab to discover WikiProjects on a wiki.
  • Invitation Lists: A feature that helps organizers identify editors who might be interested in their events, based on their editor history.

Please note that some of these features, like Event Registration and the Invitation List, require users to have the Event Organizer right. When the extension is enabled on Wikidata, the Wikidata admins will be responsible for managing the Event Organizer right on Wikidata. This includes granting or removing the right, as well as establishing related policies and criteria, similar to how it’s done on Meta.


We invite you to help develop the criteria/policy for granting and managing this right on Wikidata. As a starting point for the discussion, we suggest the following criteria:

  1. No active blocks on the wiki.
  2. A minimum of 300 edits on Wikidata.
  3. Active on Wikidata for at least 6 months.


Additional criteria could include:

  1. The user has received a Wikimedia grant for an event.
  2. The user plans to organize a Wikidata event.
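For what it's worth, the suggested baseline is mechanical enough to state as a simple check. A sketch with hypothetical field names, not anything the extension actually implements (the right itself would be granted by admins, not by code):

```python
# Sketch of the suggested baseline Event Organizer criteria. Field names
# ("blocked", "edit_count", "days_since_registration") are hypothetical.
def meets_baseline(user: dict,
                   min_edits: int = 300,
                   min_active_days: int = 180) -> bool:
    return (not user["blocked"]
            and user["edit_count"] >= min_edits
            and user["days_since_registration"] >= min_active_days)

print(meets_baseline({"blocked": False, "edit_count": 450,
                      "days_since_registration": 400}))  # True
print(meets_baseline({"blocked": False, "edit_count": 120,
                      "days_since_registration": 400}))  # False
```

Raising the edit threshold, as discussed below, would just mean changing `min_edits`.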


We would appreciate your input on two things:

  1. Please share your thoughts and any concerns you may have about the proposal to enable the CampaignEvents extension on Wikidata.
  2. Review the starting criteria listed above and suggest any changes or additions you think would be helpful.

Looking forward to your contributions - Udehb-WMF (talk) 16:00, 19 September 2024 (UTC)Reply

300 edits may be too low; Wikidata edits are generally very granular, so it's easy to make a lot of them. Maybe set the minimum at 1000? ArthurPSmith (talk) 18:04, 19 September 2024 (UTC)Reply
I think 300 or 1000 matters little. The rights also don't give much room to mess up, so it is okay to have a low bar. From the additional criteria, I think a grant is way too restrictive, but the plan to organize is a must. Why else would the rights be needed? Ainali (talk) 18:22, 19 September 2024 (UTC)Reply
I think the proposed criteria are reasonable. It is really hard to judge someone by their number of edits because of the tools we use on Wikidata. Perhaps we want a trial period for granting the right (at least for less experienced users): we could grant it temporarily for one year and renew it if it is still needed. --Ameisenigel (talk) 19:35, 19 September 2024 (UTC)Reply
Hello! As staff of an affiliate, I'd suggest adding a criterion that bypasses the number of edits for staff who belong to an affiliate. In the case of Wikidata it's always useful if they know the platform before running an event, but it could be among the responsibilities of a new member of affiliate staff to organize an event. Other than that, the criteria seem to follow what other wikis are currently discussing or implementing. Scann (WDU) (talk) 12:48, 20 September 2024 (UTC)Reply
That's an interesting point that makes me question why we need an extra limit at all. Couldn't this right just be added to what autoconfirmed users can do? If someone misbehaves, it wouldn't be too much hassle to notice it and block, and the harm they can do wouldn't be any worse than being able to create items or pages in the Wikidata namespace. Ainali (talk) 13:38, 20 September 2024 (UTC)Reply
We could start off without an edit limit for Wikidata and see whether any problems arise that way. If problems arise, we can still increase the limit later.
If I remember right, there was in the past some grant-funded event that produced a few problems with bad edits. Does anyone remember more, and whether the people in question would have fulfilled the limits proposed here? ChristianKl 20:23, 23 September 2024 (UTC)Reply
@ChristianKl: I think you mean Wikidata:Project chat/Archive/2023/12#Wikidata-related grant proposals and Wikidata:Administrators' noticeboard/Archive/2023/12#Recent crop of new Nigerian items. --Matěj Suchánek (talk) 07:44, 26 September 2024 (UTC)Reply
@Udehb-WMF: what do you think about the case Matěj linked to? Should we assume that the WMF is capable of not repeating that mistake in future grants for events? If so, we wouldn't need a minimum number of Wikidata edits as a limit. ChristianKl 10:47, 26 September 2024 (UTC)Reply
Thank you for your comment/question, @ChristianKl.
I would like to clarify that the Event Organizer right and the CampaignEvents extension are not limited to grantees or events funded through grants. These tools are designed to help any organizer, whether they are running events, WikiProjects, or other on-wiki collaborations, to manage organizing more easily on the wiki.
The community will decide who can use these tools on their wiki. That's why we are having this discussion now. The idea behind the edit count, as one of the qualifying criteria for the right, is that it can help show a level of experience and engagement on Wikidata. The 300-edit threshold I suggested was just to start the discussion; the community will ultimately decide on the final criteria.
Exceptions could also be made for affiliate staff members, similar to how it's handled on Meta, since they may need access to these tools to carry out their roles. -Udehb-WMF (talk) 11:05, 27 September 2024 (UTC)Reply
With regards to the question on grants, the team confirmed that it has responded to the questions raised in the past; see Grants talk:Programs/Wikimedia Community Fund/Rapid Fund/Wikimedia Awareness in Nafada (ID: 22280836) on Meta. The team has also been stringent in its grant request review process and remains open to further improvement. Feel free to share your input on the grants talk page or connect directly with VThamaini (WMF) at vthamaini@wikimedia.org. -Udehb-WMF (talk) 17:40, 27 September 2024 (UTC)Reply
We currently don't use an edit threshold to decide who gets rollback or property creator rights; we have qualitative criteria. Given that people with the same number of edits have quite different amounts of real experience and understanding of Wikidata, I don't think a number-based criterion would be useful. ChristianKl 09:21, 4 October 2024 (UTC)Reply
Thank you, @Ainali, for your comment/question.
The reason for the Event Organizer right, instead of giving it to all autoconfirmed users, is that this right grants extra abilities that are specifically useful for event or wikiproject organizers, but not necessary for all autoconfirmed users. These abilities include:
  • Sending mass emails to registered event participants using the event registration feature (see demo).
  • Collecting participant demographic data through the event registration feature (see demo).
  • Creating invitation lists based on user contribution history via the Invitation List feature (see demo).
As you can probably guess, the risk of abuse seems low with this right. However, it’s still important to give this right to people the community trusts - people who meet the community's defined criteria. This is why local admins are responsible for managing this right on each wiki. If the extension is enabled on Wikidata, only users with the Event Organizer right on Wikidata will have access to these extra features. -Udehb-WMF (talk) 11:02, 27 September 2024 (UTC)Reply
Good idea. If we enable this extension, could we then remove the Wikidata:Account creators group? --S8321414 (talk) 12:34, 25 September 2024 (UTC)Reply
This is completely unrelated since the event organizer role is only for the usage of the events extension. Account creator is nearly unused on Wikidata. --Ameisenigel (talk) 15:21, 25 September 2024 (UTC)Reply
As a user who has used the Campaigns extension on Meta, I'm happy to see it being enabled on Wikidata, especially with Wikidata's birthday approaching. Users will be able to use it for the upcoming birthday events, since the right allows users to create event pages and send mass messages to those who register for an event. I agree with @ChristianKl that there doesn't seem to be a need for a minimum edit count requirement. Many organizers may not have 300 edits or six months of activity on Wikidata, and affiliate members requesting this right may not meet these criteria; an endorsement from their affiliate group should be considered instead. Other users can also request the right with supporting links explaining why they need it on Wikidata rather than Meta. As on Meta-Wiki, I believe all events created by users will be listed at the Special:AllEvents page on Wikidata, so this can be easily monitored and tracked. -❙❚❚❙❙ GnOeee ❚❙❚❙❙ 11:38, 27 September 2024 (UTC)Reply

Feel free to join the discussion about making Wikidata great and sustainable 🤩 https://phabricator.wikimedia.org/T375352 So9q (talk) 04:34, 23 September 2024 (UTC)Reply

It's not a ticket about making it scalable. It's a ticket about wanting it to be scalable without understanding the reasons why Wikidata isn't. SPARQL-based databases don't scale horizontally the way a lot of other databases do. ChristianKl 08:10, 23 September 2024 (UTC)Reply
Don't you think that's unnecessarily blunt? That said, graphs are very nice, but they don't scale into eternity either, and I'm not sure how well SQL scales beyond a single server. A layman's naive impression is that we might get two decades if we federate and get a better triple store. But at some point, if we refuse to set hard guidelines for what we include (which I believe So9q has advocated for), we will reach the point where a graph simply is no longer an option, and a fundamental change is inevitable. As the CouchDB docs say, "disks are cheap", but expanding from 3 indexes to a huge number also has a cost; it will certainly scale, but it will also have lost some of its appeal. Infrastruktur (talk) 15:05, 23 September 2024 (UTC)Reply
So9q wrote a post claiming to know what the community wants without having done the work of figuring that out; in a case like that I do think a blunt statement is warranted. I don't think people should write that way when they are only speaking for their own opinion.
As one of the lead CouchDB developers once explained to me, CouchDB has a philosophy of not offering features that don't scale. If you ask them "Why doesn't CouchDB support feature X that MongoDB supports?", the standard answer is "Because there's no way to implement the feature so that it scales to really large datasets".
Disks are cheap, and some problems are solved by having more disks. Storing data on Wikimedia Commons, for example, is solved by simply having more disks, and thus we could use "tabular data" more to offload some data off Wikidata. ChristianKl 17:40, 23 September 2024 (UTC)Reply
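For reference, Commons tabular data lives on `Data:*.tab` pages as JSON with a declared schema. A minimal example of the page format (the field names and values here are invented for illustration):

```json
{
    "license": "CC0-1.0",
    "description": {"en": "Hypothetical example dataset"},
    "schema": {
        "fields": [
            {"name": "year", "type": "number", "title": {"en": "Year"}},
            {"name": "count", "type": "number", "title": {"en": "Count"}}
        ]
    },
    "data": [
        [2023, 41],
        [2024, 42]
    ]
}
```

Because these pages are plain files on Commons rather than triples, they add storage rather than query-graph load.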
Thanks for pointing that out. I will gladly copyedit the statements in question. Which are you referring to?
The issue here from my point of view is that very little discussion has happened here since 2019 about what the community wants.
Based on the very recent discussion about import policy, I conclude that the community does not want to limit growth.
It wants the WMF to fix any scaling issues so we don't have to worry about technical limits, or about choosing to import one body of information over another when both are notable. So9q (talk) 09:05, 24 September 2024 (UTC)Reply
I think statements in a Phabricator ticket about what the community wants should only be made if there's community consensus for a given position. You wrote "The Wikidata community does not want to bother or worry about technical limits". For my part, having more information about the technical limits, so that we can optimize Wikidata to work better within them, would be great.
Ideally, we would have a system that scales perfectly. Unfortunately, that's not possible. The fact that a system like Telegram can easily run on a NoSQL database, and thus scale, does not imply that the same is possible for a triple store that can be queried with SPARQL. If you want Wikidata to scale horizontally in a way that makes it impossible to run SPARQL queries that currently run fine, there are likely going to be people in our community who think that isn't worth it.
WMDE recently introduced the "mul" language code to reduce the number of unnecessary edits that get made and the amount of duplicated information stored in the database. That's a decision that allows us to have more data overall. ChristianKl 11:34, 24 September 2024 (UTC)Reply
I'm not talking about the SPARQL database per se. I know they don't scale well.
The graph split can be viewed as a kind of manual sharding of the graph database, with the downside that it affects queries and thus the user, which is undesirable but hard to avoid in the case of Blazegraph (and perhaps any other graph database in existence). So9q (talk) 08:57, 24 September 2024 (UTC)Reply
I think User:ASarabadani_(WMF)/Growth_of_databases_of_Wikidata would be a better place to discuss things. Vicarage (talk) 15:11, 23 September 2024 (UTC)Reply
I disagree; the scalability issues reported on that page are a concern for the whole Wikidata community and the wider ecosystem, IMO.
Perhaps it should be moved to Meta, since a failure of the Wikidata MariaDB cluster would affect all wikis that are linked to Wikidata, which is all of them.
The technical and community health of Wikidata concerns all wikis and thus the whole movement. So9q (talk) 08:51, 24 September 2024 (UTC)Reply
I followed up with two child tickets initiating a search for a replacement of the master-and-replicas MariaDB setup, which is outdated and does not scale horizontally for both read and write operations.
It also has issues like lack of automated failover and missing features such as sharding, self-healing nodes, etc.
See https://phabricator.wikimedia.org/T375472 So9q (talk) 08:54, 24 September 2024 (UTC)Reply
I got a response from the lead MediaWiki backend operations engineer, and a decline of the ticket and subtickets I wrote. See my response.
As I note in the response, the MariaDB backend is NOT scalable, and offloading all the scholarly articles to a separate Wikibase (which has not been funded or approved by the board yet; see the proposal) is NOT a viable long-term solution.
Basically our engineers are using a 2005 database setup (a master on a single machine with a few replicas) that is not geared to big data at all. It's NOT best practice as of 2024, and it's not going to get any better by sticking our heads in the sand and hoping for good luck (as the lead engineer seems to want, along with a few optimizations to the table layout).
Soon enough we will reach 100M items again, once @Egon Willighagen imports millions more chemicals or someone imports all the named streets of the USA and Russia, all the bridges in Sweden, etc.
We need the WMF board and tech team to consider ways forward NOW; according to @ASarabadani (WMF), time is running out for wikidatawiki.
I'm considering writing a letter to the new board alerting them to this precarious situation; you are very welcome to join me. Write me an email through my user page or reach out to me on Telegram. So9q (talk) 10:52, 26 September 2024 (UTC)Reply
The database architect of the WMF seems surprisingly pessimistic when it comes to scaling an SQL database horizontally. I just replied in Phabricator to one of his comments with a possible open-source drop-in replacement for MariaDB.
I urge the readers and users of Wikidata to ask themselves: if a community member can find a solution to the problem stated by @ASarabadani (WMF) in his spare time, in a few minutes of browsing Wikipedia for open-source distributed SQL database engines, why has the highly paid WMF engineering team not done anything about this since the scalability issues became common knowledge? Why are they so negative toward community members pointing to possible solutions? Why are they so unwilling to reflect on their own architecture decisions?
What could be causing this? What has hindered a solution from being found since 2012? (They could have continuously projected the growth of Wikidata, tested their current setup with dummy data, and forecast long ago that we would outgrow a single-master MariaDB database.) Why did they fail to do that?
Imagine having a technical management and team of lead engineers who would rather try to impose growth limits on our thriving community of 23k contributors (and millions of consumers of the data world-wide every month) than do their job and make sure the backend scales according to community needs and the vision of the foundation. Is that what is going on?
I wonder if this situation is known to the board and what consequences it is going to get. WDYT? So9q (talk) 12:04, 26 September 2024 (UTC)Reply
The fact that ASarabadani wrote the post suggests to me that he is considering ways forward. Writing a letter to the WMF board suggesting that he isn't considering the problems, because he closed your tickets, seems like unnecessary drama.
Basically, you claim to have a better idea than ASarabadani of the kind of work that would be needed to move the present code base to software like MySQL Cluster. I find that highly unlikely. If you write a letter to the board, I would not expect you to convince them that you understand the MediaWiki code base, and what would be required to make it horizontally scalable, better than ASarabadani does, just because you read a few Wikipedia articles about distributed SQL database engines.
Writing new software in COBOL is not "best practice". That doesn't mean banks aren't still running a lot of COBOL code. Changing legacy systems is not easy.
The scalability bottleneck that Wikidata had to deal with in 2019 was about the number of edits Wikidata is able to do per minute. It was not about the size of the SQL database. Focusing engineering resources on the SQL database would not have helped resolve the bottleneck we had at that time.
When optimizing a system, it's important to understand the bottlenecks that exist and focus on solving them. You make suggestions without having tried to understand the existing bottlenecks. ChristianKl 12:37, 26 September 2024 (UTC)Reply

The scalability bottleneck that Wikidata had to deal with in 2019 was about the number of edits Wikidata is able to do per minute. It was not about the size of the SQL database. Focusing engineering resources on the SQL database would not have helped resolve the bottleneck we had at that time.

Are you sure? The master-on-a-single-server-plus-replicas setup helps scale read operations but not write operations. Moving to a distributed SQL database scales both write and read operations. So9q (talk) 13:08, 26 September 2024 (UTC)Reply
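To illustrate the asymmetry being discussed: in a master-plus-replicas setup, adding replicas multiplies read capacity while write capacity stays pinned to the single master. A back-of-the-envelope model with invented numbers, not a benchmark:

```python
# Toy model of a single master + N replicas: reads can be spread across all
# servers, but every write still has to be applied by the one master (and
# then replayed on each replica). All numbers are invented for illustration.
def capacity(replicas, reads_per_server=1000, writes_per_master=200):
    reads = (1 + replicas) * reads_per_server  # every server can answer reads
    writes = writes_per_master                 # writes funnel through the master
    return reads, writes

for n in (1, 4, 16):
    print(n, capacity(n))
# read capacity grows with each replica; write capacity stays at 200
```

A distributed SQL engine aims to remove the fixed write ceiling, which is the crux of the disagreement below about whether that switch is actually a drop-in change.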
Changing the SQL database can only help scale write and read operations when the bottleneck is the SQL database in the first place. When the bottleneck is the performance of the triple store, it doesn't help you at all. ChristianKl 13:18, 26 September 2024 (UTC)Reply
I agree, but my proposal did not intend to solve that, since I gave up on the whole Blazegraph issue a while ago.
What I targeted with my suggestion was the single-machine SQL setup that @ASarabadani (WMF) highlighted recently. So9q (talk) 14:23, 3 October 2024 (UTC)Reply

Writing new software in COBOL is not "best practice". That doesn't mean banks aren't still running a lot of COBOL code. Changing legacy systems is not easy.

I agree, but this situation is very different. I'm NOT talking about rewriting any code. The MediaWiki software is separated from the database, and how the database distributes queries, handles sharding, etc. does not affect the code in any way, AFAIK. That is why it is a drop-in solution that could be tested in a weekend by anyone who wants to. The only things you need are two networked machines, a good internet connection, and a bit of Linux command-line know-how to load the data from the dumps and set up a Wikidata clone on a distributed database. So9q (talk) 13:12, 26 September 2024 (UTC)Reply
AFAIK doesn't bring you very far when you don't know what you are talking about. If you ask ChatGPT, which also doesn't understand all the roadblocks, it's able to give you a bunch of reasons why changing to MySQL Cluster would require a lot of work, such as limits on transaction size. ASarabadani is going to know a lot of other reasons why it's hard to simply switch databases. ChristianKl 13:25, 26 September 2024 (UTC)Reply

When optimizing a system it's important to understand the bottlenecks that exist and focus on solving them. You make suggestions without having tried to understand the existing bottlenecks.

Are you sure? If I understood @ASarabadani (WMF)'s information correctly, the core problem is that the sheer size of the wikidatawiki tables makes it hard for the master and replicas to keep all the information needed to serve MediaWiki in RAM in a timely manner. Buying larger servers is not a solution because of the project's growth rate. Distributing the load over multiple servers is the go-to industry solution for big data projects, which Wikidata seems to have become. So9q (talk) 13:17, 26 September 2024 (UTC)Reply
While ASarabadani used to work on Wikidata (and WMDE) he's now at the WMF and chief database architect for MediaWiki.
As such, the bottlenecks that Wikidata faces outside of MediaWiki currently aren't his job. That does not mean that Wikidata does not have other bottlenecks that come from the triple store. If you look at the evaluation documents for choosing a new triple store for Wikidata, you find that the number of triples those triple stores can store is unfortunately limited.
While there are technical solutions that require a lot of work that might allow MediaWiki to be horizontally scalable, implementing them would not result in the Wikidata Community not having to worry about our triple count. You don't get 100x growth out of the available triple store technology. ChristianKl13:47, 26 September 2024 (UTC)Reply
Wikidata will never be horizontally scalable. Asking who the POTUSes are and asking who the male humans are have no semantic difference. If there were as many POTUSes as there are male humans, Wikidata would not be able to give an answer to either question. Midleading (talk) 09:41, 25 September 2024 (UTC)Reply
Let's make one thing clear. Wikidata is a MediaWiki-run wiki. MediaWiki supports nothing but a relational (SQL) database. Such databases are known not to be horizontally scalable. Therefore, Wikidata simply cannot be completely horizontally scalable. I can't imagine the amount of work needed to implement support for a (hybrid) NoSQL storage.
Note that this has actually nothing to do with the Wikidata Query Service split. These are, unfortunately, two different problems, which do have a common cause: Wikidata is becoming unsustainably large. This is the only thing we can do something about right now. --Matěj Suchánek (talk) 15:38, 26 September 2024 (UTC)Reply
There are many things that could be done. Currently, the knowledge about how various knowledge modeling decisions affect performance isn't readily available. Gathering that knowledge, writing it up and then bringing it up in relevant decisions would be helpful.
Initiatives like "mul" can free up capacity that we can use better otherwise. ChristianKl22:38, 26 September 2024 (UTC)Reply
We know long property chains are expensive, but they are also handled efficiently, so it all comes back to the size of the graph; ergo, federation solves the problem. From my experience the community is unwilling to change their data model: even if you present them with good reasons for why it makes sense, they will refuse. You could for example insist that P131 only go as low as municipality. But when you include neighborhoods, the computational cost becomes unreasonable. Infrastruktur (talk) 16:36, 27 September 2024 (UTC)Reply
"But when you include neighborhoods the computational cost becomes unreasonable" how do you know?
Without good documentation about how costly various decisions happen to be it's hard to know whether an individual decision is worth the computational cost or whether that cost is unreasonable. ChristianKl10:27, 1 October 2024 (UTC)Reply
"How do you know?" If we just want to illustrate it without diving into the matter, that's quick enough. Germany has 13,425 municipalities (according to Wikidata, anyway). If we ask for a count of P131* of Q183, that gives us over a million items and takes 30-50 seconds to run (three sample runs; I asked for a count to exclude data transfer overhead). That leaves only 10-30 seconds for the rest of the query to do all the things it needs to do. If we ask for subclasses of watercraft, that yields over 10,000 items and doesn't even take a second to complete. I didn't bother to look into the distribution, but that might also be interesting to look into if someone has the time. Infrastruktur (talk) 14:37, 1 October 2024 (UTC)Reply
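As an editorial aside, the kind of WDQS query being timed above can be sketched as follows. This is an illustration based on the description in the post, not a query taken from it; only the query text is built here, and actually running it would require sending it to the public SPARQL endpoint.

```python
def p131_count_query(root_qid: str = "Q183") -> str:
    """Build a SPARQL query counting items reachable from root_qid via P131*.

    Q183 is Germany; wdt:P131* follows 'located in the administrative
    territorial entity' transitively, which is the expensive path walk
    discussed above.
    """
    return (
        "SELECT (COUNT(?item) AS ?count) WHERE { "
        f"?item wdt:P131* wd:{root_qid} . "
        "}"
    )

query = p131_count_query()
# To run it, paste the query at https://query.wikidata.org/ (network required).
print(query)
```

The transitive `*` on `wdt:P131` is exactly what makes the Germany count slow relative to the shallow watercraft subclass query mentioned above.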
If our goal is to have as many items as possible on Wikidata, the computational cost we care about is how much space items take up in the database, not how fast queries run.
If the goal is to be able to run more queries, buying servers that mirror the Wikidata Query Service is easily possible, while you can't get the capacity to store more items simply by buying more servers. ChristianKl 15:11, 2 October 2024 (UTC)Reply
I wonder how many users would be happy to have queries that ran in 2 or even 10 minutes, if they could be confident they wouldn't time out. This could be done with just more servers, but would be more useful if the server had an internal measure of task completion so it could abort early if the task was getting out of control, and that might require software changes. Vicarage (talk) 15:43, 2 October 2024 (UTC)Reply
"mul" can help, but it will be only a minor bit, in comparison with e.g. "Wikimedia category/template" descriptions. Just for consideration. --Matěj Suchánek (talk) 12:57, 29 September 2024 (UTC)Reply
@Matěj Suchánek Over the long term, I don't think the Query service needs direct access to descriptions and the job serving descriptions could be separated to a separate server. If WikiFunctions works better, it's possible that all these kinds of descriptions could be created over at a WikiFunctions driven server and cached there. ChristianKl14:22, 2 October 2024 (UTC)Reply

Refs:

  1. share in the sum of all knowledge

I’d like to add an FBI file number to a Wikidata profile (e.g. 100-HQ-34789, 92-NY-1456, etc.). However, many FBI files were destroyed or are still classified, so I can’t link the file number to an external copy of the file in every case. I can provide a reference for each file number, though.

  1. Is there an existing property, such as “Described by Source” or “Inventory Number”, that could be used for these numbers? If so, would it be best to create a new Item for each FBI file?
  2. If not, would this be appropriate for a new property (something like “Federal Bureau of Investigation File Number”), even if the file numbers won’t link to an external database or site?

Thanks! Nvss132 (talk) 10:40, 26 September 2024 (UTC)Reply

I think (2) is preferred, but you should probably start a property proposal to have a more in-depth discussion about this. I'm not entirely clear what these file numbers identify - they are for individual people? Can one person have more than one file number? Anyway a property proposal discussion would be a good place to clarify current options or whether we really should create a new property for this. ArthurPSmith (talk) 13:17, 27 September 2024 (UTC)Reply
Thanks for responding. After researching this weekend, I don’t think creating a new property will work anymore. Not every FBI file maps to a specific Wikidata item. (For example, FBI file 100-HQ-4869 is on the funding of the Communist Party, while file 100-HQ-365088 covers the sale of foreign publications in America.) Since these subjects won’t correspond to one Wikidata item, I think the best solution is to create an item for each file, treating them like individual works. In addition, this also lets people use described by source (P1343) to link individual people described in the file who are not its main subject, such as a spouse being described in someone’s FBI file, or when a file covers multiple members of an organization. Nvss132 (talk) 00:04, 30 September 2024 (UTC)Reply

It seems we have the WST.tv property: World Snooker Tour player ID (P4498) and the SnookerScores.net property: WPBSA SnookerScores player ID (P10857), but we do not have an ID property for wpbsa.com. It appears that wpbsa.com actually contains a significant amount of data, for example: Mark Allen on WPBSA, which is more than on: the same player on WST. Nux (talk) 19:07, 26 September 2024 (UTC)Reply

@Nux You can always propose a new property: Wikidata:Property proposal RVA2869 (talk) 12:55, 27 September 2024 (UTC)Reply
Thanks for the tip :).
Vote or discuss here: Wikidata:Property proposal/WPBSA com player ID :) --Nux (talk) 21:21, 27 September 2024 (UTC)Reply

Landau an der Isar (Q509536) and Landau an der Isar (Q32084506) seem to be the same but ceb.wiki has two articles. Magnus Manske (talk) 09:24, 27 September 2024 (UTC)Reply

I have merged the Cebuano pages into one because they are both about the same subject. But the WD items are about different concepts -- the commune and the centre of the commune. Landau an der Isar is divided into seven settlements (quarters?), and the main one shares its name with the commune. See w:de:Landau an der Isar#Gemeindegliederung. --Wolverène (talk) 10:04, 27 September 2024 (UTC)Reply
@Magnus Manske There are a lot of bot-created pages in ceb.wiki because of GeoNames; see https://www.wikidata.org/wiki/Wikidata:WikiProject_Territorial_Entities/Geonames_and_CebWiki for more background. ChristianKl 13:13, 27 September 2024 (UTC)Reply

Hi all,

I'm a plant enthusiast interested in enhancing Wikidata's plant entries. I'm contemplating adding statements to plant species that reflect required care and features of plants.

For example, to Goeppertia insignis (Q90458733) (Calathea orbifolia), I would add something like the following:

Property: Value
Cycle: Herbaceous Perennial
Watering: Average
Propagation: Division,Stem Propagation,Leaf Cutting,Air Layering Propagation
Flowers: Yellow Flowers
Sun: part shade,part sun/part shade
Leaf: Yes
Leaf Colour: green,purple
Growth Rate: Low
Maintenance: Moderate
Tropical: Yes
Indoors: Yes
Care Level: Medium

I believe these additions would be valuable for several reasons:
1. They would provide more detailed information for plant care.
2. They could facilitate SPARQL queries for plant selection based on specific criteria.
3. They might aid in botanical research and education.

Before proceeding, I have a few questions:

1. Are there existing properties in Wikidata that cover some of these aspects? If so how can I find them?
2. If not, what is the process for proposing new properties?
3. Do you think these additions would be acceptable and valuable for Wikidata?
4. Are there any concerns or potential issues with adding this type of information?

I would greatly appreciate your feedback on the specific properties I've listed and any suggestions for improvement or additional properties to consider.

Thank you for your time and input! Inkpotmonkey (talk) 11:52, 28 September 2024 (UTC)Reply

Most of these are subjective, and therefore are not suitable for use in a database, unless they are rigorously defined and widely agreed-on by scientists.--Jasper Deng (talk) 22:16, 28 September 2024 (UTC)Reply
@Inkpotmonkey: Most of the proposed data sounds very subjective, which makes it hard to make compatible with Wikidata. However, if you want to help, you may add properties like flower color (P2827), foliage type (P10906) and leaf morphology (P12616) together with reliable references. Samoasambia 08:39, 1 October 2024 (UTC)Reply

Hi, in Europe, which has 45+ languages to handle, we have a transnational language-level framework called en:Common European Framework of References for Languages (Q221385), together with language levels (Q104381881), structured as:

I assigned :

  • A1 & A2 as sub-class of A,
  • B1 & B2 as sub-class of B,
  • C1 & C2 as sub-class of C.

But are A, B, C *instances of* (P31) or *sub-classes of* (P279) Common Reference Levels for languages (Q104381881)?

See also WDQS https://w.wiki/BMKo . Yug (talk) 20:47, 28 September 2024 (UTC)Reply

@VIGNERON:. Yug (talk) 10:43, 29 September 2024 (UTC)Reply
I would suggest that all of the items listed above should be part of (P361)CEFR common reference level (Q104381881) instead of instance or subclass, but I won't claim to be an expert. Huntster (t @ c) 13:45, 29 September 2024 (UTC)Reply
I think Q104381881 should be edited to "CEFR language level" (or something similar) so that having it as an "instance of" value would make sense. In addition all of the levels could have part of (P361)Common European Framework of Reference for Languages (Q221385). Samoasambia 16:40, 29 September 2024 (UTC)Reply
Good catch, agreed on all points. Huntster (t @ c) 17:03, 29 September 2024 (UTC)Reply
@Yug, Huntster: I made the changes I proposed. I assigned the (now renamed) CEFR common reference level (Q104381881) as both an instance of and subclass of value for the "group levels" (A, B, C), which looks a bit awkward. That's because otherwise the constraint checks on the "lower levels" (A1, A2, B1 etc.) would be triggered by their being a subclass of an item that is not a subclass of anything. Samoasambia 08:59, 1 October 2024 (UTC)Reply
@Samoasambia: I've removed rank (Q4120621) from CEFR common reference level (Q104381881) (since it's not really a rank in and of itself), and added it to each of the levels in place of CEFR common reference level (Q104381881) to avoid the issue you pointed out. Let me know if you disagree. Huntster (t @ c) 13:45, 1 October 2024 (UTC)Reply
Thanks Huntster, that seems to work well. Samoasambia 19:33, 1 October 2024 (UTC)Reply

Hello! I scanned a book about the Nigerian legislature called Nigeria Legislature 1861-2011 with lists of the members of the Nigerian Senate and House of Representatives. Many of them are not on Wikidata (or anywhere I can find online :/) so I wanted to add them. They come in the form of infoboxes that look like this [1]. I'm slowly compiling these infoboxes (there are a LOT) into a spreadsheet to add to Wikidata through QuickStatements. Unfortunately, I'm not extremely familiar with Wikidata so I wanted some help, feedback, and other comments about how I should go about this.

Right now, my CSV has columns for Name, Constituency, State, Date of Birth, and Education. I wish I could add an image for them but I'm not sure about the copyright of a book published by the Nigerian government. Fields for Date of Birth and Education can be pretty spotty, with Education in particular varying in specificity from specific subject details of a Ph.D to simply listing a diploma in a subject, if any is listed at all. Politicians from Oct-Dec 1983 in particular have sparse details likely due to the military coup in 1983.

Some questions I have about this,

1) Some names only have initials without full names. Is this okay?

2) Some list in their Education field a Grade III/II Teacher's Cert. I can't find anything related to this education on Wikidata (seems to be an old teacher credential used in the 1960s or so). What should I do here?

3) Right now, fields in the spreadsheet are the plain text from the infoboxes. I plan on using Pandas to transform it into properties and qualifiers that QuickStatements would like. How would I go about adding "inner qualities" of a property? Not sure what the correct jargon for it is but an example is in Leslie Lamport under Doctor of Philosophy, it lists his academic major as mathematics.

Thanks for reading, and let me know any questions, comments, or concerns! Moon motif (talk) 02:23, 29 September 2024 (UTC)Reply

@Moon motif: Good questions. Have you looked into OpenRefine as a tool to convert your CSV into wikidata statements directly (no need to go through QuickStatements)? Initials instead of full names are fine; the description should disambiguate who they are. For education, we typically use educated at (P69) for the educational institution, with qualifiers (I assume that's what you mean by "inner qualities") for dates and degree attained. It's possible that Teachers' Training Certificate (Q98793260) or some other type of academic degree (Q189533) meets your needs for the degree; if not it's fine to add a new item as long as you're sure it's not a duplicate of something already here. ArthurPSmith (talk) 19:48, 30 September 2024 (UTC)Reply
Oh cool! Didn't hear about OpenRefine and it definitely looks like exactly what I need. And thanks for answering my questions! Moon motif (talk) 15:15, 1 October 2024 (UTC)Reply
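As an editorial aside on the qualifier question above: if the QuickStatements route is used instead of OpenRefine, qualifiers are appended after the statement on the same tab-separated line. The sketch below is an illustration, not data from the book; the two Q00000/Q00001 item IDs are placeholders, and the qualifier properties assumed here are academic degree (P512) and academic major (P812), matching the Leslie Lamport example (Doctor of Philosophy Q752297, mathematics Q395).

```python
def qs_educated_at(item, school, degree=None, major=None):
    """Return one QuickStatements v1 line for educated at (P69).

    Qualifiers (degree, major) are appended as property/value pairs on
    the same tab-separated line, which is how QS v1 attaches qualifiers
    to the preceding statement.
    """
    parts = [item, "P69", school]
    if degree:
        parts += ["P512", degree]   # qualifier: academic degree
    if major:
        parts += ["P812", major]    # qualifier: academic major
    return "\t".join(parts)

# Placeholder item and school IDs; degree/major per the Lamport example.
line = qs_educated_at("Q00000", "Q00001", degree="Q752297", major="Q395")
print(line)  # tab-separated: Q00000, P69, Q00001, P512, Q752297, P812, Q395
```

A Pandas-based script could emit one such line per row of the CSV and feed the result straight into the QuickStatements batch box.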

Take a look at Q23649754. There are currently four statements for identifiers that are meant exclusively for video games, not software (Can You Run it ID, HowLongToBeat ID, Lutris game ID and Rock Paper Shotgun game ID). However, because these sites scrape everything from Steam, the identifiers were created anyway. Trade (talk) 02:57, 29 September 2024 (UTC)Reply

I think these should not be deprecated, unless the website deprecates, redirects or deletes these identifiers themselves. Midleading (talk) 10:44, 29 September 2024 (UTC)Reply
It does create an annoying amount of constraint errors Trade (talk) 18:08, 29 September 2024 (UTC)Reply

The entry here for Sydney Walker Barnaby is wrong. His surname is Barnaby, but on Commons it shows up as a given name. Meanwhile, I added it to Commons as a surname, yet the name derived from Wikidata still shows as a given name. Why? Broichmore (talk) 17:10, 29 September 2024 (UTC)Reply

  Fixed RVA2869 (talk) 17:47, 29 September 2024 (UTC)Reply

I think that this discussion is resolved and can be archived. If you disagree, don't hesitate to replace this template with your comment. RVA2869 (talk) 17:05, 4 October 2024 (UTC)Reply

We have official residence (Q481289) and official residence (Q11452137). Both seem to me to be too specific to cover the official residence of a university president. Do we have something more general, short of simply residence (Q699405)?

This came up for New York Building (Q130320815). - Jmabel (talk) 14:59, 30 September 2024 (UTC)Reply

@Jmabel: My first instinct was to say be bold and create one if you don't find an existing one. However, I question whether a class is really the best way to model this. I'm not sure that official residences of universities are a class with common features enough that instance of (P31) is the right relationship. Being an official residence seems less like an inherent characteristic of a building and more like a status temporarily conferred. I know well that many existing Wikidata classes similarly fit this description, but it doesn't seem ideal. I'd model this case as:
Daask (talk) President of the University of Washington (Q6603245) so it would be clear he used it as a home rather than an office? - Jmabel (talk) 05:12, 3 October 2024 (UTC)Reply
 

Here's your quick overview of what has been happening around Wikidata in the
week leading up to 2024-09-30. Please help translate. Missed the previous one?
See issue #646

Discussions

Events

  • Wikidata's 12th birthday is coming up on October 29th. Have a look at the birthday parties and more planned around the world.
  • Next Linked Data for Libraries LD4 Wikidata Affinity Group session 1 October, 2024: We have our next LD4 Wikidata Affinity Group Session on Tuesday, 1 October, 2024 at 9am PT / 12pm ET / 17:00 UTC / 6pm CET (Time zone converter). Christa Strickler will be our first Project Series lead with her joint project with the Wikidata Religion & Theology Community of Practice to contribute biographical data to Wikidata from the IRFA database using the Mix’n’Match tool. We are excited to learn more about this project, provide a forum for discussion and shared learning, and lend a hand while building new skills. Event page.

Press, articles, blog posts, videos

Tool of the week

Other Noteworthy Stuff

Newest properties and property proposals to review

You can comment on all open property proposals!

Did you know?

Development

  • Search: The haswbstatement search magic word has been improved by the Search Platform Team. Previously it was limited in which Properties were indexed for it. Going forward haswbstatement:P123 will work for all Properties, regardless of their datatype. This will allow you to filter search results for Items that have a statement with a specific Property. (Searching for a specific complete statement with haswbstatement:P123=xxx will still only work for specific datatypes.) For this to work all Items have to be reindexed and this will take up to 1 month.
  • Design system migration: We have migrated the Special:NewLexeme page from Wikit to Codex and are working on finishing the migration for the Query Builder.
  • EntitySchemas: We finished the investigation about how to support search for EntitySchemas by label or alias when linking to an EntitySchema in a statement. (phab:T362005)
  • Wikibase REST API: We worked on integrating language fallbacks into the API (phab:T371605)

You can see all open tickets related to Wikidata here. If you want to help, you can also have a look at the tasks needing a volunteer.

Weekly Tasks

Currently, an alias is added with no ability to add a "reference" to support that claimed alias. How and where do I make this proposal? Thank you, -- Ooligan (talk) 17:53, 30 September 2024 (UTC)Reply

Labels and aliases are different from other properties in that they are mainly for the use of human editors, and are somewhat outside the graph database logic. Generally if you need to be more specific about the name of an item, the period it applies for, or its variants, and provide references, you should be using one of the "name" properties like name (P2561) or official name (P1448), and adding references to those, and repeating the names as aliases so human searches can see them. Vicarage (talk) 17:59, 30 September 2024 (UTC)Reply

I have a feeling that Comptes Rendus de la Association Française pour l'Avancement des Sciences. (Q51458548) should be merged into Compte Rendu de l'Association Francaise Pour l'Avancement des Sciences (Q5780218). However, there are quite a few very similarly named scientific journals from this time period, so I'm not entirely sure; hence I haven't gone ahead and actually done anything. Please advise, if you have access to more detailed information than I have. Thanks! Tommy Kronkvist (talk), 22:50, 30 September 2024 (UTC).Reply

The way forward would be to look at all the external ID properties and the information they store to see whether it matches. ChristianKl 11:20, 1 October 2024 (UTC)Reply

Can we partner in building a modern health center in Liberia, my country? I'm Michael M. Edwards from Liberia. 41.57.95.221 08:25, 1 October 2024 (UTC)Reply

Who do you mean with "we"? Wikidata is not an institution that builds hospitals. ChristianKl09:58, 1 October 2024 (UTC)Reply

Hello, while I develop Wikivoyage modules, I found that mw.Wikibase does not have a method to search items by properties. How can this be implemented, or is it simply impossible? Thanks, Tmv (talk) 08:55, 1 October 2024 (UTC)Reply

@Tmv: I'm not sure about wikibase generally, but in Wikidata there's a haswbstatement filter for the search box that allows property-based searches. Put "haswbstatement:P18" in the search box and you'll get all the items with images, or put "haswbstatement:P31=Q5" in to find humans (i.e. a specific property value). This can be very useful combined with other search terms. ArthurPSmith (talk) 13:48, 1 October 2024 (UTC)Reply
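To make the answer above concrete: the same `haswbstatement` filter also works through the MediaWiki search API, not just the search box. The sketch below is an illustrative standard-library-only example; it builds the API URL but does not fetch it, and the endpoint/parameter names are those of the standard MediaWiki `list=search` module.

```python
from urllib.parse import urlencode

def wd_search_url(statement: str, limit: int = 10) -> str:
    """Build a Wikidata fulltext-search API URL for a haswbstatement filter.

    statement can be a bare property ("P18") or a property=value pair
    ("P31=Q5"), matching the search-box syntax described above.
    """
    params = {
        "action": "query",
        "list": "search",
        "srsearch": f"haswbstatement:{statement}",
        "srlimit": limit,
        "format": "json",
    }
    return "https://www.wikidata.org/w/api.php?" + urlencode(params)

url = wd_search_url("P31=Q5")   # items that are instances of human (Q5)
print(url)
```

Fetching the URL (e.g. with `urllib.request`) would return the matching item pages as JSON; from a Lua module, the equivalent would go through the site's search rather than `mw.wikibase` directly.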

Select an NLP task for which an annotated dataset is available and a knowledge graph can be useful (e.g., Question Answering) – embed the selected knowledge graph – analyse the advantages of using the graph directly or its embeddings when performing the task. How can I do a project related to this? 194.210.175.150 13:58, 1 October 2024 (UTC)Reply

As of October 1 2024 the Author Disambiguator tool has switched from using the original Wikidata Query Service to using the new split graph services. The tool defaults to using the "scholarly" graph to find authored items; however this can be changed on a session-by-session basis using a new "Preferences" page. Check the box to switch to using the "main" subgraph instead of the scholarly one for authored works. Please let me know if you run into any problems; suggestions can also be submitted as a GitHub issue. ArthurPSmith (talk) 14:52, 1 October 2024 (UTC)Reply

Bartolomeus Anglicus's late medieval encyclopedia De proprietatibus rerum, mentions (book 5 chapter 2, in Stephen Bateman's 1582 translation):

...a beaſt that is called Lamia, that hath as the Gloſe ſaith Super Tre. an head as a maide, and bodie like a grimme beaſt.

Which Lamia is the proper target of a link? Wikidata has Lamia (Q200073) and lamia in a work of fiction (Q59312503), but it's neither of those because Bartolomeus clearly believed they were real. Marnanel (talk) 15:52, 1 October 2024 (UTC)Reply

So, recently I created a new page on the English Wikipedia entitled "LGBTQ themes in Western animation". Site links have been added to redirect those entries to the revised page. That's fine. However, the Wikidata entries for the now-merged pages, "LGBTQ themes in Western animation (Q96381090)", "LGBTQ themes in Western animation (Q104862909)", "LGBTQ themes in Western animation (Q104862902)", "LGBTQ themes in Western animation (Q104862898)" and "LGBTQ themes in Western animation (Q96381091)", still remain. I would like to merge them into "LGBTQ themes in Western animation (Q130371258)". How do I do that? Historyday01 (talk) 18:51, 1 October 2024 (UTC)Reply

@Historyday01: Hi, I did it for you. But for the future you can find instructions at Help:Merge. Samoasambia 19:30, 1 October 2024 (UTC)Reply
Thanks. I'll definitely keep that in mind going forward. Historyday01 (talk) 19:33, 1 October 2024 (UTC)Reply

I think that this discussion is resolved and can be archived. If you disagree, don't hesitate to replace this template with your comment. RVA2869 (talk) 17:06, 4 October 2024 (UTC)Reply

I created this item and list by mistake. There was no Salon in 1871 because of the Franco-Prussian War. Carl Ha (talk) 19:29, 1 October 2024 (UTC)Reply

@Carl Ha: done, but please use Template:Delete or WD:RfD in future. --Wüstenspringmaus talk 10:25, 2 October 2024 (UTC)Reply

I think that this discussion is resolved and can be archived. If you disagree, don't hesitate to replace this template with your comment. RVA2869 (talk) 17:06, 4 October 2024 (UTC)Reply

Apparently the latter includes the former, but I can't really figure out the difference, Strainu (talk) 21:47, 1 October 2024 (UTC)Reply

The former seems to be for actual events within the discipline like snowboarding at the 2010 Winter Olympics – women's halfpipe (Q263926) whereas the latter is for general disciplines like snowboarding at the 2010 Winter Olympics (Q381127) — Martin (MSGJ · talk) 11:21, 2 October 2024 (UTC)Reply

Hello, what next steps would you recommend when my user report has sat unattended by the administrators for almost a week, while the Wikidata items affected still have incorrect data? Flipping Switches (talk) 09:49, 2 October 2024 (UTC)Reply

Plus, the user's tone has taken an almost offensive turn. Flipping Switches (talk) 10:43, 2 October 2024 (UTC)Reply
Does this relate to Wikidata:Administrators'_noticeboard#Report_concerning_User:Шкурба_Андрій_Вікторович? Probably better to keep the discussion in one place. Keep posting until you get a response — Martin (MSGJ · talk) 11:22, 2 October 2024 (UTC)Reply
MSGJ, That's the one. The conversation started so I'll continue there, indeed. Thanks for the reply. Flipping Switches (talk) 18:52, 2 October 2024 (UTC)Reply

Hey there, it seems someone behind 114.5.110.202 vandalised some items: [2]. Can somebody with the right tools revert the edits, please?

--Frlgin (talk) 13:11, 2 October 2024 (UTC)Reply

  Done: Reverted & blocked. @Frlgin: Please report vandals on WD:AN next time. Thanks! --Wüstenspringmaus talk 13:17, 2 October 2024 (UTC)Reply

I think that this discussion is resolved and can be archived. If you disagree, don't hesitate to replace this template with your comment. RVA2869 (talk) 17:06, 4 October 2024 (UTC)Reply

The new pattern is "http://www.filae.com/nom-de-famille/$1.html". And because the .html part is new, the automatic redirection to Filae returns a 404 error message. Rosenzweig (talk) 17:52, 2 October 2024 (UTC)Reply

Can you give me an example of a correct link? — Martin (MSGJ · talk) 18:56, 2 October 2024 (UTC)Reply
[3], but it seems you already figured it out yourself [4]. --Rosenzweig (talk) 11:10, 3 October 2024 (UTC)Reply
That also gives a 404 message, which is why I asked — Martin (MSGJ · talk) 07:57, 4 October 2024 (UTC)Reply

I see this on Hurricane Helene (Q130358528) (Hurricane Helene) for the Spanish Wikipedia. Batrachoseps (talk) 16:48, 3 October 2024 (UTC)Reply

Nothing, validated (Q20748093)'s point is "badge being used as a Wikisource work status indicator". ChristianKl18:39, 3 October 2024 (UTC)Reply
For what it's worth, the badge was present in the original version of the Spanish Wikidata page that was merged into the current page. Jonesey95 (talk) 12:53, 4 October 2024 (UTC)Reply

I am trying to set up a Mix'n'match catalogue for Bundesstiftung Aufarbeitung person ID (P9671) using https://mix-n-match.toolforge.org/#/scraper/new, but I am failing. The URL where every entry is listed is https://www.bundesstiftung-aufarbeitung.de/de/recherche/kataloge-datenbanken/biographische-datenbanken (you would have to press "Mehr laden"="Load more" a lot of time to get all of them displayed). An example URL for an entry is https://www.bundesstiftung-aufarbeitung.de/de/recherche/kataloge-datenbanken/biographische-datenbanken/franziska-van-almsick. Could someone help me please? Dorades (talk) 21:15, 3 October 2024 (UTC)Reply

Please see Wikidata:Bot requests#Request to adjust badge color for "recommended article" (2024-10-02). I don't mean to double-post; I posted there first because of previous discussion of the mw:Extension:WikimediaBadges extension.

Please respond either here or there, depending on what the appropriate process is on Wikidata. (I'm coming over from en.WP, so I'm not familiar with procedures and norms here.) Thanks. Jonesey95 (talk) 23:32, 3 October 2024 (UTC)Reply

It's not something that can be done via a bot, so it's not really a bot request. The project chat is a good place to have a discussion on Wikidata. That said, Phabricator is likely the main point where a discussion that actually results in a change would happen. ChristianKl13:30, 4 October 2024 (UTC)Reply

We are excited to introduce ProVe, a tool for checking the quality of references in Wikidata. Wikidata item statements should be verifiable and referenced by a source (e.g. a book, scientific paper, etc.), but many statements lack references or could do with better ones. ProVe helps with this by providing information about the quality of the references of Wikidata items, based on techniques like large language models, triple verbalisation, and semantic similarity. We hope that it will be useful for Wikidata editors, and that it will help improve the trustworthiness of Wikidata by aiding in the reference verification task.

We have also developed the **ProVe Gadget**, which visually presents ProVe's results as a widget at the top of a Wikidata item page. Any Wikidata user can easily turn this gadget on; see here for install instructions. You can use it to request the processing of references, show reference scores, navigate problematic references, and quickly fix them with better ones.

We have also started a WikiProject on Reference Verification to provide worklists of priority items, in case you are looking for items to test the tool.

We would greatly appreciate it if interested users would test the tool and let us know any feedback so we can improve it :-) Albert.meronyo (talk) 07:50, 4 October 2024 (UTC)Reply

@Albert.meronyo, thank you for creating this great tool! I just want to note that there is a JavaScript function for loading scripts from other pages, so you don't have to add the whole code to your own common.js. For ProVe it works like this:
mw.loader.load( '//www.wikidata.org/w/index.php?title=User%3A1hangzhao%2FProVe.js&action=raw&ctype=text%2Fjavascript' ); // [[User:1hangzhao/ProVe.js]]
Here's an example edit in my own common.js. Also, remember to add ProVe to the tools catalog so that people can find it more easily. I will give further feedback after I've tested it a bit more :). Samoasambia 11:12, 4 October 2024 (UTC)Reply
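For anyone curious how the loader URL above is put together: it is just the script's page title, URL-encoded, with `action=raw` and `ctype` parameters so MediaWiki serves the page content as plain JavaScript. A minimal sketch in plain JavaScript (the `buildRawScriptUrl` helper is hypothetical, for illustration only, and is not part of MediaWiki's API):

```javascript
// Build the raw-JavaScript URL for a user script page, in the form
// passed to mw.loader.load() above. buildRawScriptUrl is a
// hypothetical helper name, not part of any MediaWiki library.
function buildRawScriptUrl(host, pageTitle) {
  // encodeURIComponent escapes ":" and "/" in the page title,
  // e.g. "User:1hangzhao/ProVe.js" -> "User%3A1hangzhao%2FProVe.js"
  const title = encodeURIComponent(pageTitle);
  const ctype = encodeURIComponent('text/javascript');
  return `//${host}/w/index.php?title=${title}&action=raw&ctype=${ctype}`;
}

console.log(buildRawScriptUrl('www.wikidata.org', 'User:1hangzhao/ProVe.js'));
// → //www.wikidata.org/w/index.php?title=User%3A1hangzhao%2FProVe.js&action=raw&ctype=text%2Fjavascript
```

The resulting string matches the URL in the snippet above; the protocol-relative `//` prefix lets the browser reuse whatever scheme (https) the wiki page was loaded with.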

I accidentally leaked my IP on the 2015 DM319 page. Can someone hide it? MJGTMKME123 (talk) 12:39, 4 October 2024 (UTC)Reply

I hid the IP. ChristianKl 13:15, 4 October 2024 (UTC)Reply
Thanks! MJGTMKME123 (talk) 13:50, 4 October 2024 (UTC)Reply

I think that this discussion is resolved and can be archived. If you disagree, don't hesitate to replace this template with your comment. --Matěj Suchánek (talk) 15:45, 4 October 2024 (UTC)Reply

Please, how do I add my new language Wlx on Wikidata? It is currently under the Wikipedia test wiki project. Zakaria Tunsung (talk) 12:46, 4 October 2024 (UTC)Reply

As far as I know we don't link from Wikidata to the Wikipedia Incubator. The way to get sitelinks from Wikidata is to graduate the wiki out of the Incubator. ChristianKl 13:59, 4 October 2024 (UTC)Reply
ChristianKl is right. Test projects on the Incubator often use the old interwiki linking method ([[lang:page name]]) to connect to articles on other wikis with separate domains. Tmv (talk) 16:44, 4 October 2024 (UTC)Reply

Dear Wikidata community,

We are researchers at King's College London investigating how content gaps arise and can be measured in Wikidata. So far, we have reviewed existing research papers to identify several categories of gaps. However, we have noted a lack of consideration of editors' experiences in this existing research and are keen to hear about editors' views on, and methods for, identifying and addressing content gaps. While this topic has seen a lot of attention in Wikipedia, we believe Wikidata presents unique challenges and content which warrant further investigation. For this reason, we are planning an interactive workshop to gather the perspectives of editors. We already have some sign-ups, but wanted to open it up to the community more widely.

We will be hosting a 90 minute workshop on the 9th of October at 16:00 UK time (15:00 UTC) and would like to invite any interested editors to attend. There are no requirements to participate beyond being an active editor, although unfortunately and by necessity, the workshop will be conducted in English only. The main goal of the workshop is to understand your perspectives on how to measure and monitor content gaps as well as potentially identify further metrics or even to propose new methods to identify and quantify gaps in Wikidata.

Participation is completely voluntary. All personal data will be kept confidential in compliance with GDPR. If you are interested in taking part, you can find out more about the workshop from our participant information sheet. You can also read more about the research on our meta page.

You can sign up to take part via our registration form. If you have any questions or concerns, my contact details are listed at the top of the sign-up form.

Thank you for your time and consideration.

With kind regards, Celestialtoast (talk) 15:20, 4 October 2024 (UTC)Reply

Did you really post this message 40 minutes before the time of your workshop? Some more notice would have helped better participation! — Martin (MSGJ · talk) 15:24, 4 October 2024 (UTC)Reply
Oops, my mistake. Thanks for pointing that out! Date should be corrected now. Celestialtoast (talk) 15:27, 4 October 2024 (UTC)Reply
Ah, that is slightly better :) — Martin (MSGJ · talk) 15:29, 4 October 2024 (UTC)Reply