⚓ T218918 Some interface messages (e.g. sitenotice, others) are loading old revisions of their messages


Article Images

Some interface messages (e.g. sitenotice, others) are loading old revisions of their messages

In simplewiki, the Mediawiki:Mycontris is set to "My changes" and has been changed more than nine years ago but at the top of the page, it shows "My edits":

image.png (27×648 px, 5 KB)

It's not falling back to English but it's using a really old value instead so I'm confused what's going on here.

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

T218983 suggests that this problem exists for *all* locally overwritten interface messages. Can anyone confirm this?

That ticket also states that a null edit fixes the issue for a moment, but then it's reverted to the old bad state.

I noticed this on itwiki, too, so I'll provide another example for completeness: this week I have changed visualeditor-ca-editsource from "Modifica wikitesto" to "Modifica sorgente". Viewing any page will result in a quick flash of the new message, which is then suddenly replaced by the old one. While trying to debug I went to this page, which uses {{in:visualeditor-ca-editsource}} and showed the old version. Purging the page updated the message to the new one.

Scheduled SWAT deployment between 18:00 and 19:00 UTC.

No SWAT today, it's Friday. Silly me.

After the fix is live, affected messages can be fixed by purging the respective pages in the MediaWiki namespace.
We may be able to purge them in bulk, but not sure yet.

Scheduled SWAT deployment between 18:00 and 19:00 UTC.

There's no SWAT today (it's Friday and you put it in yesterday SWAT) but this task is UBN! giving it permission to deploy at any time.

There's no SWAT today (it's Friday and you put it in yesterday SWAT) but this task is UBN! giving it permission to deploy at any time.

Hahaha! ooops!

Some thoughts on how to update caches:

Calling MessageCache::load() should work, perhaps from a one-off script. There are some caveats to this:

  • The method is protected. Would need to use reflection to call it, or make it accessible.
  • The method is per language. We'd probably want to call it for the content language at least. Other languages may be less important.
  • MessageCache::load() grabs a lock, other threads trying to do the same may pile up. These shouldn't be too frequent, though. Also, running load() shouldn't take very long (wild guess: ten seconds).

MessageCache::clear() could also work, but then we have no control over when and how the re-caching is triggered. For all I know, it may happen during a web request, possibly causing a timeout.

MessageCache::clear would fit the bill, but I'm not sure it will fully be able to avoid cache refill storm due to web requests (it should be fine due to use of locks, but not my area of expertise). Bumping MSG_CACHE_VERSION would probably do the same.

$wgMsgCacheExpiry seems to be unaltered, default is one day, so we could just wait.

MessageCache::recache method would be ideal, but doesn't exist.

Trying to purge with manual edits is both difficult and doesn't update whole cache anyway, so that is out of question.

Perhaps one could try to write a script that temporarily makes one of the conditions on MessageCache::isCacheExpired to return false (most likely the return value of wfTimestampNow) so that it gets recached with (in)direct calls to load() but that's getting way too complicated imho.

I noticed this on itwiki, too, so I'll provide another example for completeness: this week I have changed visualeditor-ca-editsource from "Modifica wikitesto" to "Modifica sorgente". Viewing any page will result in a quick flash of the new message, which is then suddenly replaced by the old one. While trying to debug I went to this page, which uses {{in:visualeditor-ca-editsource}} and showed the old version. Purging the page updated the message to the new one.

But this was on TranslateWiki, not on it.wikipedia.org directly, so if I understand @daniel’s fix correctly that shouldn’t even have been affected? [itwiki:MediaWiki:visualeditor-ca-editsource](https://it.wikipedia.org/wiki/MediaWiki:Visualeditor-ca-editsource) doesn’t exist as a page.

I noticed this on itwiki, too, so I'll provide another example for completeness: this week I have changed visualeditor-ca-editsource from "Modifica wikitesto" to "Modifica sorgente". Viewing any page will result in a quick flash of the new message, which is then suddenly replaced by the old one. While trying to debug I went to this page, which uses {{in:visualeditor-ca-editsource}} and showed the old version. Purging the page updated the message to the new one.

But this was on TranslateWiki, not on it.wikipedia.org directly, so if I understand @daniel’s fix correctly that shouldn’t even have been affected? [itwiki:MediaWiki:visualeditor-ca-editsource](https://it.wikipedia.org/wiki/MediaWiki:Visualeditor-ca-editsource) doesn’t exist as a page.

That's correct. I have to say that I didn't check the patch before writing my comment. It could be a different issue, although that would possibly be a peculiar coincidence.

I noticed this on itwiki, too, so I'll provide another example for completeness: this week I have changed visualeditor-ca-editsource from "Modifica wikitesto" to "Modifica sorgente". Viewing any page will result in a quick flash of the new message, which is then suddenly replaced by the old one. While trying to debug I went to this page, which uses {{in:visualeditor-ca-editsource}} and showed the old version. Purging the page updated the message to the new one.

But this was on TranslateWiki, not on it.wikipedia.org directly, so if I understand @daniel’s fix correctly that shouldn’t even have been affected? [itwiki:MediaWiki:visualeditor-ca-editsource](https://it.wikipedia.org/wiki/MediaWiki:Visualeditor-ca-editsource) doesn’t exist as a page.

That's correct. I have to say that I didn't check the patch before writing my comment. It could be a different issue, although that would possibly be a peculiar coincidence.

Translatewiki.net isn't even running the code that caused this issue yet, so it can't be the same thing. Also, message transclusion is a rather different beast. Seeing a "flash" of another version indicates the usage of some JavaScript trickery to apply localization in the user language. This all sounds like it's a different issue.

@daniel Well, while it's true that some JS is altering the message, that JS isn't from a user or wiki script, as the issue still happens in safemode. Maybe VE is replacing the standard message with its own? But yes, it could be a different bug. I guess we'll see after the patch above is deployed.

MessageCache::recache method would be ideal, but doesn't exist.

It looks to me like MessageCache::loadFromDBWithLock should have more or less the desired effect – specifically, these lines:

$cache = $this->loadFromDB( $code, $mode );
$this->cache->set( $code, $cache );
$saveSuccess = $this->saveToCaches( $cache, 'all', $code );

It’s protected, but could be called from eval.php e. g. via TestingAccessWrapper.

I'm gonna go ahead and say it's an off-by-one error.

For the record: it's not an off-by-one error. The problem was that the query would load all revisions, instead of just the latest revision, and write them all to the cache, all using the same key. So the last revision in the query result would win. Since no order is specified on the query, which revision will come last is undefined in theory, but in practice, it seems to always be the one with the largest revision ID. This will usually be the latest revision, so the bug doesn't occur in most cases, and wasn't caught in testing.

To reproduce the bug, I had to import an old revision, to create a revision with a higher revision ID that was still not the correct current revision. The same effect of creating an old revision with a new/high revision ID could in the past also be triggered by deleting and undeleting a page, but we have been restoring the original revision ID for a while now.

I suspect that the affected messages are ones with undeleted or imported revisions. Or the order of the result set is simply different in production, due to index optimization and query planning on large databases.

It looks to me like MessageCache::loadFromDBWithLock should have more or less the desired effect – specifically, these lines:

You are right, loadFromDBWithLock() is probably the best choice. load() would not work, since it loads from the cache instead of refreshing it. And loadFromDB() itself only loads, it doesn't do anything to caches.

But then, if the bad messages have been coming back periodically (due to some cron job or job runner activity, presumably), I'd expect the good messages to be restored soon, in the same way.

FTR, even if mine is a different issue, it has the same solution: purging the local page of the message, even if it doesn't exist, fixed it.

Just FYI, Wikiversity user:Billinghurst upped the Mediawiki site id by one on Wikiversity and very old site notice disappeared at least for now.

Similar SOPA site notice had to be removed from Portuguese wikipedia yesterday evening.

There was such an issue in ptwiki with regards to sitenotice yesterday, when I performed a zero-edit (i.e. opened for editing and clicked save without changes), the banner was gone.

OK, hot-fix is now emergency-deployed. Theoretically this means this should be fixed for new page impressions. It's probable that some broken pages will be in the caches, however, so logged-out users may well get the wrong experience.

abian added a comment.EditedMar 22 2019, 5:27 PM

What's the status? Is the problem gone?

It works for me if I manually purge the corresponding pages in the MediaWiki namespace (what you suggested on T218918#5047858).

It works for me by manually purging the corresponding pages in the MediaWiki namespace (what you suggested on T218918#5047858).

Can you confirm that they do not revert back to a bad state after an hour or so? THis is what was happening before. Purging always helped, but didn't stick.

Since no order is specified on the query, which revision will come last is undefined in theory, but in practice, it seems to always be the one with the largest revision ID. This will usually be the latest revision, so the bug doesn't occur in most cases, and wasn't caught in testing.

More specifically, at least on MySQL/MariaDB it usually matches the order of whatever index the DB used for the fetching of the rows. For the broken query here there are several options. Which of the several candidate indexes it picks is likely up to the details of the replica's index statistics, which can vary between replicas and might change as new revisions are inserted or deleted. If it picks rev_page_id or page_timestamp that would result in it using the latest revision by rev_id or rev_timestamp, but there are other options.

Screenshot_2019-03-23-08-08-48.png (1×720 px, 188 KB)

Screenshot_2019-03-23-13-10-49.png (1×720 px, 89 KB)

These are two live screenshots of Sitenotice that are shown of October 2018. This is Happening from past 2 days and users are getting misguided due to this. Please fix it.

The sites are hi.wikipedia.org and mr.wikipedia.org

Please fix it.

Did they happen after the fix was released into production? (That is, after 2019-03-22 16:00 UTC, or 2019-03-22 19:30 IST.)

Do the problems remain for logged-out users after action=purge?

OK, can you give any URLs where it occurs? I can't replicate that here.

On hiwiki, I found the wrong sitenotice (this one instead of the one) on this article via Special:Random; it had been generated at an old time (2019-03-08T23:50:51, found in the page HTML). After an action=purge, the timestamp was updated (to 2019-03-23T19:19:07). Once I reloaded the page, it showed the correct (current) sitenotice.

Similarly, on mrwiki, I found the wrong sitenotice (this one instead of blank) on this article, generated at 2019-03-21T20:04:03; purging bumped it to 2019-03-23T19:24:10.

Unfortunately it's not possible for us to purge the old cached pages. A quick hack for the sitenotice would be to set .mw-dismissable-notice { display: none } in the site's Common.css until the caches finally expire in thirty days' time.

Content licensed under Creative Commons Attribution-ShareAlike (CC BY-SA) 4.0 unless otherwise noted; code licensed under GNU General Public License (GPL) 2.0 or later and other open source licenses. By using this site, you agree to the Terms of Use, Privacy Policy, and Code of Conduct. · Wikimedia Foundation · Privacy Policy · Code of Conduct · Terms of Use · Disclaimer · CC-BY-SA · GPL · Credits