⚓ T256533 Identify accounts with very high login rate
Per T256532, we should identify these accounts so we can contact the owners/maintainers of the individual bots, or of the underlying software, to fix their issue.
In most cases, all these logins aren't needed; just follow https://www.mediawiki.org/wiki/API:Login#Additional_notes
If you are sending a request that should be made by a logged-in user, add the assert=user parameter to the request in order to check whether the user is logged in. If the user is not logged in, an assertuserfailed error code will be returned.
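A minimal sketch of the assert=user check described above, without any network code. The helper names `with_login_assertion` and `session_expired` are hypothetical, and the response shape (a top-level `error` object carrying a `code` field) is an assumption about the decoded JSON; check it against what your API client actually returns.

```python
def with_login_assertion(params):
    """Return a copy of the API parameters with assert=user added."""
    checked = dict(params)
    checked["assert"] = "user"
    return checked


def session_expired(response):
    """Return True if a decoded JSON response reports assertuserfailed.

    Assumes errors look like {"error": {"code": "...", "info": "..."}}.
    """
    error = response.get("error", {})
    return error.get("code") == "assertuserfailed"


# Example: build the parameters for a query that must run as a logged-in user.
params = with_login_assertion({"action": "query", "meta": "userinfo", "format": "json"})
print(params)
```

The idea is that a bot only logs in again when `session_expired` returns True for a response, instead of logging in before every request or every run.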
Event Timeline
Text version for easy copy paste:
JarBot | 114,698 |
ListeriaBot | 70,143 |
Mr.Ibrahembot | 53,443 |
SchlurcherBot | 36,728 |
Hexabot | 34,736 |
Scidudebot | 22,998 |
FaFlo | 16,436 |
WP 1.0 bot | 13,605 |
EmausBot | 7,084 |
Matthobot | 6,846 |
CommonsDelinker | 6,763 |
অভ্যর্থনা কমিটি বট | 5,360 |
HitomiAkane | 3,731 |
DeltaQuadBot | 2,965 |
AlaaBot | 2,667 |
Antigng-bot | 2,632 |
FlickreviewR 2 | 2,616 |
EntretenimientatoBOT | 2,492 |
Luke081515Bot | 2,327 |
Perfect | 1,918 |
Reedy renamed this task from Identify bots with very high login rate to Identify accounts with very high login rate. Jun 27 2020, 2:29 PM
IMHO, anything over 100 in a 48H period (that's just over 2 per hour) is probably doing something wrong
Thanks. I'll review AlaaBot, as it's really weird for it to make 2,667 logins!
Thanks!
2,667/48 is about 55 per hour... so almost one a minute. So something doesn't feel quite right :)
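For anyone who wants to redo this arithmetic for their own bot, here is a small sketch. The function name `login_rates` is made up for illustration; the counts are the 48-hour figures from the list above.

```python
def login_rates(count, hours):
    """Convert a login count over a window into per-hour and per-minute
    rates, plus the average gap between two logins in seconds."""
    per_hour = count / hours
    per_minute = per_hour / 60
    seconds_between = 3600 / per_hour
    return per_hour, per_minute, seconds_between


# 48-hour counts taken from the list in this task.
for name, count in [("JarBot", 114_698), ("AlaaBot", 2_667)]:
    per_hour, per_minute, gap = login_rates(count, 48)
    print(f"{name}: {per_hour:.0f}/hour, {per_minute:.2f}/minute, "
          f"one login every {gap:.1f} s")
```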
Thanks for the notification. The problem is that what I'm running is not a single task, but ten individual tasks. To minimize common mode failures, they were designed to run as individual processes, each able to do its own task by itself, from login to querying to editing, without relying on sessions/tokens from other processes. On finishing, they log out and exit. A batch file runs all of them every 10 minutes.
Reusing sessions or sharing sessions between these tasks would require, at least, communication between processes, which is completely beyond my scope, or at worst, code refactoring.
So at this moment, the only thing I can do is to reduce the frequency at which each of them is run, and my bottom line is once every 30 minutes. Otherwise there'll be service degradation, which is not acceptable. With ten tasks, this will lead to one login every 30/10 = 3 minutes, or 960 per 48 hours.
That's it. I cannot go any further than that. Also, I don't think such a login rate could pose a threat to your servers.
Hundreds or thousands of logins a day is generally a sign of bigger issues.
You don't necessarily need to share sessions between different processes/tasks. You should be able to persist login sessions across different runs of the same tasks though, which may require some work.
If they're long-running processes, they don't need to log in repeatedly, just at the start, and can use the assert method (and error handling) to check they're still logged in.
You don't necessarily need to reduce the run rate; your count isn't particularly high. JarBot logging in 114K times in 48 hours is a bigger example. That's nearly 2.4K an hour, which is 40 per minute, basically one every 1.5 seconds... There is absolutely no reason a bot needs to log in that frequently.
It's not necessarily threats to the server, but more that it should be unnecessary to login that frequently. How often do you get logged out of your browser session, for example?
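One possible way to persist a login session across runs of the same task is to keep the cookies in a file. The sketch below uses only Python's standard `http.cookiejar` module and makes no HTTP requests. The helper names, the file path, and the cookie name, value and domain are all placeholders, not what any wiki actually sets.

```python
import os
from http.cookiejar import Cookie, LWPCookieJar


def load_session(path):
    """Return a cookie jar bound to `path`, loading saved cookies if present."""
    jar = LWPCookieJar(path)
    if os.path.exists(path):
        # Keep session cookies too; expired cookies are still dropped.
        jar.load(ignore_discard=True)
    return jar


def save_session(jar):
    """Write the jar back to the file it was created with."""
    jar.save(ignore_discard=True)


def make_cookie(name, value, domain, expires):
    """Build a cookie by hand (only needed here to demonstrate the jar)."""
    return Cookie(
        version=0, name=name, value=value,
        port=None, port_specified=False,
        domain=domain, domain_specified=True, domain_initial_dot=False,
        path="/", path_specified=True,
        secure=True, expires=expires, discard=False,
        comment=None, comment_url=None, rest={},
    )
```

In a real bot the jar would be attached to the HTTP client, for example with `urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))`, so the server's cookies land in it automatically. Each run would then call `load_session` at start, log in only if an assert=user request fails, and call `save_session` before exiting instead of logging out.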
For a few months now, every time I start a Pywikibot script, it starts with
WARNING: No user is logged in on site wikipedia:hu
Logging in to wikipedia:hu as <account>
I don’t appear on this list as I only start bots manually, so I make at most a few dozen logins a day, but scheduled bots can reach quite large numbers this way.
Can you check the last 12 hours again? I fixed some bugs and I want to know if they were related to this issue. Thank you.
I just checked and it got much better, but it still logs in 10 times a minute, and most of the logins are still on Arabic Wikipedia.
Good to hear that. I fixed more bugs. I would be thankful if you could post JarBot's new login count every 12 hours for the next couple of days, so I know whether the issue persists.
Last 12 hours: 1,206 logins (~1.67 per minute, almost 1/60th of the original number). Thanks!
I'm using this to log in with a bot password for the wikiwho API. Logins should not happen THAT often. Is data available on if/when the login rate for my account drastically increased? I will try to fix it.
Does this help?
This is for the past seven days, and each bar covers a three-hour timespan. (I see a noticeable drop on 25 June, when it went from 20,000 logins every three hours to 600 every three hours.) It still logged in 10,000 times in the past 48 hours.
Hello Ladsgroup, has it gotten lower than that?
YES, last 12 hours only 15. THANK YOU.
Hi, SchlurcherBot should also be fixed. I switched approx. 2 days ago to OAuth verification for the majority of my tasks that need login. Schlurcher
Yes, it went from 200/hour to 10/hour. Thanks!
The average number of logins (everything, everywhere) over the past seven days went from 20K per hour to 7K per hour. The new list for the past 48 hours:
ListeriaBot | 73,771 |
Mr.Ibrahembot | 58,679 |
AGbot | 13,813 |
WP 1.0 bot | 13,585 |
CommonsDelinker | 9,232 |
EmausBot | 8,748 |
Tylernub | 4,218 |
AlaaBot | 4,096 |
Tylerfed | 3,787 |
TylerTok | 2,937 |
FlickreviewR 2 | 2,616 |
Luke081515Bot | 2,313 |
Entretenimientato | 2,244 |
EntretenimientoBorrarBOT | 2,153 |
Perfect | 1,915 |
DeltaQuadBot | 1,830 |
YouTubeReviewBot | 1,805 |
Antigng-bot | 1,791 |
AlbeROBOT | 1,651 |
Tylernok | 1,431 |
If we can reduce the top four to below 5K per day, I think the first phase can be called done.
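As a rough check of that target, the sketch below parses rows in the same "name | count |" layout as the list above and reports which accounts exceed 5K logins per day over the 48-hour window. The function names are hypothetical, and only the first six rows are copied in to keep it short.

```python
ROWS = """\
ListeriaBot | 73,771 |
Mr.Ibrahembot | 58,679 |
AGbot | 13,813 |
WP 1.0 bot | 13,585 |
CommonsDelinker | 9,232 |
EmausBot | 8,748 |
"""


def parse_rows(text):
    """Turn 'name | 1,234 |' lines into a {name: count} dict."""
    counts = {}
    for line in text.splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        if len(cells) >= 2 and cells[0]:
            counts[cells[0]] = int(cells[1].replace(",", ""))
    return counts


def over_daily_limit(counts, window_hours, limit_per_day):
    """Return the accounts whose average daily logins exceed the limit."""
    days = window_hours / 24
    return [name for name, n in counts.items() if n / days > limit_per_day]


print(over_daily_limit(parse_rows(ROWS), window_hours=48, limit_per_day=5000))
```

With these rows, the accounts over the limit are exactly the top four in the list; CommonsDelinker averages about 4.6K per day and falls just under it.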
"Per T256532"
So what's the actual problem here? I cannot view that task.
Can somebody give me an answer concerning this question?
That task is private for security reasons. Suffice to say that there is a potential for abuse that we are trying to prevent.
@Ladsgroup When you have a chance, can I get another count? I want to know how I'm doing in reducing my footprint, and I want to get it as low as possible.
It's 645 in the past 24 hours. In comparison, @Mr.Ibrahem's bot did 50K logins in the past 24 hours. @Ibrahem: Will you fix your bot or should I block it?
Per conversation with Ladsgroup, the bot is now locked until the issue is fixed.
Thanks for the update @Urbanecm
Do we need to refresh the data and see if there are any other accounts that are still logging in too frequently?
Here are current logins ordered by frequency
Mr.Ibrahembot | 127162 |
ListeriaBot | 74121 |
WP 1.0 bot | 20387 |
FaFlo | 15889 |
EmausBot | 12735 |
CommonsDelinker | 11952 |
Matthias Winkelmann | 5351 |
AlaaBot | 4356 |
FlickreviewR 2 | 3776 |
Luke081515Bot | 3482 |
YouTubeReviewBot | 3172 |
Antigng-bot | 2058 |
Lê Lợi (bot) | 1963 |
DeltaQuadBot | 1895 |
Olafbot | 1638 |
WikitanvirBot | 1593 |
MusikBot | 1585 |
Jembot | 1561 |
AlbeROBOT | 1351 |
MusikBot II | 1270 |
from https://logstash.wikimedia.org/goto/d7fcb59c2cc892b96bf1100fd77994df (last 2 days)
@Urbanecm
Hi, sorry for that. I have stopped the most active jobs in my bot and made some changes. Can you unblock the bot?
And is there any way to track my bot logins?
Hi, sorry for that. I have stopped the most active jobs in my bot and made some changes. Can you unblock the bot?
Hello, thanks for the changes you made. I've removed the account lock then.
And is there any way to track my bot logins?
Sadly, that's not possible. The data is recorded by the servers, but it is intentionally non-public. I'm happy to query your account data for you; you can just ping me on IRC, or ask here :).
In T256533#6409637, @Mr.Ibrahem wrote:
And is there any way to track my bot logins?
If you are using pywikibot to create your bots, then you should know that pywikibot reports every login as a log message on the command line. You should look at the output of your bots, not just throw it away.
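If the bot's command-line output is kept in a file, counting logins can be as simple as the sketch below. It assumes the login message looks like the "Logging in to ... as ..." line quoted earlier in this task; the function name and the `marker` parameter are made up for illustration, and other versions may word the message differently.

```python
def count_logins(output_lines, marker="Logging in to "):
    """Count output lines that report a login.

    `marker` is an assumption based on the message quoted in this task.
    """
    return sum(1 for line in output_lines if marker in line)


# Sample output in the shape quoted earlier in this task.
sample = [
    "WARNING: No user is logged in on site wikipedia:hu",
    "Logging in to wikipedia:hu as <account>",
    "Page saved",
]
print(count_logins(sample))
```

Run against the saved output of a scheduled job, this gives a local estimate of the login count without needing access to the server-side logs.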
FYI: According to user agents, the only PWB-based bot is Matthias Winkelmann (who is not really a bot, but a flooder, I don't know if that changes something)
@Urbanecm Can you tell me the login counts for the last 24h and 12h?
Wanted to come here and acknowledge this issue. I've opened a bug against the WP 1.0 project to fix this. It should be taken care of this weekend. Thank you @DeltaQuad for the link to mwclient workaround!
Okay I've deployed the fix to WP 1.0 bot as of tonight's update. It should be good the next time you run the numbers. Please let me know.
Confirmed, in the last two days, I see 28 logins. Thank you!
JFYI: On behalf of the @FaFlo bot, we've planned to work around the problem towards the end of the week.
As of today, FaFlo is the only bot that sends more than 10k requests per 3 days. I sent him a follow-up message via both email and talk page, including a note that the account may be disabled. Apart from that, CommonsDelinker is very close to 10k (it made 9,612 requests). On the positive side, ListeriaBot disappeared from the list.
Thanks @Olgazgovora for the info.
Dear @Urbanecm,
Could you please provide statistics on how the requests changed for the FaFlo bot during the last 3-4 hours? Is there any difference? We removed it for one project (WhoColor) but have not solved it yet for the second (WikiWho). Thank you.
Sincerely,
Olya
Dear @Olgazgovora,
Sure. In short, the bot dramatically increased its number of login requests (from about 5% of all login requests to more than 70%), and as such, it has been temporarily blocked.
[urbanecm@mwlog1001 /srv/mw-log]$ head -n 1 goodpass.log | grep -Eo '^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}'
2020-09-06 08:34:43
[urbanecm@mwlog1001 /srv/mw-log]$ grep FaFlo goodpass.log | wc -l
52325
[urbanecm@mwlog1001 /srv/mw-log]$ tail -n1 goodpass.log | grep -Eo '^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}'
2020-09-06 15:02:22
[urbanecm@mwlog1001 /srv/mw-log]$ wc -l < goodpass.log
72722
According to the logs, your bot made 52,325 successful login attempts (out of 72,722 attempts in total) between 2020-09-06 08:34:43 and 2020-09-06 15:02:22 (i.e. in about 6.5 hours). That means it made over 70% of the requests alone, and that it logs in more than twice per second. Very similar numbers apply to both yesterday (September 05) and the day before (September 04).
As such, I have temporarily locked the account. Please fix that issue, and let me know.
Sincerely,
Martin Urbanec
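The figures in that message can be rechecked from the two timestamps and the two line counts alone. This is only a sketch of the arithmetic: the function name `summarize` is hypothetical, and it does not read goodpass.log itself.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"


def summarize(first, last, account_logins, total_logins):
    """Derive the window length, the account's share of all logins,
    and its average login rate from the numbers quoted above."""
    window = datetime.strptime(last, FMT) - datetime.strptime(first, FMT)
    seconds = window.total_seconds()
    return {
        "hours": seconds / 3600,
        "share": account_logins / total_logins,
        "per_second": account_logins / seconds,
    }


stats = summarize("2020-09-06 08:34:43", "2020-09-06 15:02:22", 52325, 72722)
print(f"{stats['hours']:.1f} h, {stats['share']:.0%} of logins, "
      f"{stats['per_second']:.2f} logins/s")
```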
I've changed the code for the Wikiwho project too. Could you, please, unlock FaFlo and also tell us how the number of login requests changed? Thank you!
Sincerely,
Olya
I have unlocked the user account, and I will monitor the logins.
Thank you. And please let me know if there are changes in the number of logins.
Sure. So far, there were three user logins. Thanks!
I think we can call this resolved. I don't see a bot that logs in extremely frequently now.
Yes, let's create a new one for later cases (like next year) in case the throttle is not implemented by then.
Should we open a new one that would focus on creating an automated mechanism to detect these?
Do we even have similar automated measures? For instance, do DBAs get a notification if a DB's growth suddenly speeds up?
Should we open a new one that would focus on creating an automated mechanism to detect these?
Once T256532 is completed, there shouldn't be any need to manually hunt those.
Do we even have similar automated measures? For instance, do DBAs get a notification if a DB's growth suddenly speeds up?
That'd be a question for the DBAs, @Marostegui?
AFAIK bots are not logged in the CU tables (yet). So this ticket is not relevant to the size of the CU tables. Am I missing something obvious?
You are not. But recall that this ticket is about accounts with very high login rate, not just bots. Right now, it happens that all culprits were bots. But in a future state, a non-bot account (read: a bot without a bot flag) might do the same, and we want to identify that proactively.
Do we even have similar automated measures? For instance, do DBAs get a notification if a DB's growth suddenly speeds up?
That'd be a question for the DBAs, @Marostegui?
We have an alert based on backup size, so if there's a sudden increase in a backup's size from one week to the next, it will get triggered. It obviously depends on how big the increase (on disk) is.
Entretenimientato and EntretenimientoBorrarBOT were both vandal bots operated by a WMF-legal banned user.