Mikh Posted February 7, 2025 Report Share Posted February 7, 2025 Hi, I recently received a notification from my web hosting provider indicating that my website is using too many resources, particularly MySQL resources, which are significantly exceeding their upper limit => See attached image What would be the best approach to reduce this usage? I have already blocked several bots and crawlers, but the resource consumption hasn't decreased noticeably. I still want Google and Bing to be able to crawl the site for indexing purposes. Any advice would be greatly appreciated. Quote Link to comment Share on other sites More sharing options...
GOGGS Posted February 11, 2025 Report Share Posted February 11, 2025 You might consider changing the filename for some of the pages that bots visit and also have a high "workload" from a database standpoint (e.g. multiple queries). The bots constantly visit/scan my statistics.php page even though I removed it! That one has a lot of number crunching... GOGGS Quote Link to comment Share on other sites More sharing options...
Mikh Posted February 11, 2025 Author Report Share Posted February 11, 2025 Thank you. I will try this out and see if it reduces the load Quote Link to comment Share on other sites More sharing options...
Mikh Posted February 12, 2025 Author Report Share Posted February 12, 2025 The Support Team of my hosting provider provided me with this information: Quote Most of the MySQL load is caused by queries like the following: # SHOW TABLE STATUS FROM `servanhomme_tng` LIKE 'tng_families' # SHOW CREATE TABLE `servanhomme_tng`.`tng_families` # SHOW TABLE STATUS FROM `servanhomme_tng` LIKE 'tng_trees' # SHOW CREATE TABLE `servanhomme_tng`.`tng_trees` # SHOW TABLE STATUS FROM `servanhomme_tng` LIKE 'tng_people' # SHOW CREATE TABLE `servanhomme_tng`.`tng_people` # EXPLAIN /*!50100 PARTITIONS*/ SELECT count(f.ID) as fcount FROM (tng_families as f, tng_trees) LEFT JOIN tng_people AS father ON f.gedcom=father.gedcom AND husband = father.personID LEFT JOIN tng_people AS mother ON f.gedcom=mother.gedcom AND wife = mother.personID WHERE f.gedcom="tree1" AND (f.living != 1) AND (f.gedcom = tng_trees.gedcom) Please check your database structure and website software in order to make sure they are optimized and do not contain obsolete data. You should also make sure that your database tables have proper indexes. Does this help in any way? Quote Link to comment Share on other sites More sharing options...
GOGGS Posted February 13, 2025 Report Share Posted February 13, 2025 That's an execution plan and query to count families (like I indicated, from a statistics page) which is normal for the way TNG was written. I will say that 1) if your data is "clean" there is no reason to join the people table twice with mothers and fathers just to count families (which is already there) and 2) the restriction to not count families if they are "living" adds complexity. If you use a 3rd party hosting environment where resources are an issue, you may want to hide or remove some pages as I indicated. On AWS, I have re-engineered a lot of pages so lots of complicated queries can be run on a single page. GOGGS Quote Link to comment Share on other sites More sharing options...
Steve1200 Posted February 13, 2025 Report Share Posted February 13, 2025 Not sure if this will help but if you are running Apache 2.4, and using 2.4 directives, the following code should limit access to certain pages. Place the code on separate lines before the <Requireall> tag. <If "%{HTTP_USER_AGENT} =~ /^.*(ChatGPT|ChatSensei|CogniBot|GPTbot|HelperBot|Lumina|MegaBot|Pixella|Proxima|Quanta|Synthia|TechWhisper).*$/"> Require all denied </If> <FilesMatch "(ahnentafel|anniversaries|calendar|descend|descendtext|descendtables|descendvert|familychart|familygroup|famsearch)\.php"> <If "%{HTTP_USER_AGENT} =~ /^.*(bot|GPT|slurp|spider|crawler).*$/"> Require all denied </If> </FilesMatch> <FilesMatch "(fan|notes|pedigree|pedigreetext|relationshp|relateform|searchform|search|timeline2|verticalchart)\.php"> <If "%{HTTP_USER_AGENT} =~ /^.*(bot|GPT|slurp|spider|crawler).*$/"> Require all denied </If> </FilesMatch> The first if directive should deny access to the bots listed. The next two <FilesMatch> directives should deny access to specific files by the listed user agents. Quote Link to comment Share on other sites More sharing options...
Mikh Posted February 16, 2025 Author Report Share Posted February 16, 2025 Thank you both so much! The MySQL traffic is now down by more than 90%, and I will be able to continue using the shared server model. Perfect! Quote Link to comment Share on other sites More sharing options...
dmohn Posted April 11, 2025 Report Share Posted April 11, 2025 In which file should the code suggested by Steve1200 be placed? Quote Link to comment Share on other sites More sharing options...
Katryne Posted April 11, 2025 Report Share Posted April 11, 2025 You might also try the restrict access mod that does something like what Steve suggests. I tried it, but found it too restrictive. It limits access to a number of files if the visitor is not a connected member. https://tng.lythgoes.net/wiki/index.php?title=Restrict_Access_mod Quote Link to comment Share on other sites More sharing options...
Steve1200 Posted April 11, 2025 Report Share Posted April 11, 2025 2 hours ago, dmohn said: In which file should the code suggested by Steve1200 be placed? Place the code in the .htaccess file. Quote Link to comment Share on other sites More sharing options...
Brenda Posted July 18, 2025 Report Share Posted July 18, 2025 I am also having issues with bots exceeding my resource usage limits. Screenshot of my access log a few minutes ago showing Calendar page being accessed by static.vnpt.vn (my biggest culprit currently). I have created an .htaccess file with a copy at the root and one in the httpdocs directory and the genealogy directory. So far it doesn't seem to have made much difference... Any suggestions as what else I might do? All sites seem to be from Hanoi, or similar geography and are targeting the calendar.php The IP of static.vnpt.vn and the calendar page should be blocked by the .htaccess file as below, however it doesn't seem to be blocking access. What have I done incorrectly? I've blocked a whack of IP addresses and used the FilesMatch example above. But they still keep coming... order deny,allow deny from <14.161.4.110> deny from <203.162.0.78> deny from <113.191.154.54> deny from <123.22.103.164> deny from <123.20.146.154> deny from <123.20.122.58> deny from <14.232.210.209> deny from <123.20.189.37> deny from <123.21.100.226> deny from <14.162.68.210> deny from <14.241.159.204> deny from <14.191.168.222> deny from <113.168.9.84> deny from <14.227.88.142> deny from <113.172.164.107> deny from <14.228.79.134> deny from <113.168.105.248> deny from <113.172.216.252> deny from <222.252.97.87> deny from <14.191.186.142> deny from <14.187.72.161> deny from <14.169.205.64> deny from <203.162.0.78> deny from <IP address from the list> allow from all <If "%{HTTP_USER_AGENT} =~ /^.*(ChatGPT|ChatSensei|CogniBot|GPTbot|HelperBot|Lumina|MegaBot|Pixella|Proxima|Quanta|Synthia|TechWhisper|PetalBot|static.vnpt.vn).*$/"> Require all denied </If> <FilesMatch "(ahnentafel|anniversaries|calendar|descend|descendtext|descendtables|descendvert|familychart|familygroup|famsearch)\.php"> <If "%{HTTP_USER_AGENT} =~ /^.*(bot|GPT|slurp|spider|crawler).*$/"> Require all denied </If> </FilesMatch> <FilesMatch "(fan|notes|pedigree|pedigreetext|relationshp|relateform|searchform|search|timeline2|verticalchart)\.php"> <If "%{HTTP_USER_AGENT} =~ /^.*(bot|GPT|slurp|spider|crawler).*$/"> Require all denied </If> </FilesMatch> Thanks in advance. Brenda brpedersen.com/genealogy Quote Link to comment Share on other sites More sharing options...
Katryne Posted July 18, 2025 Report Share Posted July 18, 2025 (edited) Brenda, what is the Apache version on your server ? If it has been upgraded to 2.4 (and maybe above), the syntax has changed and the "deny from" won't give any blocking result. The new syntax is something like "require not" instead. I confess I asked Chat GPT to transcribe my old htaccess into the new syntax. Don't ask me for more ... Edited July 18, 2025 by Katryne Error : plugins quoted for another CMS. Sorry Quote Link to comment Share on other sites More sharing options...
Brenda Posted July 18, 2025 Report Share Posted July 18, 2025 Thanks Katryne. Apparently my server does use Apache 2.4. Updated syntax using ChatGPT and now I get a 500 internal server error... Damn!! Tried removing denied IP addresses to no avail. # Deny specific IP addresses <RequireAll> Require not ip <IP address from the list> # Replace with the specific IP Require all granted </RequireAll> # Deny User-Agent based on patterns <If "%{HTTP_USER_AGENT} =~ /^(.*(ChatGPT|ChatSensei|CogniBot|GPTbot|HelperBot|Lumina|MegaBot|Pixella|Proxima|Quanta|Synthia|TechWhisper|PetalBot|static.vnpt.vn).*)$/"> Require all denied </If> # Deny access to certain files if User-Agent matches bot patterns <FilesMatch "(ahnentafel|anniversaries|calendar|descend|descendtext|descendtables|descendvert|familychart|familygroup|famsearch)\.php"> <If "%{HTTP_USER_AGENT} =~ /^(.*(bot|GPT|slurp|spider|crawler).*)$/"> Require all denied </If> </FilesMatch> # Deny access to additional files if User-Agent matches bot patterns <FilesMatch "(fan|notes|pedigree|pedigreetext|relationshp|relateform|searchform|search|timeline2|verticalchart)\.php"> <If "%{HTTP_USER_AGENT} =~ /^(.*(bot|GPT|slurp|spider|crawler).*)$/"> Require all denied </If> </FilesMatch> Have removed the .htaccess file for the moment. Brenda Quote Link to comment Share on other sites More sharing options...
Steve1200 Posted July 18, 2025 Report Share Posted July 18, 2025 Brenda, the command below only affects user agents. <If "%{HTTP_USER_AGENT} =~ /^.*(ChatGPT|ChatSensei|CogniBot|GPTbot|HelperBot|Lumina|MegaBot|Pixella|Proxima|Quanta|Synthia|TechWhisper|PetalBot|static.vnpt.vn).*$/"> Require all denied </If> So if static.vnpt.vn is not listed in the user agent string, it is not detected and therefore not blocked. If you are using Apache 2.4 commands you might try using the following between the <RequireAll></RequireAll> tags: Require not host static.vnpt.vn Since the commands shown in your post are used by Apache 2.2, you could try: deny from static.vnpt.vn Note: mixing Apache 2.2 and Apache 2.4 commands may result in a server ERROR 500. If you are running Apache 2.4 but using 2.2 commands, your host has installed the Apache Module mod_access_compat module. While this allows using some 2.2 commands in 2.4 problems can still occur, depending on the commands. For more information check the Apache website on upgrading commands. Quote Link to comment Share on other sites More sharing options...
Brenda Posted July 18, 2025 Report Share Posted July 18, 2025 I have modified the .htaccess file as below, moving the bot denial to above the <RequireAll> tag Still no joy as have 500 error if .htaccess file is in root directory. Any and all help most appreciated!! # Block ChatGPT‑like or generic “bot” user‑agents <If "%{HTTP_USER_AGENT} =~ /(?:ChatGPT|ChatSensei|CogniBot|GPTbot|HelperBot|Lumina|MegaBot|Pixella|Proxima|Quanta|Synthia|TechWhisper|PetalBot)/i"> Require all denied </If> <FilesMatch "(ahnentafel|anniversaries|calendar|descend|descendtext|descendtables|descendvert|familychart|familygroup|famsearch)\.php"> <If "%{HTTP_USER_AGENT} =~ /(bot|GPT|slurp|spider|crawler)/i"> Require all denied </If> </FilesMatch> <FilesMatch "(fan|notes|pedigree|pedigreetext|relationshp|relateform|searchform|search|timeline2|verticalchart)\.php"> <If "%{HTTP_USER_AGENT} =~ /(bot|GPT|slurp|spider|crawler)/i"> Require all denied </If> </FilesMatch> # ----------------- # Block specific IPs but allow everyone else # ----------------- <RequireAll> Require all granted # old: Order deny,allow + allow from all Require not ip 14.161.4.110 Require not ip 203.162.0.78 # (…add/remove IPs here as needed) # ----------------- </RequireAll> Quote Link to comment Share on other sites More sharing options...
Brenda Posted July 18, 2025 Report Share Posted July 18, 2025 Steve has helped me out with this, bless his heart!! Thank you so much for all your input! Quote Link to comment Share on other sites More sharing options...
Brenda Posted September 4, 2025 Report Share Posted September 4, 2025 I find that I am still having excess traffic - mostly accessing the Calendar page?? See attached. 5 GB traffic already this month... It's more than a little time consuming to deny each of these Brazil and Argentina hosts in .htaccess, but that's what I've resorted to. Any thoughts as to why they would be targeting the Calendar page? Thanks in advance. Brenda brpedersen.com Quote Link to comment Share on other sites More sharing options...
Brenda Posted September 4, 2025 Report Share Posted September 4, 2025 Is there a way to just deny the Calendar page access for all?? The page isn't worth the trouble. Thanks! Quote Link to comment Share on other sites More sharing options...
Katryne Posted September 4, 2025 Report Share Posted September 4, 2025 (edited) In Setup >> Configuration >> General Settings >> Menus, you could uncheck the Dates and Calendar items. Maybe this won't be enough, since the bots have now learnt the url structure to be searched. If such is the case, look at my site to see how I have now blocked most of the bots. Katryne Edited September 4, 2025 by Katryne Edit : sorry for that loooong post : my Enter key was stuck and I had not see it had built so many empty lines. Quote Link to comment Share on other sites More sharing options...
Steve1200 Posted September 4, 2025 Report Share Posted September 4, 2025 To completely stop access to the calendar page add the following code in .htaccess: <Files "calendar.php"> Require all denied </Files> Quote Link to comment Share on other sites More sharing options...
Brenda Posted September 4, 2025 Report Share Posted September 4, 2025 Perfect. I have added that. I tried Katryne's method of removing the Calendar page from the menu, but I don't seem to have a menus under general settings, just a menu location. I guess my site is a hybrid rather than template. I haven't yet recalled where the menus as of yet. Brenda Quote Link to comment Share on other sites More sharing options...
Brenda Posted September 13, 2025 Report Share Posted September 13, 2025 Katryne - where did you get the bot blocker you use on your site? I like the choose difference aspect. Thank you. Brenda Quote Link to comment Share on other sites More sharing options...
Katryne Posted September 14, 2025 Report Share Posted September 14, 2025 Hello Brenda ! I use 2 different mods. They MUST be used together. 1/ Image captcha 2/ Guardian (Steve tells me his mod Bot Manager can do a similar job, with some options, such as choosing between Image Captcha and Google Captcha) Image Captcha builds a barrier on certain pages only. Adding Guardian makes Image Captcha active on all landing pages. Bot Manager lets you choose the pages where Image Captcha should be used more precisely, but I didn't try Steve's mod, as I was satisfied with Guardian's performance. Image Captcha by Rick Bisbee : https://tng.lythgoes.net/wiki/index.php/Image_Captcha Guardian mod by Rick Bisbee : https://tng.lythgoes.net/wiki/index.php/Guardian_Mod Bot Manager by Steven Davis : https://tng.lythgoes.net/wiki/index.php/Bot_Manager Quote Link to comment Share on other sites More sharing options...
Rob Severijns Posted September 14, 2025 Report Share Posted September 14, 2025 1 hour ago, Katryne said: 2/ Guardian (Steve tells me his mod Bot Manager can do a similar job, with some options, such as choosing between Image Captcha and Google Captcha) It also allows specified bots to index your website. Those bots are specified by you in the options menu. Soon to be released is a version that works with hcaptcha which has a free version if you stay below 100.000 hits a month. You can see a working version of hcaptcha on my website. You can set it to passive mode so it doesn't require solving all kinds of puzzles or challenges which makes it more user friendly. Quote Link to comment Share on other sites More sharing options...
Steve1200 Posted September 15, 2025 Report Share Posted September 15, 2025 I just updated Bot Manager to support hcaptcha and added hcaptcha to the WIKI. Quote Link to comment Share on other sites More sharing options...
Recommended Posts
Join the conversation
You can post now and register later. If you have an account, sign in now to post with your account.