Jump to content
TNG Community

Excessive MySQL Resource Usage - How to Optimize?


Mikh

Recommended Posts

Hi,

I recently received a notification from my web hosting provider indicating that my website is using too many resources, particularly MySQL resources, which are significantly exceeding their upper limit => See attached image

MySQL-Time.jpg

What would be the best approach to reduce this usage? I have already blocked several bots and crawlers, but the resource consumption hasn't decreased noticeably. I still want Google and Bing to be able to crawl the site for indexing purposes.

Any advice would be greatly appreciated.

Link to comment
Share on other sites

You might consider changing the filename for some of the pages that bots visit and also have a high "workload" from a database standpoint (e.g. multiple queries). The bots constantly visit/scan my statistics.php page even though I removed it! That one has a lot of number crunching...

 

GOGGS

Link to comment
Share on other sites

The Support Team of my hosting provider provided me with this information:

 

Quote

Most of the MySQL load is caused by queries like the following:

#    SHOW TABLE STATUS FROM `servanhomme_tng` LIKE 'tng_families'
#    SHOW CREATE TABLE `servanhomme_tng`.`tng_families`
#    SHOW TABLE STATUS FROM `servanhomme_tng` LIKE 'tng_trees'
#    SHOW CREATE TABLE `servanhomme_tng`.`tng_trees`
#    SHOW TABLE STATUS FROM `servanhomme_tng` LIKE 'tng_people'
#    SHOW CREATE TABLE `servanhomme_tng`.`tng_people`
# EXPLAIN /*!50100 PARTITIONS*/
SELECT count(f.ID) as fcount
        FROM (tng_families as f, tng_trees)
        LEFT JOIN tng_people AS father ON f.gedcom=father.gedcom AND husband = father.personID
        LEFT JOIN tng_people AS mother ON f.gedcom=mother.gedcom AND wife = mother.personID
        WHERE  f.gedcom="tree1" AND (f.living != 1) AND  (f.gedcom = tng_trees.gedcom)

Please check your database structure and website software in order to make sure they are optimized and do not contain obsolete data. You should also make sure that your database tables have proper indexes.

 

Does this help in any way?

Link to comment
Share on other sites

That's an execution plan and query to count families (like I indicated, from a statistics page) which is normal for the way TNG was written. 

I will say that 1) if your data is "clean" there is no reason to join the people table twice with mothers and fathers just to count families (which is already there) and 2) the restriction to not count families if they are "living" adds complexity.

If you use a 3rd party hosting environment where resources are an issue, you may want to hide or remove some pages as I indicated. 

On AWS, I have re-engineered a lot of pages so lots of complicated queries can be run on a single page.

 

GOGGS

Link to comment
Share on other sites

Not sure if this will help but if you are running Apache 2.4, and using 2.4 directives, the following code should limit access to certain pages. Place the code on separate lines before the <Requireall> tag.

<If "%{HTTP_USER_AGENT} =~ /^.*(ChatGPT|ChatSensei|CogniBot|GPTbot|HelperBot|Lumina|MegaBot|Pixella|Proxima|Quanta|Synthia|TechWhisper).*$/">
	Require all denied
</If>

<FilesMatch "(ahnentafel|anniversaries|calendar|descend|descendtext|descendtables|descendvert|familychart|familygroup|famsearch)\.php">
    <If "%{HTTP_USER_AGENT} =~ /^.*(bot|GPT|slurp|spider|crawler).*$/">
		Require all denied
    </If>
</FilesMatch>

<FilesMatch "(fan|notes|pedigree|pedigreetext|relationshp|relateform|searchform|search|timeline2|verticalchart)\.php">
    <If "%{HTTP_USER_AGENT} =~ /^.*(bot|GPT|slurp|spider|crawler).*$/">
		Require all denied
    </If>
</FilesMatch>

The first if directive should deny access to the bots listed.

The next two <FilesMatch> directives should deny access to specific files by the listed user agents.

Link to comment
Share on other sites

  • 1 month later...
2 hours ago, dmohn said:

In which file should the code suggested by Steve1200 be placed?

Place the code in the .htaccess file.

Link to comment
Share on other sites

  • 3 months later...

I am also having issues with bots exceeding my resource usage limits. Screenshot of my access log a few minutes ago showing Calendar page being accessed by static.vnpt.vn (my biggest culprit currently).

I have created an .htaccess file with a copy at the root and one in the httpdocs directory and the genealogy directory. So far it doesn't seem to have made much difference... Any suggestions as what else I might do? All sites seem to be from Hanoi, or similar geography and are targeting the calendar.php

The IP of static.vnpt.vn and the calendar page should be blocked by the .htaccess file as below, however it doesn't seem to be blocking access. What have I done incorrectly? I've blocked a whack of IP addresses and used the FilesMatch example above. But they still keep coming...

order deny,allow
deny from <14.161.4.110>
deny from <203.162.0.78>
deny from <113.191.154.54>
deny from <123.22.103.164>
deny from <123.20.146.154>
deny from <123.20.122.58>
deny from <14.232.210.209>
deny from <123.20.189.37>
deny from <123.21.100.226>
deny from <14.162.68.210>
deny from <14.241.159.204>
deny from <14.191.168.222>
deny from <113.168.9.84>
deny from <14.227.88.142>
deny from <113.172.164.107>
deny from <14.228.79.134>
deny from <113.168.105.248>
deny from <113.172.216.252>
deny from <222.252.97.87>
deny from <14.191.186.142>
deny from <14.187.72.161>
deny from <14.169.205.64>
deny from <203.162.0.78>
deny from <IP address from the list>

allow from all

<If "%{HTTP_USER_AGENT} =~ /^.*(ChatGPT|ChatSensei|CogniBot|GPTbot|HelperBot|Lumina|MegaBot|Pixella|Proxima|Quanta|Synthia|TechWhisper|PetalBot|static.vnpt.vn).*$/">
    Require all denied
</If>

<FilesMatch "(ahnentafel|anniversaries|calendar|descend|descendtext|descendtables|descendvert|familychart|familygroup|famsearch)\.php">
    <If "%{HTTP_USER_AGENT} =~ /^.*(bot|GPT|slurp|spider|crawler).*$/">
        Require all denied
    </If>
</FilesMatch>

<FilesMatch "(fan|notes|pedigree|pedigreetext|relationshp|relateform|searchform|search|timeline2|verticalchart)\.php">
    <If "%{HTTP_USER_AGENT} =~ /^.*(bot|GPT|slurp|spider|crawler).*$/">
        Require all denied
    </If>
</FilesMatch>

 

Thanks in advance.

Brenda

brpedersen.com/genealogy

Screenshot 2025-07-17 at 7.13.53 PM.png

Link to comment
Share on other sites

Brenda, what is the Apache version on your server ? If it has been upgraded to 2.4 (and maybe above), the syntax has changed and the "deny from" won't give any blocking result. The new syntax is something like "require not" instead.

I confess I asked Chat GPT to transcribe my old htaccess into the new syntax. Don't ask me for more ...

Edited by Katryne
Error : plugins quoted for another CMS. Sorry
Link to comment
Share on other sites

Thanks Katryne. Apparently my server does use Apache 2.4.

Updated syntax using ChatGPT and now I get a 500 internal server error... Damn!! Tried removing denied IP addresses to no avail.

# Deny specific IP addresses
<RequireAll>
    Require not ip <IP address from the list>  # Replace with the specific IP
    Require all granted
</RequireAll>

# Deny User-Agent based on patterns
<If "%{HTTP_USER_AGENT} =~ /^(.*(ChatGPT|ChatSensei|CogniBot|GPTbot|HelperBot|Lumina|MegaBot|Pixella|Proxima|Quanta|Synthia|TechWhisper|PetalBot|static.vnpt.vn).*)$/">
    Require all denied
</If>

# Deny access to certain files if User-Agent matches bot patterns
<FilesMatch "(ahnentafel|anniversaries|calendar|descend|descendtext|descendtables|descendvert|familychart|familygroup|famsearch)\.php">
    <If "%{HTTP_USER_AGENT} =~ /^(.*(bot|GPT|slurp|spider|crawler).*)$/">
        Require all denied
    </If>
</FilesMatch>

# Deny access to additional files if User-Agent matches bot patterns
<FilesMatch "(fan|notes|pedigree|pedigreetext|relationshp|relateform|searchform|search|timeline2|verticalchart)\.php">
    <If "%{HTTP_USER_AGENT} =~ /^(.*(bot|GPT|slurp|spider|crawler).*)$/">
        Require all denied
    </If>
</FilesMatch>

Have removed the .htaccess file for the moment.

Brenda

 

 

 

Link to comment
Share on other sites

Brenda, the command below only affects user agents.

 <If "%{HTTP_USER_AGENT} =~ /^.*(ChatGPT|ChatSensei|CogniBot|GPTbot|HelperBot|Lumina|MegaBot|Pixella|Proxima|Quanta|Synthia|TechWhisper|PetalBot|static.vnpt.vn).*$/">
    Require all denied
</If>

So if static.vnpt.vn is not listed in the user agent string, it is not detected and therefore not blocked.

If you are using Apache 2.4 commands you might try using the following between the <RequireAll></RequireAll> tags:

Require not host static.vnpt.vn

Since the commands shown in your post are used by Apache 2.2, you could try:

deny from static.vnpt.vn

Note: mixing Apache 2.2 and Apache 2.4 commands may result in a server ERROR 500.

If you are running Apache 2.4 but using 2.2 commands, your host has installed the Apache Module mod_access_compat module.
While this allows using some 2.2 commands in 2.4 problems can still occur, depending on the commands.

For more information check the Apache website on upgrading commands.

Link to comment
Share on other sites

I have modified the .htaccess file as below, moving the bot denial to above the <RequireAll> tag

Still no joy as have 500 error if .htaccess file is in root directory. Any and all help most appreciated!!

# Block ChatGPT‑like or generic “bot” user‑agents

<If "%{HTTP_USER_AGENT} =~ /(?:ChatGPT|ChatSensei|CogniBot|GPTbot|HelperBot|Lumina|MegaBot|Pixella|Proxima|Quanta|Synthia|TechWhisper|PetalBot)/i">
    Require all denied
</If>

<FilesMatch "(ahnentafel|anniversaries|calendar|descend|descendtext|descendtables|descendvert|familychart|familygroup|famsearch)\.php">
    <If "%{HTTP_USER_AGENT} =~ /(bot|GPT|slurp|spider|crawler)/i">
        Require all denied
    </If>
</FilesMatch>

<FilesMatch "(fan|notes|pedigree|pedigreetext|relationshp|relateform|searchform|search|timeline2|verticalchart)\.php">
    <If "%{HTTP_USER_AGENT} =~ /(bot|GPT|slurp|spider|crawler)/i">
        Require all denied
    </If>
</FilesMatch>

# -----------------
# Block specific IPs but allow everyone else
# -----------------
<RequireAll>
    Require all granted            # old: Order deny,allow + allow from all
    Require not ip 14.161.4.110
    Require not ip 203.162.0.78
    # (…add/remove IPs here as needed)
    # -----------------
    
    
</RequireAll>

 

 

Link to comment
Share on other sites

  • 1 month later...

I find that I am still having excess traffic - mostly accessing the Calendar page?? See attached. 5 GB traffic already this month...

It's more than a little time consuming to deny each of these Brazil and Argentina hosts in .htaccess, but that's what I've resorted to.  Any thoughts as to why they would be targeting the Calendar page?

Thanks in advance.

Brenda

brpedersen.com

Screenshot 2025-09-04 at 10.10.36 AM.png

Link to comment
Share on other sites

In Setup >> Configuration >> General Settings >> Menus, you could uncheck the Dates and Calendar items.

Maybe this won't be enough, since the bots have now learnt the url structure to be searched. If such is the case, look at my site  to see how I have now blocked most of the bots.

Katryne 

Edited by Katryne
Edit : sorry for that loooong post : my Enter key was stuck and I had not see it had built so many empty lines.
Link to comment
Share on other sites

Perfect. I have added that. I tried Katryne's method of removing the Calendar page from the menu, but I don't seem to have a menus under general settings, just a menu location. I guess my site is a hybrid rather than template. I haven't yet recalled where the menus as of yet.

Brenda

Link to comment
Share on other sites

  • 2 weeks later...

Hello Brenda !

I use 2 different mods. They MUST be used together.

1/ Image captcha
2/ Guardian (Steve tells me his mod Bot Manager can do a similar job, with some options, such as choosing between Image Captcha and Google Captcha)

Image Captcha builds a barrier on certain pages only. Adding Guardian makes Image Captcha active on all landing pages. Bot Manager lets you choose the pages where Image Captcha should be used more precisely, but I didn't try Steve's mod, as I was satisfied with Guardian's performance.

Image Captcha by Rick Bisbee https://tng.lythgoes.net/wiki/index.php/Image_Captcha

Guardian mod by Rick Bisbee https://tng.lythgoes.net/wiki/index.php/Guardian_Mod

Bot Manager by Steven Davis https://tng.lythgoes.net/wiki/index.php/Bot_Manager

 

Link to comment
Share on other sites

1 hour ago, Katryne said:

2/ Guardian (Steve tells me his mod Bot Manager can do a similar job, with some options, such as choosing between Image Captcha and Google Captcha)

It also allows specified bots to index your website. Those bots are specified by you in the options menu.
Soon to be released is a version that works with hcaptcha which has a free version if you stay below 100.000 hits a month.

You can see a working version of hcaptcha on my website. You can set it to passive mode so it doesn't require solving all kinds of puzzles or challenges which makes it more user friendly.

 

Link to comment
Share on other sites

Join the conversation

You can post now and register later. If you have an account, sign in now to post with your account.

Guest
Reply to this topic...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...