dlech Posted March 7, 2025

Over the last year or so, the number of bots continually fetching data has become unmanageable for me. I have decided to require login for all users. I changed Loginlib.php to provide default username/password entries for a visitor (guest) account. This allows people who do not want to request an account to still gain entry to my data. But doing this will also kill access for search engine indexing. So I want to require login for everyone except a select few search engine indexers, for example googlebot.

First let me say I have a technical background but retired decades ago, long before PHP etc. Basically I know enough to be dangerous.

I have RIP Prevention Mod already installed. Looking at the code, I think I see where it fetches and checks the network data for the user:

    $remip = getenv( "REMOTE_ADDR" );
    if( !$remip ) $remip = $_SERVER['REMOTE_ADDR'];
    $remhost = getenv( "REMOTE_HOST" );
    if( !$remhost ) $remhost = isset($_SERVER['REMOTE_HOST']) ? $_SERVER['REMOTE_HOST'] : "";
    if( !$remhost ) $remhost = @gethostbyaddr( $remip );
    if( !$remhost ) $remhost = $remip;
    if( $charset === "UTF-8" ) $remhost = utf8_encode($remhost);
    else $remhost = utf8_decode($remhost);

To check for googlebot...

    if (strpos($remhost,"google") !== false) {
        $remhost = "googlebot.com";
        $remip = "66.249.6x.x";
    } elseif (strpos($remip,"66.249.6") !== false) {
        $remhost = "googlebot.com";
        $remip = "66.249.6x.x";
    }

I do not know the sequence of the calls to the RIP Prevention Mod and the Login/Loginlib code, so I cannot tell which code gets executed first. But looking at it simplistically, I want to do something like:

    if ($remhost !== "googlebot.com" && $remip !== "66.249.6x.x") {
        // require Login
    }

Hopefully you can see what I'm driving at. Any suggestions would be appreciated.

Thanks
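One caveat on the hostname check in the post above: a substring test such as strpos($remhost,"google") passes for any host whose reverse DNS name merely contains "google", and the owner of an IP address controls that name. A stricter approach, which to my knowledge is the one Google itself documents, is to reverse-resolve the IP, require the hostname to end in a known crawler domain, and then forward-resolve that hostname to confirm it maps back to the same IP. The following is only a minimal sketch of that idea, not code from the RIP Prevention Mod or from TNG. The function name is_verified_crawler, its parameters, and the default suffix list are assumptions made for illustration; the injectable resolvers exist only so the logic can be exercised without network access.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of TNG or any mod).
 * Returns true when $ip reverse-resolves to a hostname ending in one of
 * $allowedSuffixes AND that hostname forward-resolves back to $ip.
 * $reverse and $forward default to PHP's own DNS functions and can be
 * replaced with stubs for offline testing.
 */
function is_verified_crawler(
    string $ip,
    array $allowedSuffixes = ['.googlebot.com', '.google.com'], // assumed list
    ?callable $reverse = null,
    ?callable $forward = null
): bool {
    $reverse = $reverse ?? 'gethostbyaddr';
    $forward = $forward ?? 'gethostbynamel'; // IPv4 only

    $host = $reverse($ip);
    // gethostbyaddr() returns false on malformed input and the
    // unmodified IP string when no reverse record exists.
    if (!is_string($host) || $host === $ip) {
        return false;
    }
    $host = strtolower(rtrim($host, '.'));

    // The hostname must END with an allowed suffix, not merely contain it.
    $suffixOk = false;
    foreach ($allowedSuffixes as $suffix) {
        if (substr($host, -strlen($suffix)) === $suffix) {
            $suffixOk = true;
            break;
        }
    }
    if (!$suffixOk) {
        return false;
    }

    // Forward-confirm: the claimed hostname must map back to the same IP.
    $ips = $forward($host);
    return is_array($ips) && in_array($ip, $ips, true);
}
```

The login requirement would then be skipped only when is_verified_crawler($remip) returns true. For a Bing crawler, adding '.search.msn.com' to the suffix list is the analogous step, as far as I know. Since each check costs two DNS lookups, caching the result per IP is probably worthwhile on a busy site.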
Rob Severijns Posted September 20, 2025

Hello dlech,

I just noticed this post of yours from six months ago, which went unanswered. Have a look at Bot Manager - TNG_Wiki, combined with a captcha option.

Hope this helps
Rob Severijns Posted September 20, 2025

By the way, you also need to set this in Setup >> Configuration >> General Settings >> Privacy:

Require Login: Yes

Also check the other settings there.
dlech Posted December 3, 2025

Hi Rob,

Thank you for your reply. In April 2025, when I started working on this, I had RIP Prevention Mod and RIP Challenge Mod installed. I don’t recall seeing Bot Manager Mod back then.

What I ended up with is this. I modified the code so GoogleBot and MsnBot are allowed unrestricted access but all other access requires the user to log in. I changed the Login page so it provides a preset username and password, so that all guest users log in using the same Guest account. I also modified and enhanced the Admin log messages so I could get more detail on who was attempting to access data and what data they were trying to access (i.e. PHP module name, IP address, domain name, etc.).

At that time, I also started saving Google and Msn index packets-per-hour data, indexed URL counts, and Guest access counts in a spreadsheet. The first thing I noticed was how my hosting company and/or AWS, which they used, was severely restricting the number of Google and Msn indexing packets. I am very aware of the problem hosting companies have with Google and Msn bombarding them with packets, but I am also aware that if my website never gets indexed, few will find it. My hosting company sincerely tried to remedy the packet restriction but never really succeeded. So, in September I changed hosting companies, picking a non-AWS-related host. In 3 months, the number of indexed URLs doubled and the number of new legit users per day tripled.

To help matters, I periodically scan the Admin log messages and address the worst offenders by blocking specific IP addresses or subnets. The new hosting company also provides the ability to block users by geographic location, so the worst offenders in the Far East can be blocked easily. That was a huge help.

I’ve been running with the modified code since April with few glitches. I know that blocking offenders is an unwinnable game of Whack-a-Mole, but over time the number of offenders has decreased. If nothing else, the login requirement (the "Are you Human?" technique) seems to thwart the countless AI bots probing for data.

For now, I think I’ve won the battle. But I have absolutely no delusions about winning the war…
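For readers who want to do the "block specific IP addresses or subnets" step in PHP rather than at the hosting level, here is a rough sketch of an IPv4 subnet match. It is not the code dlech used, which was not posted. The function names ipv4_in_cidr and is_blocked are made up for this example, the sketch assumes a 64-bit PHP build, and it does not handle IPv6.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: true when IPv4 address $ip lies inside $cidr.
 * $cidr is either "a.b.c.d/n" or a bare address (treated as /32).
 * Assumes 64-bit PHP, where ip2long() returns a non-negative integer.
 */
function ipv4_in_cidr(string $ip, string $cidr): bool
{
    $parts = explode('/', $cidr, 2);
    // Reject a prefix length that is not a plain run of digits.
    if (isset($parts[1]) && !ctype_digit($parts[1])) {
        return false;
    }
    $bits    = isset($parts[1]) ? (int) $parts[1] : 32;
    $ipLong  = ip2long($ip);
    $netLong = ip2long($parts[0]);
    if ($ipLong === false || $netLong === false || $bits > 32) {
        return false;
    }
    // Build the network mask; a /0 prefix matches every address.
    $mask = $bits === 0 ? 0 : (~0 << (32 - $bits)) & 0xFFFFFFFF;
    return ($ipLong & $mask) === ($netLong & $mask);
}

/** Hypothetical helper: true when $ip matches any entry of $blocklist. */
function is_blocked(string $ip, array $blocklist): bool
{
    foreach ($blocklist as $cidr) {
        if (ipv4_in_cidr($ip, $cidr)) {
            return true;
        }
    }
    return false;
}
```

A blocklist for this sketch would be a plain array such as ['198.51.100.9', '203.0.113.0/24'] (documentation-only placeholder addresses, not real offenders), checked against the visitor's IP before any page is served.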
Rob Severijns Posted December 3, 2025

Good to see you "won" the battle for now.