Question: how can a website effectively block the dataforseo-bot spider/crawler?

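Author: haotui
Could anyone advise how a website can effectively block the dataforseo-bot crawler? I have already added both "User-agent: dataforseobot" with "Disallow: /" and "User-agent: dataforseo-bot" with "Disallow: /" to robots.txt, but it has had no effect at all. This rogue spider just keeps crawling.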

Tags: crawler, spider

zhujibcom   
Use Baidu Cloud Protection (百度云防护). It blocks this easily.


雨天榕树   
I set this up in the Baota (宝塔) panel: apart from the usual few crawlers, everything else gets cut off.
我思故我在   

Uh...
吕公子   
You can set this up in the Baota (宝塔) panel.
haotui
OP
  
吕公子 posted on 2024-12-27 16:22
You can set this up in the Baota (宝塔) panel.

How do I set it up, please?
haotui
OP
  
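雨天榕树 posted on 2024-12-27 15:54
I set this up in the Baota (宝塔) panel: apart from the usual few crawlers, everything else gets cut off.

How do you set that up in Baota, please?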
golden021   
These junk spiders are so annoying.
吕公子   
haotui posted on 2024-12-27 17:01
How do I set it up, please?

Put this in the site's Nginx configuration file:
  
# Block scraping tools such as Scrapy
if ($http_user_agent ~* (Scrapy|HttpClient|crawl|curb|git|Wtrace)) {
    return 403;
}
# Block the listed User-Agents as well as requests with an empty User-Agent
if ($http_user_agent ~* "CheckMarkNetwork|Synapse|Nimbostratus-Bot|Dark|scraper|LMAO|Hakai|Gemini|Wappalyzer|masscan|crawler4j|Mappy|Center|eright|aiohttp|MauiBot|Crawler|researchscan|Dispatch|AlphaBot|Census|ips-agent|NetcraftSurveyAgent|ToutiaoSpider|EasyHttp|Iframely|sysscan|fasthttp|muhstik|DeuSu|mstshash|HTTP_Request|ExtLinksBot|package|SafeDNSBot|CPython|SiteExplorer|SSH|MegaIndex|BUbiNG|CCBot|NetTrack|Digincore|aiHitBot|SurdotlyBot|null|SemrushBot|Test|Copied|ltx71|Nmap|DotBot|AdsBot|InetURL|Pcore-HTTP|PocketParser|Wotbox|newspaper|DnyzBot|redback|PiplBot|SMTBot|WinHTTP|Auto Spider 1.0|GrabNet|TurnitinBot|Go-Ahead-Got-It|Download Demon|Go!Zilla|GetWeb!|GetRight|libwww-perl|Cliqzbot|MailChimp|SMTBot|Dataprovider|XoviBot|linkdexbot|SeznamBot|Qwantify|spbot|evc-batch|zgrab|Go-http-client|FeedDemon|Jullo|Feedly|YandexBot|oBot|FlightDeckReports|Linguee Bot|JikeSpider|Indy Library|Alexa Toolbar|AskTbFXTV|AhrefsBot|CrawlDaddy|CoolpadWebkit|Java|UniversalFeedParser|ApacheBench|Microsoft URL Control|Swiftbot|ZmEu|jaunty|Python-urllib|lightDeckReports Bot|YYSpider|DigExt|HttpClient|MJ12bot|EasouSpider|LinkpadBot|Ezooms|^$" ) {

    return 403;
}
# Block request methods other than GET, HEAD, and POST
if ($request_method !~ ^(GET|HEAD|POST)$) {
    return 403;
}
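    # Block junk spiders
    # Drop every HEAD request by closing the connection (this overrides the HEAD allowance above)
    if ($request_method ~ ^(HEAD)$ ) {
        return 444;
    }
    # Drop requests whose Range header contains a number of nine or more digits
    if ($http_range ~ "\d{9,}") {
        return 444;
    }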
    if ($http_user_agent ~* (Amazonbot|SemrushBot|python|Linespider|crawler|DingTalkBot|simplecrawler|ZoominfoBot|zoombot|Neevabot|coccocbot|Facebot|YandexBot|Adsbot|DotBot|Applebot|DataForSeoBot|MJ12bot|BLEXBot|trendictionbot0|trendictionbot|AhrefsBot|hubspot|opensiteexplorer|leiki|webmeup|TinyTestBot|Symfony|PetalBot|proximic|GrapeshotCrawler|YaoSouBot|serpstatbot|Scrapy|Go-http-client|CCBot|CensysInspect|facebookexternalhit|GPTBot|ClaudeBot|Python-urllib|meta-externalagent|Yisouspider)) {
        return 444;
    }
    # Files and directories that must not be accessible
    location ~ ^/(\.user\.ini|\.htaccess|\.git|\.env|\.svn|\.project|LICENSE|README\.md)
    {
        return 404;
    }
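
If the goal is only the crawler from the original question, a narrower rule may be enough. What follows is a minimal sketch, not a tested drop-in: it assumes the crawler sends a User-Agent containing "DataForSeoBot" (the name used in the list above) and that you can edit the http context of the main Nginx configuration. The variable name $block_dataforseo and the server_name example.com are hypothetical placeholders.

```nginx
# Goes in the http {} context (a map block is not allowed inside server {}).
# $block_dataforseo is a hypothetical variable name chosen for this sketch.
map $http_user_agent $block_dataforseo {
    default         0;
    # Case-insensitive regex; matches "DataForSeoBot" and "dataforseo-bot" alike.
    ~*dataforseo    1;
}

server {
    listen 80;
    server_name example.com;   # placeholder, replace with the real site

    # Close the connection without sending a response to the matched crawler.
    if ($block_dataforseo) {
        return 444;
    }
}
```

After editing, nginx -t checks the syntax and nginx -s reload applies the change. A request such as curl -A "DataForSeoBot/1.0" http://example.com/ should then come back as an empty reply. Unlike robots.txt, which is only advisory, a server-side rule like this does not depend on the crawler choosing to cooperate.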