
More than 100 common search engine spiders and crawlers

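If any of the following user-agent identifiers show up in your site's access logs, the visitor may well be a search engine spider. Block or allow each one according to your needs.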

Yahoo Slurp
Unknown robot (identified by 'crawl')
Googlebot
Yahoo! Slurp China
GouGou
OutfoxBot
GigaBot
Lilina
MSNBot
Java (Often spam bot)
NewsGator Online
BaiDuSpider
Sina Iask Spider
Bloglines
MagpieRSS
Alexa (IA Archiver)
Feedfetcher-Google
MT::Telegraph::Agent
Feedburner
RoJo aggregator
Python-urllib
Jakarta commons-httpclient
Heritrix
Google AdSense
Mydoyouhike
Unknown robot (identified by 'spider')
Voyager
Ocelli
Unknown robot (identified by hit on 'robots.txt')
Unknown robot (identified by 'robot')
larbin
Hylanda
ZyBorg
ZhuaXia
lanshanbot
Technoratibot
Microsoft URL Control
IRLbot
Unknown robot (identified by 'bot/' or 'bot-')
Turn It In
Megite
MSIECrawler
msnbot-media
Sogou Spider
NutchCVS
MJ12bot
Kinjabot
Gaisbot
SurveyBot
Ask
StackRambler
Girafabot
T-H-U-N-D-E-R-S-T-O-N-E
Yahoo Feed Seeker
WordPress
UniversalFeedParser
Sphere Scout
findlinks
SBIder
Yahoo-Blogs
FeedValidator
Yahoo-MMCrawler
lwp-trivial
Webdup
Blogslive
IBM Almaden Research Center WebFountain
Openfind data gatherer
BlogPulse ISSpider intelliseek.com
HPPrint
Walhello appie
BlogSearch
ping.blo.gs
Biz360 spider
UP.Browser
topicblogs
Exabot
Snappy
LinkWalker
BlogBridge Service
The World Wide Web Worm
Nutch
The web archive (IA Archiver)
Feedster
YahooSeeker-Testing
Voila
aipbot
PluckFeedCrawler
Everest-Vulcan
NG 2.x (Exalead)
MS SharePoint Portal Server - MS Search 4.0 Robot
Missigua_Locator
boitho.com-dc
Sunrise
Blogshares Spiders
ExactSeek Crawler
Nagios
nicebot
HTTrack off-line browser
Harvest
SandCrawler (Microsoft)
edgeio-retriever
NG 1.x (Exalead)
HTMLParser
Scooter
Y!J Yahoo Japan
arks
Tagyu Agent
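
To make the list above a little more concrete, here is a minimal Python sketch of how a log line's user-agent might be checked against it. Several things here are assumptions rather than part of the original list: the log is assumed to be in the common "combined" format, where the user-agent is the last quoted field; the substrings in `KNOWN_TOKENS` are an illustrative subset, and real user-agent strings vary by crawler and version; the function names are hypothetical. The fallback rules mirror the "Unknown robot" entries in the list ('crawl', 'spider', 'robot', 'bot/' or 'bot-').

```python
import re

# Illustrative subset only (an assumption): lowercase substring -> name from the list.
# More specific tokens come first so that "msnbot-media" is not reported as "MSNBot".
KNOWN_TOKENS = {
    "googlebot": "Googlebot",
    "baiduspider": "BaiDuSpider",
    "msnbot-media": "msnbot-media",
    "msnbot": "MSNBot",
    "slurp": "Yahoo Slurp",
    "sogou": "Sogou Spider",
    "ia_archiver": "Alexa (IA Archiver)",
    "mj12bot": "MJ12bot",
}

# Generic fallbacks, matching the "Unknown robot" entries in the list.
GENERIC_PATTERNS = [
    ("crawl", "Unknown robot (identified by 'crawl')"),
    ("spider", "Unknown robot (identified by 'spider')"),
    ("robot", "Unknown robot (identified by 'robot')"),
    ("bot/", "Unknown robot (identified by 'bot/' or 'bot-')"),
    ("bot-", "Unknown robot (identified by 'bot/' or 'bot-')"),
]

# Assumes the combined log format: the user-agent is the last quoted field.
LAST_QUOTED_FIELD = re.compile(r'"([^"]*)"\s*$')


def user_agent_from_log_line(line):
    """Return the last quoted field of a log line, or None if there is none."""
    match = LAST_QUOTED_FIELD.search(line)
    return match.group(1) if match else None


def classify_user_agent(user_agent):
    """Return a spider name from the list, or None if nothing matches."""
    ua = user_agent.lower()
    for token, name in KNOWN_TOKENS.items():
        if token in ua:
            return name
    for token, name in GENERIC_PATTERNS:
        if token in ua:
            return name
    return None


if __name__ == "__main__":
    sample = (
        '1.2.3.4 - - [10/Oct/2008:13:55:36 +0800] "GET / HTTP/1.1" 200 2326 '
        '"-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"'
    )
    print(classify_user_agent(user_agent_from_log_line(sample)))
```

Substring matching of this kind is only a first filter: a user-agent string can be forged, so a match indicates a likely spider rather than a verified one.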
