User-Agent string

Web clients (pretty much anything that sends requests to a web server) need to identify themselves with a "User-Agent" string. There are specific strings used for various versions of the Internet Explorer and Netscape browsers, and these are often examined by websites that wish to serve different content depending on the browser you are using. The string is also often recorded in server logs so that web server administrators can see where their traffic is coming from.

What is ZoomSpider or ZSEBOT?

By default, Zoom identifies itself with the following string when it indexes a website in Spider Mode:
Can I change the User-Agent text?

You can modify the User-Agent text in the Enterprise Edition only (on the "Advanced" tab of the Configuration window). This allows you to specify your own domain name or text, and can be useful if you wish to program customized behaviour for the spider when indexing your own website. It can also give server admins a way of contacting you if you plan to index many external websites. Note that the "[ZSEBOT]" tag will always remain at the end of the User-Agent string, even if you have specified custom text.

I am a server administrator and I'm seeing "ZoomSpider" or "ZSEBOT" appear in my server logs

Zoom is a software product developed by PassMark Software. Its primary and original purpose is to let web developers index their own website and create a search engine for it. Since then, it has grown increasingly popular for indexing external (other people's) websites, especially among web developers who are building a portal or a search engine for a certain collection of sites. If you are seeing the above User-Agent string in your access logs, it means that someone is running our software to index your website.

NOTE: PassMark does not usually index external websites, except on rare occasions for testing purposes.

If you wish to restrict or control the behaviour of ZoomSpider on your website, you can use the "robots.txt" file to specify "Disallow:" and "Crawl-delay:" parameters. For example, the following "robots.txt" file will block Zoom from indexing any files in a folder named "secret" and any files named "private.html". It will also force a delay of 5 seconds between requests to this start point.
# this is a comment - my robots.txt file for www.mysite.com
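As a rough illustration of the rules described above, the sketch below feeds a comparable "robots.txt" to Python's standard urllib.robotparser module and checks the result. The directive lines, the wildcard "User-agent: *" entry, and the agent string "ExampleSpider [ZSEBOT]" are assumptions made for this example, not the original file or Zoom's actual default string. The standard parser matches "Disallow:" values as path prefixes, so "/private.html" here only covers that file at the site root; Zoom's own matching may differ.

```python
from urllib.robotparser import RobotFileParser

# Assumed directives reconstructed from the description in the text;
# this is NOT the original example file from this page.
ROBOTS_TXT = """\
# illustrative robots.txt for www.mysite.com
User-agent: *
Disallow: /secret/
Disallow: /private.html
Crawl-delay: 5
"""


def load_rules(text):
    """Parse robots.txt content from a string (no network access needed)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# Hypothetical User-Agent value; only the "[ZSEBOT]" tag comes from the text.
agent = "ExampleSpider [ZSEBOT]"

print(rules.can_fetch(agent, "http://www.mysite.com/secret/a.html"))  # False
print(rules.can_fetch(agent, "http://www.mysite.com/private.html"))   # False
print(rules.can_fetch(agent, "http://www.mysite.com/index.html"))     # True
print(rules.crawl_delay(agent))                                       # 5
```

Using the wildcard entry avoids guessing which user-agent token a particular spider matches on; a narrower "User-agent:" line could be substituted once the exact token is known.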
Note that while "robots.txt" support is enabled by default in all versions of Zoom (from V5.1 upwards), it is possible for users to disable this feature. You will, however, always be able to detect the spider by its User-Agent string, which contains the "[ZSEBOT]" tag. For more information on the "robots.txt" file format, please see online resources such as http://www.robotstxt.org/.
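Since the "[ZSEBOT]" tag is always present in the User-Agent string, a server administrator could pick out the spider's requests with a simple substring check. The following is a minimal sketch only: the sample log lines, their combined-log-style layout, and the User-Agent values in them are made up for illustration, and the helper name zoom_spider_hits is hypothetical.

```python
# Tag that the text says always appears at the end of the User-Agent string.
ZSEBOT_TAG = "[ZSEBOT]"


def zoom_spider_hits(log_lines):
    """Return the log lines whose text contains the [ZSEBOT] tag."""
    return [line for line in log_lines if ZSEBOT_TAG in line]


# Made-up access log entries; real log formats vary by server configuration.
SAMPLE_LOG = [
    '203.0.113.5 - - [01/Jan/2024:10:00:00 +0000] "GET / HTTP/1.1" 200 512 '
    '"-" "ExampleCrawler - www.example.com [ZSEBOT]"',
    '203.0.113.9 - - [01/Jan/2024:10:00:02 +0000] "GET / HTTP/1.1" 200 512 '
    '"-" "SomeOtherBrowser/1.0"',
]

hits = zoom_spider_hits(SAMPLE_LOG)
print(len(hits))  # 1
```

In practice the same check could be run over lines read from an access log file; a plain substring test is used here because the text only guarantees the presence of the tag, not the rest of the string.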