  • How to Prevent Downloading of Your Entire Website


    Preventing Web Site Downloading Using robots.txt

    The first step is to disallow the downloading programs in your robots.txt file. To do this, decide which bad robots you wish to keep out, then list each one by its user agent name and disallow it from the whole site.

    Disallowing bad programs in robots.txt does not prevent all web site downloading. Compliance with robots.txt is voluntary, and many bad programs simply ignore the file and download whatever they want.
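
    To make the idea concrete, here is a minimal Python sketch of what a well-behaved robot does with such a file. The robots.txt contents, the agent names in it, and the helper name polite_robot_may_fetch are illustrative assumptions, not taken from any particular tool; check each program's documentation for the name it really uses.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one record per downloader, each barred from the
# whole site. The agent names are assumptions made for this example.
ROBOTS_TXT = """\
User-agent: HTTrack
Disallow: /

User-agent: WebCopier
Disallow: /

User-agent: WebReaper
Disallow: /
"""


def polite_robot_may_fetch(user_agent: str, url: str) -> bool:
    """Answer the question a well-behaved robot asks before requesting a URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "http://www.example.com/index.html"
    print(polite_robot_may_fetch("HTTrack", url))      # a listed downloader
    print(polite_robot_may_fetch("SomeBrowser", url))  # an agent that is not listed
```

    The check runs inside the robot, not on your server. A program that never performs it is not slowed down by robots.txt at all.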

    Preventing Web Site Downloading Using User Agent Blocking in httpd.conf

    Another method is to block the downloading programs by their user agent in httpd.conf.

    Add a SetEnvIfNoCase line for every agent you wish to exclude, then deny any request that carries the resulting flag:

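    # Flag requests whose User-Agent starts with a known downloader name.
    # The match is case-insensitive. A pattern containing a space must be quoted.
    SetEnvIfNoCase User-Agent ^Httrack keep_away
    SetEnvIfNoCase User-Agent "^Offline Explorer" keep_away
    SetEnvIfNoCase User-Agent ^psbot keep_away
    SetEnvIfNoCase User-Agent ^Teleport keep_away
    SetEnvIfNoCase User-Agent ^WebCopier keep_away
    SetEnvIfNoCase User-Agent ^WebReaper keep_away
    SetEnvIfNoCase User-Agent ^Webstripper keep_away

    # Refuse flagged requests. This is Apache 2.2 syntax and belongs inside the
    # <Directory> or <Location> section that covers your site.
    # Apache 2.4 replaces these three lines with Require directives.
    Order Allow,Deny
    Allow from all
    Deny from env=keep_away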

    User agent blocking does not prevent all web site downloading either, because the user agent is supplied by the client. A user can remove the header entirely or spoof it so that the program appears to be Internet Explorer or another common browser.
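
    The weakness is easy to see if the matching rule is written out. The following Python sketch imitates the SetEnvIfNoCase patterns above with a single regular expression. The function name keep_away is borrowed from the environment variable, and the sample User-Agent strings are made up for illustration, not captured from real traffic.

```python
import re

# Same prefixes as the SetEnvIfNoCase lines above.
BLOCKED_PREFIXES = [
    "Httrack",
    "Offline Explorer",
    "psbot",
    "Teleport",
    "WebCopier",
    "WebReaper",
    "Webstripper",
]

# "^" anchors the match at the start of the header; re.IGNORECASE mirrors the
# "NoCase" part of SetEnvIfNoCase.
BLOCKED_RE = re.compile(
    "^(?:" + "|".join(re.escape(p) for p in BLOCKED_PREFIXES) + ")",
    re.IGNORECASE,
)


def keep_away(user_agent: str) -> bool:
    """Return True if the server rule would flag this User-Agent header."""
    return BLOCKED_RE.search(user_agent) is not None


if __name__ == "__main__":
    samples = [
        "WebCopier v4.6",                             # honest downloader: flagged
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",  # spoofed as a browser: passes
        "",                                           # header removed: passes
    ]
    for ua in samples:
        print(repr(ua), keep_away(ua))
```

    Because every pattern is anchored with ^, a downloader name that appears later in the header does not match either. Dropping the anchor catches those cases but still does nothing against a fully spoofed header.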

    Preventing Web Site Downloading Using User Agent Blocking in PHP

    If the pages you are trying to protect are generated by PHP, you may be interested in the user agent blocking technique described in Deny Spambots and Prevent Email Harvesting.
