Searching a Site Based on Folders


  • Searching a Site Based on Folders

    I am racking my brain trying to figure out why Zoom won't search through my site. I have tried indexing it in offline mode and also crawling the live site on the internet. The site is set up with many separate folders, each holding its own files, rather than one big list of HTML files. It is a typical way to lay out a site. No big deal. The folders are named like "/home/", "/guestbook/", "/downloads/", etc. Everything is properly linked together (with relative rather than absolute links). I have had plenty of experience indexing sites with Zoom, so this one is puzzling me. It keeps giving me errors like "Skipping http://subdomain.mywebsite.com/new (External site - does not match base URL)". Please help!
    Last edited by Guest; Oct-09-2006, 12:38 AM.

  • #2
    That skip message means that the URL being indexed does not match the base URL specified. This happens in Spider Mode when the Base URL points to a different domain from the URL being indexed.

    For example, you may be indexing a page such as "http://www.mysite.com/mypage.html" with a base URL of "http://www.mysite.com/"

    This means that only links matching that base URL will be considered part of the site. It will not automatically follow links to every other domain name (or subdomain) because they would typically be other sites (and doing so would mean the spider would end up indexing the rest of the Internet).

    If you have links to a subdomain which you wish to index as part of the same site, you should specify multiple base URLs separated by a semicolon. For example, "http://www.mywebsite.com/;http://subdomain.mywebsite.com/"
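
    To illustrate the matching rule described above, here is a minimal Python sketch. It is not Zoom's actual code: the function names `parse_base_urls` and `matches_base_url` are hypothetical, and the exact comparison Zoom performs (case handling, ports, and so on) is an assumption. It only shows why a subdomain URL is skipped with a single base URL and accepted once a second base URL is added after a semicolon.

```python
from urllib.parse import urlsplit


def parse_base_urls(setting):
    """Split a semicolon-separated base URL setting into a list of URLs."""
    return [part.strip() for part in setting.split(";") if part.strip()]


def matches_base_url(url, base_urls):
    """Return True if url has the same scheme and host as one of the base
    URLs and its path starts with that base URL's path.

    Assumption: this is a simplified stand-in for the spider's real check.
    """
    target = urlsplit(url)
    for base in base_urls:
        b = urlsplit(base)
        same_site = (
            target.scheme.lower() == b.scheme.lower()
            and target.netloc.lower() == b.netloc.lower()
        )
        if same_site and target.path.startswith(b.path):
            return True
    return False


link = "http://subdomain.mywebsite.com/new"

# One base URL: the subdomain link is treated as an external site.
single = parse_base_urls("http://www.mywebsite.com/")
print(matches_base_url(link, single))

# Two base URLs separated by a semicolon: the subdomain is now in scope.
both = parse_base_urls("http://www.mywebsite.com/;http://subdomain.mywebsite.com/")
print(matches_base_url(link, both))
```

    Folders such as "/guestbook/" under the same host match either way, which is why the folder layout by itself would not trigger the skip message.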

    Information on multiple base URLs and indexing subdomains can be found in chapter 2.1.6, "Base URL", of the Users Guide here:
    http://www.wrensoft.com/zoom/usersguide.html
    --Ray
    Wrensoft Web Software
    Sydney, Australia
    Zoom Search Engine



    • #3
      I don't believe you addressed my problem. I focused on the folders issue. Most of the folders have an index.html file and are linked to other folders.



      • #4
        I addressed the most likely cause of the problem from the information you have given me.

        The only useful information you gave me was the skip message. Did you check if the problem is what I suggested? If you have, and have reasons to show that this is not the case, then do tell us and let us know why.

        All you have told me is that you have folders, and you have index files for these folders and they are linked to other folders. There is nothing unusual about that. That's essentially every website out there. Zoom should have no problem indexing a website because of that. The most likely problem, given your skip message, is what I suggested, so I recommend you check that and let us know if you have reasons to believe that is not the case.

        Also, it would be helpful to have the exact message given (with the real URL intact) as well as the start and base URL you are using. E-mail us the actual ZCFG file with your indexing configuration if you are lost.
        --Ray
        Wrensoft Web Software
        Sydney, Australia
        Zoom Search Engine



        • #5
          I got it to work, finally, using strictly the subdomain and spidering the site on the server, rather than doing it locally. I did try that before too, but it didn't work for some reason. Thanks for your time!
