sitemap.xml with MP3 and MP4

  • sitemap.xml with MP3 and MP4

    Hello,

    I bought Zoom Pro on Friday and I am facing some difficulties. Is it possible to add a sitemap.xml to Zoom, so that the pages it lists appear on the results page? How can I index those pages? The sites that I am indexing have a robots.txt file. Is it possible to use their sitemap.xml? How does Zoom work with that?

    Greetings from the Netherlands

  • #2
    You can import a list of pages into Zoom for indexing.

    You can choose to use (or not) the robots.txt file in the Zoom configuration.

    What is the problem with the sites you are indexing that is making you want to use the sitemap file rather than spidering the site as per normal? Maybe there is a better solution.
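    For readers who do want to start from a sitemap, here is a minimal Python sketch that turns a sitemap.xml into a plain list of URLs. It is not part of Zoom; it assumes the sitemap follows the sitemaps.org schema and that the page list you import is a text file with one URL per line (check the Zoom documentation for the exact import format). The example.com URLs are placeholders.

```python
import xml.etree.ElementTree as ET

# XML namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_urls(xml_text):
    """Return the <loc> URLs found in a sitemap.xml document, in document order."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]


# Hypothetical sitemap, used only for illustration.
EXAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/video.php?id=1</loc></url>
  <url><loc>http://www.example.com/video.php?id=2</loc></url>
</urlset>"""

if __name__ == "__main__":
    # Print one URL per line (an assumed layout for the imported page list).
    print("\n".join(sitemap_urls(EXAMPLE)))
```

    Redirect the output to a text file and use that file as the list of pages to import.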


    • #3
      Thank you for your fast reply. I am building a mobile video search engine. Clicking a link takes you to a page on a mobile video site where you can choose mp4, 3gp, etc. (not directly to the video). Indexing with robots.txt enabled results in the pages being skipped (indexing is not possible at all). Indexing with robots.txt disabled results in: "Suspected invalid html (possible unmatched quote characters) on page". Other pages are blocked by the extensions list. The pages are PHP.

      greetings,
      Last edited by Robinsonnl; Sep-06-2010, 08:49 AM.


      • #4
        I would need a URL to the site in question (and the start spider URL you are using) to know for sure. But:

        (1) Look at the robots.txt file of the site and see if perhaps it is actually telling the spider to go away and not index the site. This will explain the first problem.

        (2) The page might actually contain broken HTML. Enter it into an HTML validation service (such as http://validator.w3.org/) and it should tell you if it is broken. If so, this would explain the warning Zoom gives you.

        (3) "Pages are blocked by extension list" means you don't have the extension entered in the Zoom scan extensions list (under "Configure"->"Scan options"). Even if they are PHP pages, do they serve different files? If, for example, "download.php" serves a .mp4 file, then you will need to enable indexing of MP4 files for it to work.
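        As a quick way to test point (1), the sketch below uses Python's standard urllib.robotparser to ask whether a given robots.txt allows a given URL. It is independent of Zoom: the user agent name "ExampleSpider" and the example.com URL are placeholders, and the robots.txt text is pasted in rather than downloaded.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_lines, user_agent, url):
    """Return True if the robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_lines)  # takes the robots.txt content as a list of lines
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt that tells every spider to stay away.
BLOCK_ALL = """User-agent: *
Disallow: /
""".splitlines()

if __name__ == "__main__":
    # "ExampleSpider" is a placeholder; substitute the spider's real user agent.
    print(is_allowed(BLOCK_ALL, "ExampleSpider", "http://www.example.com/video.php?id=1"))
```

        If this reports that the URL is not allowed, the site is asking spiders not to index that page, which would explain why everything is skipped when robots.txt support is enabled.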

        Note however, that these are the binary formats supported (and plugins are required for them):
        http://www.wrensoft.com/zoom/plugins.html

        If you look at that list, you'll see that there is no mp4 or 3gp support.

        However, if you add them into the Scan Extensions list, you can configure them to be treated as "Binary files (filename only)" and only their filenames will be indexed (or you can use .desc files to specify the title, description and keywords for them). Is that what you want?
        --Ray
        Wrensoft Web Software
        Sydney, Australia
        Zoom Search Engine


        • #5
          Thank you for your reply. Yes, I know there is no support for mp4. Could you tell me more about indexing those mp4 files as binary? It seems like a good solution. What do you mean by "only their filenames"? Does the search engine link directly to the file? I don't understand.

          Thank you

          Greetings,


          • #6
            It will only index the filenames of the MP4 files, unlike the other supported formats where it will index the contents of the files (e.g. in the case of MP3, it will index the title, song name, album name, duration etc. that is stored within the MP3 file itself).
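            To illustrate what "stored within the MP3 file itself" means, here is a minimal Python sketch that reads an ID3v1 tag, the fixed 128-byte block some MP3 files carry at the very end. This is only an illustration and not how Zoom or its plugins are implemented; many MP3 files use ID3v2 tags at the start of the file instead, and duration is not part of ID3v1. The function name read_id3v1 is made up for this example.

```python
def read_id3v1(data):
    """Parse an ID3v1 tag from the raw bytes of an MP3 file; return None if absent."""
    if len(data) < 128:
        return None
    tag = data[-128:]  # ID3v1 occupies the last 128 bytes of the file
    if tag[:3] != b"TAG":
        return None

    def text(field):
        # Fields are fixed width and padded with NUL bytes or spaces.
        return field.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(tag[3:33]),
        "artist": text(tag[33:63]),
        "album": text(tag[63:93]),
        "year": text(tag[93:97]),
    }
```

            For example, read_id3v1(open("song.mp3", "rb").read()) returns a dictionary of these fields when the file ends with such a tag. An MP4 file handled as "Binary files (filename only)" gets no comparable treatment, so only its name is searchable.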

            The search engine result will link directly to the video file if the video file was indexed and you searched for that filename. But your question makes me wonder whether you actually have an HTML web page for each video and will be indexing these web pages instead. In that case, you probably don't need to index the actual video file.

            Again, without seeing the website, it is very difficult to give meaningful advice because it's hard to know what you are really indexing by your description alone.
            --Ray


            • #7
              Hi!

              Thanks for the help. I have indexed over 5000 pages. It works!
