No search term highlighting for *.PDF files??

  • No search term highlighting for *.PDF files??

    This had me tearing my hair out yesterday until I finally figured it out this morning.

    I have V5 installed on a box running Windows Server 2003 R2, and was having no problems indexing PDF files, with search terms displaying properly in search results. My company installed a new HP N9120 scanner, and while the PDF files it generated indexed just fine and came up properly in search results, none of their URLs had the '#search=' parameter appended. This, of course, meant the search utility could not jump to and highlight the desired terms in the large multi-page PDF documents we needed to reference.

    I had PDFs side by side in the same directory, seemingly identical in every way. The ones produced by our old scanner would come up in the search results with search terms passed through and highlighting working perfectly; those produced by the new scanner would simply open and do nothing more.

    Turns out (and I saw nothing about this in any of the considerable documentation) that V5 apparently does not support highlighting for PDF documents whose file extension is ".PDF" in all upper case. Manually changing all extensions to ".pdf" and reindexing immediately restored all highlighting functionality.
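
    For anyone in the same spot, the manual rename can be scripted. Below is a minimal Python sketch, not part of Zoom: the function name lowercase_pdf_extensions and the folder passed to it are hypothetical, and a re-index is still needed afterwards.

```python
from pathlib import Path


def lowercase_pdf_extensions(root):
    """Rename files under root whose extension is .PDF (any casing) to .pdf.

    Returns the list of new paths. Hypothetical helper, not part of Zoom.
    """
    renamed = []
    # Materialise the listing first so renames do not disturb the walk.
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        # Skip non-PDFs and files that already end in lower-case .pdf.
        if path.suffix.lower() != ".pdf" or path.suffix == ".pdf":
            continue
        target = path.with_suffix(".pdf")
        # On a case-insensitive file system the target "exists" but is the
        # same file; only skip when a genuinely different file has that name.
        if target.exists() and not target.samefile(path):
            continue
        path.rename(target)
        renamed.append(target)
    return renamed
```

    Calling lowercase_pdf_extensions on the scanner's output folder before each re-index would keep the extensions in the form the V5 script expects.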

    Since I cannot set my new $4,000 HP scanner to generate PDF files with the file extensions in lower case, I'm kinda stuck. Is there a work-around/patch available, or perhaps an updated PDF plug-in?
    Last edited by dpeters30; Dec-09-2009, 03:13 PM.

  • #2
    The highlighting isn't related to the plugin, which only extracts the text.
    The appending of the highlighting tag is done by the search script.
    Which search script option are you using? (PHP, ASP, CGI, JS)


    • #3
      Originally posted by wrensoft:
      The highlighting isn't related to the plugin, which only extracts the text.
      The appending of the highlighting tag is done by the search script.
      Which search script option are you using? (PHP, ASP, CGI, JS)

      I'm using ASP. Sorry, I meant to mention that in the original post, but forgot.


      • #4
        I can see, starting at lines 1810 and 1884 of the 'search.asp' file, exactly where the URL parameter gets appended. Unfortunately, this file appears to be overwritten at each re-index, so I can't just tweak the code to include the uppercase version of the file extension...


        • #5
          In the advanced configuration window in Zoom, you can specify the location of a custom script.

          We will also have a look at the issue with a view to fixing it in the ASP script.


          • #6
            I noticed that option as I was wrapping up at the office today, thinking I would explore it more tomorrow. If I can save a modified script that the application will pull from when re-indexing then I think we're good. You guys seem to have thought of everything!!


            • #7
              Fixed!

              Well, I did what I probably should have done in the first place and read the User's Guide - section 7.4 outlines how to edit the search script, which I did this morning.

              I just copied the original nested IF...THEN statements that append the URL parameter (there were two of them, as stated earlier in this thread), inserted the copied code under the original statements, and changed the lower case '.pdf' to uppercase '.PDF'. This way I have IF...THEN statements handling both file extensions, .pdf and .PDF. Works like a charm!


              • #8
                Glad to hear you've fixed the problem on your end.

                This is really a bug in our ASP script though (it should add the "#search=" highlighting parameter regardless of the upper/lower-casing of the ".pdf" extension), and we will fix it in the next build release (V6.0 build 1020).
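
                The case-insensitive behaviour described above can be sketched roughly as follows. This is an illustration in Python, not the actual ASP script: add_pdf_highlight is a hypothetical name, and percent-encoding the search terms is an assumption, since this thread does not show the exact value the script writes after "#search=".

```python
from urllib.parse import quote


def add_pdf_highlight(url, terms):
    """Append '#search=<terms>' when url ends in .pdf, ignoring letter case.

    Hypothetical helper for illustration; not taken from search.asp.
    """
    # Compare a lower-cased copy so .pdf, .PDF and .Pdf are all matched,
    # while the original URL is left exactly as it was.
    if not url.lower().endswith(".pdf"):
        return url
    # Assumption: the terms are percent-encoded before being appended.
    return url + "#search=" + quote(terms)
```

                With this, add_pdf_highlight("docs/scan.PDF", "invoice") returns "docs/scan.PDF#search=invoice", while a non-PDF URL comes back unchanged. A single comparison on the lower-cased name covers every casing, so no second copy of the IF...THEN branch is needed.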
                --Ray
                Wrensoft Web Software
                Sydney, Australia
                Zoom Search Engine
