
CGI/XML - Narrow Down by Category


  • CGI/XML - Narrow Down by Category

    Is it possible to get the XML output to list the categories that contain matches, along with the count of matches in each (basically, the results you get with non-XML output)?

    I really think the XML output should have as much in common with the other output types as possible - but I'm finding it limiting in other areas as well (e.g. search word highlighting) when it should be the most flexible option available.

  • #2
    Perhaps you have a different expectation of XML output than what we had in mind. XML output is commonly designed to be the "raw data" of the search results returned. It should be without any presentation logic (such as highlighting). The idea is to return the unformatted data as minimally, and in as generic a format as possible to allow post-processing and/or transformation of the data for presentation or further usages.

    To return all the presentation options available in HTML somewhat defeats this purpose. For example, with the summary messages "x results found, y containing all words, ...etc.", you can determine and create your own messages based on the information given regarding total number of pages found, etc. Similarly with highlighting.

    I do agree that the category summary information ("x results found in category A") is something we could add to the XML output. It's debatable whether this is presentation logic or part of the results data, but I can see that it's more work for the post-processing stage to re-calculate and less work for us to include (by "work" I mean processing time). Then again, it would have to be provided in a non-standard XML format (i.e. not part of OpenSearch, as far as I can recall), which limits "generic" use.

    XML should be the most flexible option because of its minimalism, and that it allows you to reformat the data as necessary, and calculate/determine only the information you need. I personally would not expect it to do the most work and return the most information.
    --Ray
    Wrensoft Web Software
    Sydney, Australia
    Zoom Search Engine


    • #3
      Yes - I agree with you. I just feel like there is not enough raw data in the xml to be able to accomplish these tasks that are done on non-xml outputs.

      The 2 issues I've outlined and why I don't feel like I have enough data in the xml feed:

      1) Highlighting - The XML returns the search string, which is great - I can do my own search/replace to add highlighting. But what if a result is found via a synonym or stemming? Or if they search for multiple words and a result contains just one of them? I can't tell that from the feed, so I won't be able to highlight the synonym/stem appropriately. If the highlighting were already in the results portion, I wouldn't have to worry about any of that. Alternatively, the feed could tell me in some way what to look for in order to highlight properly. (Each result item would need this information, since each could be found via a different stem/synonym, or contain a single word when the search had several.)

      2) As far as showing the results per category - I would be fine with it being in XML format, i.e. something like: <list><list_item><category>Category A</category><results_in_category>25</results_in_category></list_item></list>
      As long as this data is there, I'm fine doing my own processing of it. But it needs to be there for me to do anything.
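
For illustration, post-processing a category list of that shape could look roughly like the sketch below. The element names (`list`, `list_item`, `category`, `results_in_category`) are only the ones suggested in this post, not Zoom's actual XML output, and `category_counts` and `summary_lines` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment in the shape proposed in this post; not Zoom's real schema.
SAMPLE = """
<list>
  <list_item><category>Category A</category><results_in_category>25</results_in_category></list_item>
  <list_item><category>Category B</category><results_in_category>7</results_in_category></list_item>
</list>
"""


def category_counts(xml_text):
    """Return {category name: match count} from the proposed <list> structure."""
    root = ET.fromstring(xml_text.strip())
    counts = {}
    for item in root.iter("list_item"):
        name = item.findtext("category", default="").strip()
        count = item.findtext("results_in_category", default="0").strip()
        counts[name] = int(count)
    return counts


def summary_lines(counts):
    """Build the 'x results found in category A' style messages."""
    return [f"{n} results found in {name}" for name, n in counts.items()]


if __name__ == "__main__":
    for line in summary_lines(category_counts(SAMPLE)):
        print(line)
```

With the counts in the feed, the presentation wording stays entirely on the post-processing side.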

      When I decided to switch to CGI/XML, I was under the impression that I could re-create via post-processing anything that is done with PHP or CGI non-XML output. That is all I'm looking for - the ability to create output similar to what the non-XML versions produce.
      Last edited by danf; Apr-13-2009, 09:06 PM.


      • #4
        We're gonna have a look at the two things above and what can be done.

        With regards to the highlighting, at best, we'd have to make it a separate option from HTML highlighting as there are already users and systems out there expecting highlighting in HTML output, and no highlighting tags in XML output. I know you've been on a somewhat one-man crusade for this feature so hopefully we can do something to help.
        --Ray


        • #5
          Thanks Ray - greatly appreciated. I'm actually surprised that others using XML would not be asking for these features.

          Anyone else out there have my back?

          Especially the highlighting. I find it very annoying when you just see the snippet and it doesn't highlight the word. I've implemented my own highlighting based on the search word, but just today I ran into a problem: I searched for 'coding' on my site and got a result that didn't have any highlights. It took me a while to see that it was returned because of stemming, which had matched 'code'. There is no way for me to properly highlight that unless I build my own stemming algorithm - which I obviously don't want to do - that's why I purchased Zoom!
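
To make the limitation concrete, here is a minimal sketch of search-word highlighting done in post-processing. It is not my actual code and not anything Zoom provides; the `highlight` function and the `<b>` tag are assumptions for illustration. It only wraps whole-word, case-insensitive matches of the literal query words.

```python
import html
import re


def highlight(snippet, query, tag="b"):
    """Wrap whole-word, case-insensitive matches of each query word in <tag>.

    Only the literal query words are matched; stems and synonyms are not known
    to this code, which is exactly the gap described in this post.
    """
    words = [re.escape(w) for w in query.split()]
    if not words:
        return html.escape(snippet)
    pattern = re.compile(r"\b(" + "|".join(words) + r")\b", re.IGNORECASE)
    # With one capture group, split() alternates: plain text, match, plain text...
    parts = pattern.split(snippet)
    out = []
    for i, part in enumerate(parts):
        text = html.escape(part)
        out.append(f"<{tag}>{text}</{tag}>" if i % 2 else text)
    return "".join(out)


if __name__ == "__main__":
    print(highlight("Tips on coding style", "coding"))
    print(highlight("Clean code matters", "coding"))
```

The first snippet gets its match wrapped, while the second comes back without any tags even though the engine returned it via stemming, because nothing in the feed says that 'code' is the term to highlight.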


          • #6
            Category summary is now provided in the XML output as of V6.0 build 1013. You can download the latest from here:
            http://www.wrensoft.com/zoom/whatsnew.html

            Highlighting in XML is still to come.
            --Ray


            • #7
              That's great news!

              Thanks
