Hi Natalia,

Thanks a lot for your reply and for making this clear to me. I thought the 
default behaviour for functions such as contains(), when receiving a set of 
text nodes as an argument, was to apply the processing to each node separately 
and in turn return a set of booleans. Seeing as this is wrong, I perfectly get 
your point. As it appears, the online tool I used to check my claim made the 
same incorrect assumption.

Again thanks and regards,

David



----- Original Message ----
From: Natalia Shilenkova <nshilenk...@gmail.com>
To: xindice-users@xml.apache.org
Sent: Thu, October 22, 2009 12:13:55 AM
Subject: Re: Incorrect "No result for query" for XPath expression

David,

The problem you're describing is not a bug; your XPath query is executed 
correctly.

Let's see what happens when the query 
/martif/text/body/termEntry[contains(langSet/ntig/termGrp/term/text(),'bancaire')]
 is executed. First, XPath finds all nodes with the path 
/martif/text/body/termEntry/langSet/ntig/termGrp/term and selects their 
child text nodes. The result of this step is a node-set that includes the 
<term> data for every language. Then XPath evaluates the function contains(), 
whose first argument is that node-set. Per the XPath specification [1], 
contains() expects two arguments of type string, not node-set, so it converts 
the first argument to a string using the function string(). When applied to a 
node-set, string() returns the string-value of the _first_ node in document 
order.
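
To make the conversion rule concrete, here is a minimal Python sketch. It does 
not run an XPath engine; it only emulates the string() rule by hand, using the 
standard library's ElementTree to collect the <term> texts in document order. 
The trimmed-down sample document (no namespace, two languages) and the helper 
name xpath_string are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed-down, namespace-free stand-in for one termEntry (illustrative only).
DOC = """<termEntry>
  <langSet><ntig><termGrp><term>Bankgarantie</term></termGrp></ntig></langSet>
  <langSet><ntig><termGrp><term>garantie bancaire</term></termGrp></ntig></langSet>
</termEntry>"""


def xpath_string(node_set):
    """Emulate XPath 1.0 string() on a node-set: the string-value of the
    first node in document order, or the empty string for an empty set."""
    return node_set[0] if node_set else ""


entry = ET.fromstring(DOC)
# findall() returns matches in document order, like an XPath node-set.
texts = [t.text or "" for t in entry.findall("langSet/ntig/termGrp/term")]

# contains(node-set, 'bancaire'): only the first node's text is inspected.
first_only = "bancaire" in xpath_string(texts)
# Checking every node separately, which is what the original query intended.
any_node = any("bancaire" in t for t in texts)

print(first_only, any_node)
```

Because the German term comes first in the sample, the first check fails even 
though a later term contains the substring.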

Instead of checking the <term> data for every language, it only checks whether 
the <term> data contains the given string for the language that happens to be 
first in the document. You can easily verify that by rearranging the order of 
the langSet tags in the document. The query 
/martif/text/body/termEntry[contains(langSet[starts-with(@lang,'fr')]/ntig/termGrp/term/text(),'bancaire')]
 works for the same reason: the contains() function only gets one langSet.

If you want a query that checks all the text nodes to see whether they 
contain some substring, you can try something like this:
/martif/text/body/termEntry[langSet/ntig/termGrp/term[contains(text(),'bancaire')]]
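
The effect of moving contains() into a predicate on <term> can be sketched in 
plain Python as follows. Again this is an emulation rather than a real XPath 
evaluation: each <term> is tested on its own, and a termEntry is kept if any of 
its terms matches. The sample document, the id attributes, and the function 
name entries_with_term are hypothetical, and each <term> is assumed to hold a 
single text node.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-entry sample; the id attributes exist only to label results.
DOC = """<body>
  <termEntry id="e1">
    <langSet><ntig><termGrp><term>Bankgarantie</term></termGrp></ntig></langSet>
    <langSet><ntig><termGrp><term>garantie bancaire</term></termGrp></ntig></langSet>
  </termEntry>
  <termEntry id="e2">
    <langSet><ntig><termGrp><term>Konto</term></termGrp></ntig></langSet>
    <langSet><ntig><termGrp><term>compte</term></termGrp></ntig></langSet>
  </termEntry>
</body>"""


def entries_with_term(body, needle):
    """Return the ids of termEntry elements in which at least one <term>
    contains the substring, testing every <term> individually."""
    hits = []
    for entry in body.findall("termEntry"):
        terms = entry.findall("langSet/ntig/termGrp/term")
        if any(needle in (t.text or "") for t in terms):
            hits.append(entry.get("id"))
    return hits


body = ET.fromstring(DOC)
matched = entries_with_term(body, "bancaire")
print(matched)
```

Here the French term is second in its entry, yet the entry is still found, 
because no node-set is collapsed to its first node before the test.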

[1] http://www.w3.org/TR/1999/REC-xpath-19991116

Regards,
Natalia


On Oct 21, 2009, at 12:02 PM, David Vergnaud wrote:

> Hi,
> 
> I'm reporting on a problem which I'm pretty much convinced is a bug in the 
> current 1-2.dev version of Xindice (1.2m1). I'm using Xindice running on its 
> own (no Tomcat) as a daemon on a Linux box (Suse 11) with JDK 1.6.
> 
> Basically, I have a DB where I've stored terminology entries that contain 
> information about various banking terms in 4 languages. I want to be able to 
> conduct two types of searches, one where the term is searched for only one of 
> the languages, and one where the search is carried out in all languages. For 
> this, I use two versions of a somewhat complicated XPath expression: one 
> where the language is specified (as attribute of one of the nodes, in a 
> predicate) and one where it isn't. This is the only difference between the 
> two expressions. Surprisingly, the one where the language is fixed does 
> return results where the one without specification doesn't. Besides, I've 
> tested the XPath expression on other systems, and seen that there really 
> should be results.
> 
> The first impression is that when evaluating function arguments inside a 
> predicate, only the first node of a node set is evaluated. In my case, that 
> would be confirmed by the following fact: each entry contains first the 
> German word, then either French or English. When doing an "unrefined" search 
> (no language specification) with a German word, results are returned. When 
> doing the same unrefined search with French or English, no results are 
> returned.
> 
> Here's an example of an XPath we're using, first with the language 
> refinement, then without:
> /martif/text/body/termEntry[contains(langSet[starts-with(@lang,'fr')]/ntig/termGrp/term/text(),'bancaire')]
> /martif/text/body/termEntry[contains(langSet/ntig/termGrp/term/text(),'bancaire')]
> 
> As you can see, the goal is to extract a termEntry element which contains the 
> word "bancaire" under the specified path. In the first path, I set the 
> langSet to have attribute lang start with "fr" (for French), in the second I 
> don't. As I said before, the first expression yields a result and the second 
> one doesn't.
> 
> I'm including an example DB entry which can be used to test this -- I assume 
> it should be possible to observe this behaviour with only one entry in the DB 
> as well. In order to use the xpath above with it, one would need to prefix 
> all node names in the xpath expression with "tbx" (I only removed that for 
> legibility).
> 
> Should this prove to be an error on my side, I'd be grateful to anyone who'd 
> point it out. Otherwise, it might need to be taken onto the Xindice bug list.
> 
> Cheers,
> 
> David
> 
> <?xml version="1.0"?>
> <martif xmlns="http://www.lisa.org/tbx" type="TBX" xml:lang="de-CH">
>  <martifHeader>
>    <fileDesc>
>      <titleStmt>
>        <title>
>          Test-TerminologieDB        </title>
>      </titleStmt>
>      <publicationStmt>
>        <p>
>           Version 1.1        </p>
>      </publicationStmt>
>      <sourceDesc>
>        <p>
>           Version 1.1        </p>
>      </sourceDesc>
>    </fileDesc>
>  </martifHeader>
>  <text>
>    <body>
>      <termEntry>
>        <descrip type="classificationCode" />
>        <descrip type="subjectField">
>        </descrip>
>        <langSet xml:lang="de-CH">
>          <transacGrp>
>            <transac type="transactionType">
>              created            </transac>
>            <transacNote type="responsibility">
>              STEA            </transacNote>
>            <date>
>              2009-09-15T14:44:54.924+02:00            </date>
>          </transacGrp>
>          <descrip type="reliabilityCode">
>            1          </descrip>
>          <note />
>          <descripGrp>
>            <descrip type="definition">
>              Die Garantie ist eine selbstständige, vom Hauptschuldverhältnis 
> unabhängige Verpflichtung. Der Garant (die Bank) kann keinerlei Einwendungen 
> und Einreden aus dem Grundgeschäft erheben. Das heisst: Der Garant zahlt auf 
> erste schriftliche Anforderung (Inanspruchnahme) des Begünstigten, gegen 
> Einreichung der im Garantietext vorgeschriebenen Bestätigung und allenfalls 
> vorgeschriebenen Dokumente.            </descrip>
>            <adminGrp>
>              <admin type="source">
>                CS Glossar              </admin>
>            </adminGrp>
>          </descripGrp>
>          <ntig>
>            <termGrp>
>              <term>
>                Bankgarantie              </term>
>              <termNote type="partOfSpeech" />
>              <termNote type="grammaticalGender" />
>              <termNote type="grammaticalNumber" />
>              <termCompList type="lemma">
>                <termComp />
>              </termCompList>
>              <termCompList type="morphologicalElement">
>                <termComp />
>              </termCompList>
>              <termNote type="termType">
>                main              </termNote>
>              <termNote type="usageNote" />
>            </termGrp>
>            <adminGrp>
>              <admin type="source">
>                CS Glossar              </admin>
>              <note />
>            </adminGrp>
>            <descripGrp>
>              <descrip type="example" />
>              <adminGrp>
>                <admin type="source" />
>              </adminGrp>
>            </descripGrp>
>            <note />
>          </ntig>
>          <ntig>
>            <termGrp>
>              <term />
>              <termNote type="termType">
>                abbr              </termNote>
>            </termGrp>
>            <adminGrp>
>              <admin type="source" />
>            </adminGrp>
>          </ntig>
>          <ntig>
>            <termGrp>
>              <term />
>              <termNote type="termType">
>                syn              </termNote>
>              <termNote type="grammaticalGender" />
>              <termCompList type="lemma">
>                <termComp />
>              </termCompList>
>              <termCompList type="morphologicalElement">
>                <termComp />
>              </termCompList>
>            </termGrp>
>            <adminGrp>
>              <admin type="source" />
>              <note />
>            </adminGrp>
>            <descrip type="example" />
>            <adminGrp>
>              <admin type="source" />
>            </adminGrp>
>            <note />
>          </ntig>
>        </langSet>
>        <langSet xml:lang="en-GB">
>          <transacGrp>
>            <transac type="transactionType">
>              created            </transac>
>            <transacNote type="responsibility">
>              STEA            </transacNote>
>            <date>
>              2009-09-15T14:44:54.924+02:00            </date>
>          </transacGrp>
>          <descrip type="reliabilityCode">
>            1          </descrip>
>          <note />
>          <descripGrp>
>            <descrip type="definition" />
>            <adminGrp>
>              <admin type="source" />
>            </adminGrp>
>          </descripGrp>
>          <ntig>
>            <termGrp>
>              <term>
>                bank guarantee              </term>
>              <termNote type="partOfSpeech" />
>              <termNote type="grammaticalGender" />
>              <termNote type="grammaticalNumber" />
>              <termCompList type="lemma">
>                <termComp />
>              </termCompList>
>              <termCompList type="morphologicalElement">
>                <termComp />
>              </termCompList>
>              <termNote type="termType">
>                main              </termNote>
>              <termNote type="usageNote" />
>            </termGrp>
>            <adminGrp>
>              <admin type="source">
>                CS Glossar              </admin>
>              <note />
>            </adminGrp>
>            <descripGrp>
>              <descrip type="example" />
>              <adminGrp>
>                <admin type="source" />
>              </adminGrp>
>            </descripGrp>
>            <note />
>          </ntig>
>          <ntig>
>            <termGrp>
>              <term />
>              <termNote type="termType">
>                abbr              </termNote>
>            </termGrp>
>            <adminGrp>
>              <admin type="source" />
>            </adminGrp>
>          </ntig>
>          <ntig>
>            <termGrp>
>              <term />
>              <termNote type="termType">
>                syn              </termNote>
>              <termNote type="grammaticalGender" />
>              <termCompList type="lemma">
>                <termComp />
>              </termCompList>
>              <termCompList type="morphologicalElement">
>                <termComp />
>              </termCompList>
>            </termGrp>
>            <adminGrp>
>              <admin type="source" />
>              <note />
>            </adminGrp>
>            <descrip type="example" />
>            <adminGrp>
>              <admin type="source" />
>            </adminGrp>
>            <note />
>          </ntig>
>        </langSet>
>        <langSet xml:lang="fr-CH">
>          <transacGrp>
>            <transac type="transactionType">
>              created            </transac>
>            <transacNote type="responsibility">
>              STEA            </transacNote>
>            <date>
>              2009-09-15T14:44:54.924+02:00            </date>
>          </transacGrp>
>          <descrip type="reliabilityCode">
>            1          </descrip>
>          <note />
>          <descripGrp>
>            <descrip type="definition" />
>            <adminGrp>
>              <admin type="source" />
>            </adminGrp>
>          </descripGrp>
>          <ntig>
>            <termGrp>
>              <term>
>                garantie bancaire              </term>
>              <termNote type="partOfSpeech" />
>              <termNote type="grammaticalGender" />
>              <termNote type="grammaticalNumber" />
>              <termCompList type="lemma">
>                <termComp />
>              </termCompList>
>              <termCompList type="morphologicalElement">
>                <termComp />
>              </termCompList>
>              <termNote type="termType">
>                main              </termNote>
>              <termNote type="usageNote" />
>            </termGrp>
>            <adminGrp>
>              <admin type="source">
>                CS Glossar              </admin>
>              <note />
>            </adminGrp>
>            <descripGrp>
>              <descrip type="example" />
>              <adminGrp>
>                <admin type="source" />
>              </adminGrp>
>            </descripGrp>
>            <note />
>          </ntig>
>          <ntig>
>            <termGrp>
>              <term />
>              <termNote type="termType">
>                abbr              </termNote>
>            </termGrp>
>            <adminGrp>
>              <admin type="source" />
>            </adminGrp>
>          </ntig>
>          <ntig>
>            <termGrp>
>              <term />
>              <termNote type="termType">
>                syn              </termNote>
>              <termNote type="grammaticalGender" />
>              <termCompList type="lemma">
>                <termComp />
>              </termCompList>
>              <termCompList type="morphologicalElement">
>                <termComp />
>              </termCompList>
>            </termGrp>
>            <adminGrp>
>              <admin type="source" />
>              <note />
>            </adminGrp>
>            <descrip type="example" />
>            <adminGrp>
>              <admin type="source" />
>            </adminGrp>
>            <note />
>          </ntig>
>        </langSet>
>        <langSet xml:lang="it-CH">
>          <transacGrp>
>            <transac type="transactionType">
>              created            </transac>
>            <transacNote type="responsibility">
>              STEA            </transacNote>
>            <date>
>              2009-09-15T14:44:54.924+02:00            </date>
>          </transacGrp>
>          <descrip type="reliabilityCode">
>            1          </descrip>
>          <note />
>          <descripGrp>
>            <descrip type="definition" />
>            <adminGrp>
>              <admin type="source" />
>            </adminGrp>
>          </descripGrp>
>          <ntig>
>            <termGrp>
>              <term>
>                garanzia bancaria              </term>
>              <termNote type="partOfSpeech" />
>              <termNote type="grammaticalGender" />
>              <termNote type="grammaticalNumber" />
>              <termCompList type="lemma">
>                <termComp />
>              </termCompList>
>              <termCompList type="morphologicalElement">
>                <termComp />
>              </termCompList>
>              <termNote type="termType">
>                main              </termNote>
>              <termNote type="usageNote" />
>            </termGrp>
>            <adminGrp>
>              <admin type="source">
>                CS Glossar              </admin>
>              <note />
>            </adminGrp>
>            <descripGrp>
>              <descrip type="example" />
>              <adminGrp>
>                <admin type="source" />
>              </adminGrp>
>            </descripGrp>
>            <note />
>          </ntig>
>          <ntig>
>            <termGrp>
>              <term />
>              <termNote type="termType">
>                abbr              </termNote>
>            </termGrp>
>            <adminGrp>
>              <admin type="source" />
>            </adminGrp>
>          </ntig>
>          <ntig>
>            <termGrp>
>              <term />
>              <termNote type="termType">
>                syn              </termNote>
>              <termNote type="grammaticalGender" />
>              <termCompList type="lemma">
>                <termComp />
>              </termCompList>
>              <termCompList type="morphologicalElement">
>                <termComp />
>              </termCompList>
>            </termGrp>
>            <adminGrp>
>              <admin type="source" />
>              <note />
>            </adminGrp>
>            <descrip type="example" />
>            <adminGrp>
>              <admin type="source" />
>            </adminGrp>
>            <note />
>          </ntig>
>        </langSet>
>      </termEntry>
>    </body>
>  </text>
> </martif>
> 
> __________________________________________________
> Do You Yahoo!?
> Tired of spam?  Yahoo! Mail has the best spam protection around
> http://mail.yahoo.com
