Question

I want to query a XOM document for nodes that contain a certain value, ignoring case. Something like this:

doc.query('/root/book[contains(.,"case-insensitive-string")]')

But contains() is case sensitive.

  1. I tried to use regexes, but they are only available in XPath 2.0, which XOM does not seem to support.
  2. I tried contains(translate(."ABCEDF...","abcdef..."),"case-insensitive-string")]', but that failed too.
  3. I tried to match subnodes and read the parent's attributes using getParent, but I could not find a method to read a parent's attributes.

Any suggestions?

Solution

If you are using XOM, you can use Saxon to run XPath or XQuery against it. That gives you the much larger function library of XPath 2.0, which includes lower-case() and upper-case(), so the predicate can be written as contains(lower-case(.), lower-case("case-insensitive-string")). It also lets you choose your own collation (though in a somewhat product-specific way) for functions such as contains(), which means you can do matching that ignores accents as well as case, for example.

OTHER TIPS

2. I tried contains(translate(."ABCEDF...","abcdef..."),"case-insensitive-string")]', but that failed too.

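That attempt is missing the comma after the first argument of translate(), its upper-case alphabet is out of order ("ABCEDF"), and the search string itself is never converted to lower case. The proper way to write this is: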

/root/book[contains(translate(., $vUpper, $vLower),
                    translate($vCaseInsentiveString, $vUpper, $vLower)
                    )
          ]

where $vUpper and $vLower are defined as (or should be replaced by) the strings:

'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

and

'abcdefghijklmnopqrstuvwxyz'

and $vCaseInsentiveString is defined as (or should be replaced by) the specific string to search for, in any case.

For example, given the following XML document:

<authors>
  <author>
    <name>Victor Hugo &amp; Co.</name>
    <nationality>French</nationality>
  </author>
  <author period="classical" category="children">
    <name>J.K.Rollings</name>
    <nationality>British</nationality>
  </author>
  <author period="classical">
    <name>Sophocles</name>
    <nationality>Greek</nationality>
  </author>
  <author>
    <name>Leo Tolstoy</name>
    <nationality>Russian</nationality>
  </author>
  <author>
    <name>Alexander Pushkin</name>
    <nationality>Russian</nationality>
  </author>
  <author period="classical">
    <name>Plato</name>
    <nationality>Greek</nationality>
  </author>
</authors>

the following XPath expression (replace the variables with the corresponding strings):

   /*/author/name
              [contains(translate(., $vUpper, $vLower),
                        translate('lEo', $vUpper, $vLower)
                        )
              ]

selects this element:

<name>Leo Tolstoy</name>

Explanation: Both arguments of the contains() function are converted to lower case before the comparison is performed.
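
To try the expression outside XOM, here is a minimal sketch that assumes the JDK's built-in XPath 1.0 engine (javax.xml.xpath) rather than XOM itself. The class name, method name and shortened sample document are made up for illustration. The JDK API can bind $vUpper, $vLower and $vCaseInsentiveString through a variable resolver; with XOM's query() you would most likely paste the literal strings into the expression instead, as described above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class, not part of XOM or of the original answer.
public class CaseInsensitiveContains {

    // The expression from the answer, with the three variables left in place.
    static final String EXPR =
        "/*/author/name[contains(translate(., $vUpper, $vLower), "
      + "translate($vCaseInsentiveString, $vUpper, $vLower))]";

    // Shortened version of the sample document (assumption: three authors only).
    static final String SAMPLE =
        "<authors>"
      + "<author><name>Victor Hugo &amp; Co.</name></author>"
      + "<author><name>Sophocles</name></author>"
      + "<author><name>Leo Tolstoy</name></author>"
      + "</authors>";

    // Returns the text of every <name> that contains needle, ignoring ASCII case.
    static List<String> matchingNames(String xml, String needle) throws Exception {
        Map<String, Object> vars = Map.of(
            "vUpper", "ABCDEFGHIJKLMNOPQRSTUVWXYZ",
            "vLower", "abcdefghijklmnopqrstuvwxyz",
            "vCaseInsentiveString", needle);

        XPath xpath = XPathFactory.newInstance().newXPath();
        // Bind the XPath variables by local name.
        xpath.setXPathVariableResolver(name -> vars.get(name.getLocalPart()));

        NodeList nodes = (NodeList) xpath.evaluate(
            EXPR, new InputSource(new StringReader(xml)), XPathConstants.NODESET);

        List<String> names = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            names.add(nodes.item(i).getTextContent());
        }
        return names;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(matchingNames(SAMPLE, "lEo")); // prints [Leo Tolstoy]
    }
}
```

Note that translate() with these two alphabets only folds the 26 ASCII letters; accented or non-Latin characters are compared as they are.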

Licensed under: CC-BY-SA with attribution