Re: XML Query Use Case TEXT from Jonathan Robie on 2001-06-21 (www-xml-query-comments@w3.org from June 2001)

From: Jonathan Robie <Jonathan.Robie@SoftwareAG-USA.com>
Date: Thu, 21 Jun 2001 17:28:10 -0400
To: www-xml-query-comments@w3.org
Message-Id: <5.1.0.14.0.20010621172505.01b51610@pop.mindspring.com>

Dear Michael,

This is a response to the following message, which you posted to the XML
Query Working Group's comments list:

http://lists.w3.org/Archives/Public/www-xml-query-comments/2001Apr/0049.html

Here are our responses:

  > contains

You are right, the contains function in XPath does not mention case
sensitivity. This function is underspecified, and we will raise an issue with
the Joint Task Force on Functions and Operators, in which XQuery and XSL
participate.

We will also need to consider full-text search operators for contains. It
is also possible that we will decide that we need more than one function
for this range of functionality.
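
To make the case-sensitivity question concrete, here is a minimal Python
sketch of the two behaviours a substring test could have. The helper name
and its ignore_case parameter are illustrative assumptions, not part of
XPath or of any proposal from the Working Group.

```python
def contains(haystack: str, needle: str, ignore_case: bool = False) -> bool:
    """Substring test: exact character comparison unless ignore_case is set.

    Hypothetical helper used only to contrast the two possible readings
    of an underspecified contains(); it is not the XPath function.
    """
    if ignore_case:
        # casefold() is Python's locale-independent case normalisation.
        return needle.casefold() in haystack.casefold()
    return needle in haystack


title = "Foobar Corporation releases its new line of Foo products today"

print(contains(title, "Foobar Corporation"))                    # exact match
print(contains(title, "foobar corporation"))                    # case-sensitive reading
print(contains(title, "foobar corporation", ignore_case=True))  # case-insensitive reading
```

Under the case-sensitive reading the second call finds nothing, while the
third call matches. That difference is the kind of result a more precise
definition would have to settle.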

  > 1.6.3 Sample Data

I have fixed that spelling now.

  > Q1

In the current release of the document, we give this query and solution:


Solution in XQuery:


//news_item/title[contains(./text(), "Foobar Corporation")]

Expected Results


<title>Foobar Corporation releases its new line of Foo products today</title>
<title>Foobar Corporation is suing Gorilla Corporation for patent
infringement </title>
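
As a rough cross-check of Q1, the same selection can be sketched in Python.
This is an illustration under assumptions: the wrapper element <news> and
the third, non-matching item are made up for the example, and the filter is
done in Python because ElementTree's XPath subset has no contains()
function.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature document. The first two titles come from the
# expected results; the root element and the third item are invented.
doc = ET.fromstring("""
<news>
  <news_item><title>Foobar Corporation releases its new line of Foo products today</title></news_item>
  <news_item><title>Foobar Corporation is suing Gorilla Corporation for patent infringement</title></news_item>
  <news_item><title>Gorilla Corporation opens a new office</title></news_item>
</news>
""")

# Analogue of //news_item/title[contains(./text(), "Foobar Corporation")].
# title.text is the element's leading text node, which is the whole text
# here because these titles have no child elements.
matches = [
    title
    for title in doc.findall(".//news_item/title")
    if "Foobar Corporation" in (title.text or "")
]

for title in matches:
    print(ET.tostring(title, encoding="unicode").strip())
```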

 >1.6.4.3 Q3 & 1.6.4.6 Q6
 >
 >contains_in_same_sentence() and contains_stems_in_same_sentence():
 >These seem pretty ad hoc to be built-in functions, so are they
 > supposed to be user-defined functions for which the definition
 > hasn't been written yet?

Yes, these are indeed rather ad hoc. I am adding editorial notes that
point this out.

 >(1)
 >The Solution in XQuery uses
 >     string( ($item//par)[1] )
 >but this doesn't reproduce the <quote> element in the Expected Result.
 >Instead, use
 >     ($item//par)[1]/node()

Although the <quote/> element itself is not reproduced, the text of the
element is. I do not believe the query results are wrong here.
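
The difference between the two forms can be illustrated with a short Python
sketch. The paragraph below is a made-up example, not the use case's sample
data, and the two expressions are only approximate analogues of string()
and node().

```python
import xml.etree.ElementTree as ET

# Hypothetical paragraph containing a <quote> child element.
par = ET.fromstring("<par>The CEO said <quote>we will win</quote> today.</par>")

# Rough analogue of string($par): the text of all descendants is
# concatenated, so the quoted words survive but the markup does not.
string_value = "".join(par.itertext())

# Rough analogue of $par/node(): the child nodes themselves are kept,
# so the <quote> element is reproduced along with the surrounding text.
# ET.tostring() on a child also emits the text that follows it (its tail).
child_nodes = (par.text or "") + "".join(
    ET.tostring(child, encoding="unicode") for child in par
)

print(string_value)
print(child_nodes)
```

The first line printed carries the quoted text without the element, and the
second keeps the element.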

 >(2)
 >The Expected Result's whitespace doesn't match that of the news document.

I have corrected the whitespace in most of the use cases, but in this
particular use case, the correct whitespace makes the results very hard to 
read. Therefore, I added an editorial note that mentions this.

 >Pulling all these together, I suggest:


I have put together parts of your query and parts of the earlier use case
query to produce the following:

     LET $companies := distinct(
             document("data/text-data.xml")//company/name/text()
         UNION document("data/text-data.xml")//company//partner/text()
         UNION document("data/text-data.xml")//company//competitor/text())

     FOR $item IN //news_item
     LET $places := $item/title UNION $item//par
     WHERE
         SOME $c1 IN $companies SATISFIES
           SOME $c2 IN $companies SATISFIES
             ( $c1 != $c2 AND
               contains_stems_in_same_sentence(
                  $places/text(), $c1, $c2, "acquire") )
     RETURN $item
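
Since contains_stems_in_same_sentence() has no written definition, the
following Python sketch shows one way such a user-defined function might
behave. Everything in it is an assumption made for illustration: the naive
sentence splitting on ., ! and ?, the crude suffix-stripping stemmer, the
phrase matching for multi-word company names, and the use of a single
string where the query passes a sequence of text nodes.

```python
import re


def stem(word: str) -> str:
    """Crude suffix stripper; a hypothetical stand-in for a real stemmer."""
    word = word.lower()
    for suffix in ("ing", "ed", "es", "s"):
        # Only strip when at least three characters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    # Drop a trailing "e" so "acquire" and "acquired" share a stem.
    return word[:-1] if word.endswith("e") else word


def stems(text: str) -> list[str]:
    """Stem every alphanumeric token in the text, in order."""
    return [stem(token) for token in re.findall(r"[A-Za-z0-9]+", text)]


def has_phrase(sentence_stems: list[str], phrase_stems: list[str]) -> bool:
    """True if phrase_stems occurs as a consecutive run in sentence_stems."""
    n = len(phrase_stems)
    return n > 0 and any(
        sentence_stems[i : i + n] == phrase_stems
        for i in range(len(sentence_stems) - n + 1)
    )


def contains_stems_in_same_sentence(text: str, *terms: str) -> bool:
    """True if some single sentence of text contains the stems of every term."""
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        sentence_stems = stems(sentence)
        if all(has_phrase(sentence_stems, stems(term)) for term in terms):
            return True
    return False


# Hypothetical sentences, not taken from the sample data.
same = "Foobar Corporation acquired Gorilla Corporation last week. Analysts were surprised."
split = "Foobar Corporation released a product. Gorilla Corporation acquired a startup."

print(contains_stems_in_same_sentence(same, "Foobar Corporation", "Gorilla Corporation", "acquire"))
print(contains_stems_in_same_sentence(split, "Foobar Corporation", "Gorilla Corporation", "acquire"))
```

The first text satisfies the test because both company names and a form of
"acquire" share a sentence. The second does not, because the two names fall
in different sentences.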


We appreciate your feedback on the XML Query specifications. Please let us 
know if this response is satisfactory. If not, please respond to this
message, explaining your concerns.

Jonathan Robie
On behalf of the XML Query Working Group

Received on Thursday, 21 June 2001 17:28:13 UTC