- From: Michael Dyck <MichaelDyck@home.com>
- Date: Sun, 29 Apr 2001 22:36:55 -0700
- To: www-xml-query-comments@w3.org
XML Query Use Cases
W3C Working Draft 15 February 2001

Use Case TEXT

-------------------
1.6.1 vs all queries

    "In this use case, searches for company names are to be
    interpreted as word-based searches. The words in a company name
    may be in any case and may be separated by any kind of white
    space."

The Solutions in XQuery use "contains()", and there's nothing in XPath
section 4.2 to indicate that this function is case-insensitive, or that
it treats every chunk of whitespace as the same.

-----------------
1.6.3 Sample Data

/news/news_item[1]/content/par[1] contains a spelling mistake:
    corparation
should be at least
    corporation
and preferably
    Corporation
This mistake causes queries 3 and 5 to return unexpected results.

----------
1.6.4.1 Q1

The informal query asks for particular news items, but the Solution in
XQuery only yields the titles of those items. So
    //news_item/title[contains(./text(), "Foobar Corporation")]
should be changed to
    //news_item[title[contains(./text(), "Foobar Corporation")]]
or
    //news_item[contains(title/text(), "Foobar Corporation")]

-----------------------
1.6.4.3 Q3 & 1.6.4.6 Q6

contains_in_same_sentence() and contains_stems_in_same_sentence():
These seem pretty ad hoc to be built-in functions, so are they supposed
to be user-defined functions for which the definition hasn't been
written yet?

Also, if "." designates the end of a sentence,
"YouNameItWeIntegrateIt.com" will never be deemed to appear in a
sentence. (This is a problem for Q6.)

----------
1.6.4.5 Q5

(1)
The Solution in XQuery uses
    string( ($item//par)[1] )
but this doesn't reproduce the <quote> element in the Expected Result.
Instead, use
    ($item//par)[1]/node()

(2)
The Expected Result's whitespace doesn't match that of the news
document.

----------
1.6.4.6 Q6

(1)
In the Solution in XQuery, "para" should be "par".

(2)
The construct
    $item_title IN $item/title,
    $item_para IN $item//par
is bad, because if the $item has no par elements (which is allowed by
the DTD), the FOR will "abort", even if it should have found a hit in
the title.

(3)
The construct
    WHERE different_companies AND title_mentions OR para_mentions
needs parentheses:
    WHERE different_companies AND ( title_mentions OR para_mentions )

(4)
The function call
    distinct($item)
is useless, because $item is always just a single node. Instead, you
could pass the result of the whole FLWR expression to distinct(). But
really, you don't need distinct(), because
    FOR $item IN //news_item
generates distinct items.

Pulling all these together, I suggest:

    LET $companies := ...
    FOR $item IN //news_item
    LET $places := $item/title UNION $item//par
    WHERE SOME $c1 IN $companies SATISFIES
          SOME $c2 IN $companies SATISFIES
          ( $c1 != $c2 AND
            contains_stems_in_same_sentence(
                $places/text(), $c1, $c2, "acquire") )
    RETURN $item

(It would be nice if the two quantifications could be written
    SOME $c1 IN $companies, $c2 IN $companies SATISFIES ...
)

-Michael Dyck
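
As a rough illustration of the 1.6.1 point, here is a minimal Python sketch (not XQuery) contrasting a plain substring test, which is how XPath section 4.2 describes contains(), with the word-based matching the use case text asks for. The function names and the \w+ tokenization are assumptions made for this example, not anything defined in the Working Draft.

```python
import re


def contains(haystack: str, needle: str) -> bool:
    """Plain substring test: case-sensitive, whitespace must match exactly."""
    return needle in haystack


def contains_words(haystack: str, phrase: str) -> bool:
    """Hypothetical word-based match: ignores case and treats any run of
    whitespace (or punctuation) between words as a single separator."""
    hay = re.findall(r"\w+", haystack.lower())
    words = re.findall(r"\w+", phrase.lower())
    n = len(words)
    # Slide a window of n words over the tokenized text.
    return any(hay[i:i + n] == words for i in range(len(hay) - n + 1))


text = "Shares of FOOBAR\n   corporation rose today."
print(contains(text, "Foobar Corporation"))        # substring test
print(contains_words(text, "Foobar Corporation"))  # word-based test
```

With the sample string above, the substring test fails on both the case difference and the line break inside the company name, while the word-based test finds it.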
Received on Monday, 30 April 2001 01:38:57 UTC