RE: [F&O] 14.1.5 fn:lang from Ashok Malhotra on 2003-12-03 (public-qt-comments@w3.org from December 2003)

From: Ashok Malhotra <ashokma@microsoft.com>
Date: Wed, 3 Dec 2003 15:17:53 -0800
To: "David Carlisle" <davidc@nag.co.uk>, <public-qt-comments@w3.org>
Message-ID: <EDB607C8AC991F40BE646533A1A673E8C14088@RED-MSG-42.redmond.corp.microsoft.com>

David:
Thank you for your comment.
On the 12-2-2003 telcon the F&O taskforce decided to accept your
recommendations.  We agreed to fix the bug you identified and to
change the text that defined matching to refer to "caseless matching" in
Unicode TR 21.

All the best, Ashok

-----Original Message-----
From: public-qt-comments-request@w3.org
[mailto:public-qt-comments-request@w3.org] On Behalf Of David Carlisle
Sent: Monday, November 17, 2003 7:22 AM
To: public-qt-comments@w3.org
Subject: [F&O] 14.1.5 fn:lang



The informal description has a couple of instances of 
"then returns false." 
which would probably read better as "then the function returns false"
as otherwise there doesn't appear to be a subject.

 
> If $lang is the empty sequence it is interpreted as the zero-length
> string.

This should presumably be $testlang as there is no parameter called
$lang in the function signature.

> (The character "-" is HYPHEN-MINUS, %x002D) 

The syntax %x002D isn't used elsewhere; I suspect that #x002D is meant,
as that is used in other places to denote a character by its hex Unicode
value.

The relevant attribute is unambiguously specified by an XPath expression
but the equality test used is described by a rather contorted (and
perhaps underspecified) prose description. It would be clearer if the
function were specified in its entirety by an XPath expression,

something like

fn:upper-case(fn:replace((ancestor-or-self::*/@xml:lang)[last()],'-.*',''))
=
fn:upper-case($testlang)


Using an explicit expression (specified as using Unicode codepoint
collation) such as the above would make it clearer what is meant by the
current wording "ignoring case".
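
For readers without an XPath 2.0 processor to hand, the comparison in the
expression above can be sketched in Python. This is only an illustration
under stated assumptions: the function name lang_matches is hypothetical,
the caller is assumed to have already found the nearest xml:lang value, and
Python's str.upper() stands in for fn:upper-case.

```python
import re


def lang_matches(xml_lang: str, testlang: str) -> bool:
    """Sketch of the comparison in the proposed XPath expression.

    xml_lang is assumed to be the value of the nearest xml:lang attribute,
    i.e. (ancestor-or-self::*/@xml:lang)[last()], located by the caller.
    """
    # Mirror fn:replace(..., '-.*', ''): drop everything from the first hyphen.
    primary = re.sub(r"-.*", "", xml_lang)
    # Mirror fn:upper-case on both sides, then compare codepoint by codepoint.
    return primary.upper() == testlang.upper()


print(lang_matches("en-US", "en"))
print(lang_matches("fr", "en"))
```

Like the expression it mirrors, the sketch compares only the part of the
xml:lang value before the first hyphen with the whole of $testlang.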

If "ignoring case" means first uppercase then compare, then

dotless i would compare equal to i
and
ess-zed would compare equal to SS

However, if "ignoring case" means first lowercase then compare,
then both of the above comparisons would compare non-equal.

I don't think the current wording defines which of these (or other
possible interpretations) should be used.
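
The difference between these interpretations can be checked with a short
Python sketch. The helper names are hypothetical, and treating Python's
str.casefold() (Unicode full case folding) as a stand-in for caseless
matching is an assumption made for illustration, not something the F&O
text defines.

```python
DOTLESS_I = "\u0131"  # LATIN SMALL LETTER DOTLESS I
ESS_ZED = "\u00df"    # LATIN SMALL LETTER SHARP S


def eq_upper(a: str, b: str) -> bool:
    # "Ignoring case" read as: uppercase both strings, then compare.
    return a.upper() == b.upper()


def eq_lower(a: str, b: str) -> bool:
    # "Ignoring case" read as: lowercase both strings, then compare.
    return a.lower() == b.lower()


def eq_fold(a: str, b: str) -> bool:
    # A third reading: apply Unicode case folding to both, then compare.
    return a.casefold() == b.casefold()


for label, a, b in [("dotless i vs i", DOTLESS_I, "i"),
                    ("ess-zed vs SS", ESS_ZED, "SS")]:
    print(label, eq_upper(a, b), eq_lower(a, b), eq_fold(a, b))
```

Under the uppercase reading both pairs compare equal, under the lowercase
reading neither does, and case folding treats the two pairs differently,
so the choice of interpretation changes the result.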


Alternatively (and perhaps preferably) "ignoring case" should be
replaced by an explicit reference to
1.3 Caseless Matching
in the Unicode case mapping appendix (TR 21).


Note however that the current normative reference to TR 21 in F&O is to
http://www.unicode.org/unicode/reports/tr21/
but that page just says

  Superseded Technical Report

  UAX#21: Case Mappings has been incorporated into the Unicode Standard
  Version 4.0, and is thus now superseded. The last version of that
  document before it was superseded can be found at
  http://www.unicode.org/reports/tr21/tr21-5.html

So it might be better to make the normative reference be Unicode 4, with
a non-normative reference to tr21-5, for the benefit of those of us
without a Unicode 4 book.


David



Received on Wednesday, 3 December 2003 18:19:49 UTC