comments on Character Model for the World Wide Web: String Matching and Searching from Matitiahu Allouche on 2014-06-19 (www-international@w3.org from April to June 2014)

From: Matitiahu Allouche <matitiahu.allouche@gmail.com>
Date: Thu, 19 Jun 2014 18:15:53 +0300
To: <www-international@w3.org>
Message-ID: <0df601cf8bd1$5d7973a0$186c5ae0$@gmail.com>

These are my comments on chapters 3 and 4 of the subject document ( http://www.w3.org/International/docs/charmod-norm/ ).

12) In 3, some requirements (e.g. Req 1) are labelled with [S], [I], [C] and some are not (e.g. Req 2, 3). I think that they should all be labelled with at least one of those marks.

13) In Req 4, the first token "C3xx" seems to be a leftover from something else. In the same paragraph, item 2 mentions character "includes". This term has not been mentioned or explained before, and is not obvious to me.

13) In 3.1.2, "languaguages" should be "languages".

14) Ibidem, "occaisionally" should be "occasionally".

15) Here and there: the document postulates that if a protocol allows user-defined names or identifiers, those tokens must allow non-ASCII characters, and thus ASCII case-insensitive comparison is forbidden. It seems to me that this extends the requirements of this document beyond its scope.

I agree that we (meaning the i18n crowd) like to promote the use of non-ASCII characters everywhere, but it is not within the scope of this document to state whether a given protocol allows such characters in identifiers. If it does not, why should we ban the use of ASCII case-insensitive comparison? We can recommend against it, we can explain that this restricts options for a more liberal character set in the future, but should we really make it non-conformant?
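
To make the distinction in comment 15 concrete, here is a minimal sketch of the two kinds of comparison being discussed. It is only an illustration: the function names are hypothetical and do not come from the subject document, and Python's str.casefold() is used as a stand-in for full Unicode case folding (without any normalization step).

```python
def ascii_casefold(s: str) -> str:
    """Map only A-Z to a-z; every other character is left untouched."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in s)


def ascii_ci_equal(a: str, b: str) -> bool:
    """ASCII case-insensitive comparison (hypothetical helper)."""
    return ascii_casefold(a) == ascii_casefold(b)


def unicode_ci_equal(a: str, b: str) -> bool:
    """Comparison after Unicode case folding, via str.casefold().

    Assumption: no Unicode normalization is applied here, so precomposed
    and decomposed forms of the same text would still compare unequal.
    """
    return a.casefold() == b.casefold()


# Identifiers restricted to ASCII behave the same under both comparisons.
ascii_only = (ascii_ci_equal("Content-Type", "content-type"),
              unicode_ci_equal("Content-Type", "content-type"))

# Identifiers with non-ASCII letters only match under Unicode case folding.
# "\u00c9" is LATIN CAPITAL LETTER E WITH ACUTE, "\u00e9" its lowercase form.
non_ascii = (ascii_ci_equal("\u00c9cole", "\u00e9cole"),
             unicode_ci_equal("\u00c9cole", "\u00e9cole"))
```

If a protocol's identifiers are restricted to ASCII, both functions give the same answer, which is the case where banning ASCII case-insensitive comparison seems hard to justify. The difference only appears once non-ASCII letters are permitted.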

16) When text between two requirements mentions "this requirement", it is not clear whether it refers to the requirement above it or below it. For instance, see the text between Req 14 and Req 15.

17) In 4, "this section addressed" should be "This section addresses".

18) In 4.1, "generate different user expectations" should be "generates…".

19) In 4.1, instead of "they expect their more-specific input to match only what has been input", I suggest "they expect the search results to match closely their more-specific input".

Shalom (Regards), Mati

Received on Thursday, 19 June 2014 15:16:25 UTC