- From: Sandy Gao <sandygao@ca.ibm.com>
- Date: Wed, 19 Sep 2007 14:10:39 -0400
- To: public-sml@w3.org
- Message-ID: <OF00C94099.512BB0BD-ON8525735B.00630212-8525735B.0063D9E0@ca.ibm.com>
This is a simple proposal, and being simple is normally good, but I'll leave it to the URI/IRI gurus to determine whether the simple solution is good enough to cover real-life scenarios.

One thing that worries me is the "case insensitive" part. Why? As far as I can tell, this doesn't match any of the steps in "6.2. Comparison Ladder" of RFC 3986. If we want the simplest possible solution, then we should use what's defined in 6.2.1 and compare strings character-by-character, case-sensitively.

Thanks,
Sandy Gao
XML Technologies, IBM Canada
Editor, W3C XML Schema WG
Member, W3C SML WG
(1-905) 413-3255 T/L 969-3255

Kumar Pandit <kumarp@windows.microsoft.com>
Sent by: public-sml-request@w3.org
2007-09-12 11:02 PM
To: "public-sml@w3.org" <public-sml@w3.org>
cc: Kumar Pandit <kumarp@windows.microsoft.com>
Subject: [w3c sml][4665] Clarify URI equivalence in reference to RFC 3986

Here is my proposal to resolve this issue.

Proposal:
URI equivalence in SML-IF should be defined as case-insensitive simple string comparison, based on codepoint-by-codepoint comparison of the corresponding characters in the URI.

Justification:

1. Performance: Simple string comparison provides the highest performance. Although it is true that two aliases of the same URI may not compare as equal without normalization, the problem does not exist in the specific context of an SML-IF producer. This is because, when a producer is writing out an SML-IF document, it can apply normalizations (if necessary) such that a given URI always appears in the same way. This allows consumers to perform fast string comparison without needing to perform any type of normalization. RFC 3986 section 6.2 (Comparison Ladder) describes many different forms of normalization (syntax-based/case/percent-encoding/path-segment/scheme-based/protocol-based). If we want a consumer to perform normalizations, we not only make the consumer less efficient but also need to add very specific normalization step definitions to the SML-IF spec.
On the other hand, if we leave the burden of normalization to the producer, we can keep the SML-IF spec much simpler and allow consumers to be more efficient. This way the spec does not need to talk about any specific comparison ladder step(s) to be performed by a producer. The producer is free to apply any normalization steps (or none) as long as it knows it will always write a given URI in the same format.

2. Precise definition: RFC 3986 section 6.2.1 (Simple String Comparison) discusses the issues involved in performing a string comparison but does not provide a precise definition of how the comparison must be performed. In other words, it leaves some room for interpretation. We should avoid this by presenting an unambiguous definition based on that discussion.
Received on Wednesday, 19 September 2007 18:10:57 UTC