- From: Ivan Herman <ivan@w3.org>
- Date: Thu, 13 Oct 2011 06:48:54 +0100
- To: John Cowan <cowan@mercury.ccil.org>, RDF Working Group WG <public-rdf-wg@w3.org>
- Message-Id: <49260A6F-9455-4CAE-8164-F00AFEA34F03@w3.org>
On Oct 12, 2011, at 04:28 , Leif Halvard Silli wrote:

> John Cowan, Tue, 11 Oct 2011 10:57:45 -0400:
>> Phillips, Addison scripsit:
>>
>>> XML is an interesting case because it makes the opposite decision
>>> consciously: two canonically-equivalent but unequal identifiers are
>>> not equal.
>>
>> And this applies to both XML names and to namespace URIs.
>
> One - probably strong - reason why HTML5 could end up with the same
> solution as XML is that HTML5 has XML 1.0 compatibility as a design goal.
> For that reason, it is also probably smart to focus on XML 1.0 if one
> wants to drive HTML5 in a particular direction ...
>
> Btw, I filed bug 12839 on 1st of June to make the HTML5 spec say that
> normalization should be performed on @id attributes before establishing
> whether they are unique or not.[1] If the proposal went through,
> then <p id='å'> and <p id='å'> would be considered as having
> the same value and thus would make the document invalid due to identical
> @id-s.
>
> In the discussion inside the bug report, the others, including Henri,
> wanted @id-s that differ only w.r.t. NFC and NFD to be considered
> unique. Still, Validator.nu would consider the @id variant with the
> decomposed character invalid because it isn't NFC normalized. Still,
> I think HTML5 says nothing yet about normalization. So I think this at
> best speaks about what Henri thinks HTML5 should say: that only early
> normalization should occur (read: @id values not in NFC form should be
> illegal). But if two equivalent variants of the same character occur in
> the same document, then parsers should still consider them different.
>
> W.r.t. the CharmodNormSummary document, for C005 I'd like to
> suggest two examples of when the author might want to avoid NFC: If the
> author wants to style different parts of a composed character differently
> - e.g. in different colors. HTML5 just made this legal - see bug 13502.
>
> Another example could be that some tests I made showed that, apart from
> file searching (with IE as an exception to that again), 'accént' in
> decomposed form was treated more meaningfully than 'accént' in composed
> form. I tested, amongst other things, the screenreaders Jaws, VoiceOver
> and NVDA to come to that - to myself - surprising conclusion. Simply
> put, the decomposed variant was the only variant that was universally
> meaningfully 'screen-read'.
>
> A third example could be authors that want to take advantage of NFD's
> symmetrical shape: e.g. if you want to sort words based on word length
> in a primitive fashion.
>
> [1] http://www.w3.org/Bugs/Public/show_bug.cgi?id=12839
> [2] http://www.w3.org/International/wiki/
> --
> leif halvard silli

----
Ivan Herman, W3C Semantic Web Activity Lead
Home: http://www.w3.org/People/Ivan/
mobile: +31-641044153
PGP Key: http://www.ivan-herman.net/pgpkey.html
FOAF: http://www.ivan-herman.net/foaf.rdf
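
For readers who want to see the three positions in the quoted discussion side by side, here is a minimal sketch. It assumes Python's standard unicodedata module; the function names ids_equal and is_nfc are hypothetical and come from neither HTML5 nor Validator.nu. It compares the composed and decomposed spellings of 'å' as raw code points (the XML position), after NFC normalization (roughly what bug 12839 proposes for @id), and with a check that a value is already in NFC (the early-normalization position).

```python
import unicodedata

# Two canonically equivalent spellings of the same character.
composed = "\u00e5"      # LATIN SMALL LETTER A WITH RING ABOVE (NFC form)
decomposed = "a\u030a"   # 'a' followed by COMBINING RING ABOVE (NFD form)


def ids_equal(a, b, normalize=False):
    """Compare two identifier strings.

    normalize=False: plain code point comparison (the XML position).
    normalize=True: compare after NFC normalization (hypothetical
    reading of what bug 12839 asks for).
    """
    if normalize:
        a = unicodedata.normalize("NFC", a)
        b = unicodedata.normalize("NFC", b)
    return a == b


def is_nfc(s):
    """Return True if s is already in NFC (early-normalization check)."""
    return unicodedata.normalize("NFC", s) == s


print(ids_equal(composed, decomposed))                  # code point comparison
print(ids_equal(composed, decomposed, normalize=True))  # normalized comparison
print(is_nfc(composed), is_nfc(decomposed))             # which values pass an NFC check
print(len(composed), len(decomposed))                   # code point counts differ
```

Under the code point comparison the two values count as distinct identifiers; under the normalized comparison they collide. The NFC check flags only the decomposed variant, which matches the Validator.nu behaviour described in the quoted message.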
Received on Thursday, 13 October 2011 05:47:37 UTC