- From: Yves Savourel <ysavourel@enlaso.com>
- Date: Mon, 27 Aug 2012 05:52:50 -0600
- To: "'Felix Sasaki'" <felix.sasaki@dfki.de>
- CC: <public-multilingualweb-lt@w3.org>
Hi Felix,

> - the escaping mechanism with \uHHHH would need to be
> converted to numeric character references &#xHHHH;

Using &#xHHHH; for all would be fine, as it would work with all engines. I was using \uHHHH to work around the ever-problematic issue of XML invalid characters.

> - <> need to be converted to &lt;&gt;

Do you mean that in the XML source I need to have forbiddenCharacters="<>" or forbiddenCharacters="&lt;&gt;" ?
I didn't see anything special about < and > in the XML regex (besides that a literal < must be &lt; when in an XML file, but that

> - Both \u0000 and \u001F are forbidden characters in XML.

U+0000 and U+001F are, but not when written in \uHHHH notation. That's why I wasn't using &#xHHHH;. But I see your point. My question then is: how do you work with such characters in XML regex? If you can't, then that's one more reason to avoid using XML regex.

> We should either drop the regex at all, use XML Schema
> regex (I say your counter arguments, so this is probably no option)
> or define a clear specification about what to do when one
> uses XML Schema regex, e.g. have a pointer to characters that are
> disallowed in XML and XML Schema regex anyway.

It seems to me that the third option would be the way to go. I'll try to post something asap.

Thanks for the feedback.
-yves
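
To make the \uHHHH versus &#xHHHH; point concrete, here is a minimal Python sketch of the conversion discussed above. It is only an illustration under stated assumptions: the names `is_xml10_char` and `to_ncr` are hypothetical, the range check follows the XML 1.0 Char production, and the sketch does not try to handle an escaped backslash in front of a `u`. It shows why U+0000 and U+001F remain a problem: they cannot be written in XML 1.0 even as numeric character references.

```python
import re

def is_xml10_char(cp: int) -> bool:
    """True if the code point matches the XML 1.0 Char production."""
    return (
        cp in (0x9, 0xA, 0xD)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )

# Matches a \uHHHH escape (four hex digits); hypothetical helper pattern.
_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})")

def to_ncr(pattern: str) -> str:
    """Rewrite \\uHHHH escapes as &#xHHHH; numeric character references.

    Raises ValueError for code points that XML 1.0 does not allow,
    since those cannot be expressed as character references either.
    """
    def repl(match: re.Match) -> str:
        cp = int(match.group(1), 16)
        if not is_xml10_char(cp):
            raise ValueError(f"U+{cp:04X} is not allowed in XML 1.0")
        return f"&#x{cp:04X};"
    return _ESCAPE.sub(repl, pattern)

if __name__ == "__main__":
    print(to_ncr(r"[\u0041-\u005A]"))
    try:
        to_ncr(r"[\u0000-\u001F]")
    except ValueError as err:
        print("cannot convert:", err)
```

A range such as [\u0041-\u005A] converts cleanly, while [\u0000-\u001F] is rejected, which is the case any specification text for the third option would need to address.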
Received on Monday, 27 August 2012 11:53:23 UTC