Re: forbiddenCharacters data category - related to [ACTION-189]

On 27.8.2012 12:26, Felix Sasaki wrote:

> About the regex: the example is not compatible with XML Schema regex:
> - the escaping mechanism with \uHHHH would need to be converted to numeric
> character references &#xHHHH;
> - <> need to be converted to &lt;&gt;
> - Both \u0000 and \u001F are forbidden characters in XML.
> We should either drop the regex altogether, use XML Schema regex (I saw your
> counter arguments, so this is probably no option) or define a clear
> specification about what to do when one uses XML Schema regex, e.g. have a
> pointer to characters that are disallowed in XML and XML Schema regex
> anyway.
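
To make the first three points concrete, here is a rough Python sketch of the conversion Felix describes. The function names (to_xml_regex, xml_allowed) are made up for illustration and are not part of any draft. It assumes the input uses only four-digit \uHHHH escapes and does not handle an escaped backslash in front of a "u".

```python
import re

# Matches a \uHHHH escape with exactly four hex digits.
U_ESCAPE = re.compile(r"\\u([0-9A-Fa-f]{4})")


def xml_allowed(cp):
    """True if the code point is in the XML 1.0 Char production (BMP part)."""
    return cp in (0x9, 0xA, 0xD) or 0x20 <= cp <= 0xD7FF or 0xE000 <= cp <= 0xFFFD


def to_xml_regex(pattern):
    """Rewrite a \\uHHHH-style pattern so it can sit in an XML attribute.

    Hypothetical helper: escapes markup characters, then turns each
    \\uHHHH into a numeric character reference, rejecting code points
    that XML 1.0 does not allow at all (such as U+0000 or U+001F).
    """
    # Escape "&" first so the references produced below stay intact.
    escaped = pattern.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")

    def replace_escape(match):
        cp = int(match.group(1), 16)
        if not xml_allowed(cp):
            raise ValueError(f"U+{cp:04X} is not allowed in XML 1.0")
        return f"&#x{cp:04X};"

    return U_ESCAPE.sub(replace_escape, escaped)


print(to_xml_regex(r"[<>\u00A0]"))
```

A range such as [\u0000-\u001F] makes the sketch raise ValueError, which is exactly the third problem in the list above.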


We can solve most of Felix's concerns by inverting the data category.
Instead of enumerating forbidden characters, we can enumerate allowed
characters. Then we can use the charClass production from the XML Schema
spec to define a subset of regular expressions:

IMHO this should cover all use cases from Yves' draft:

"• Limit the characters which may be used in the UI of a game because of
some special font limitation.
• Prevent illegal characters to be entered for text content that are
file or directory names.
• Control what characters can be used when translating examples of login
name in a content."
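
As a minimal sketch of how a consumer could check text against a set of allowed characters, consider the login name use case. The helper name disallowed_characters and the sample class LOGIN_NAME_CHARS are assumptions for illustration only. Python's re module is not an XML Schema regex engine (it lacks \p{...} and character class subtraction), so the example sticks to a simple bracket expression that means the same in both dialects.

```python
import re


def disallowed_characters(text, allowed_class):
    """Return the characters of text that the allowed class does not match.

    allowed_class is a single character class expression, e.g. "[a-z]".
    Hypothetical helper, not taken from any specification.
    """
    single_char = re.compile(allowed_class)
    return [ch for ch in text if single_char.fullmatch(ch) is None]


# Assumed example: characters permitted in a login name.
LOGIN_NAME_CHARS = r"[a-zA-Z0-9_\-]"

print(disallowed_characters("jane_doe-42", LOGIN_NAME_CHARS))  # []
print(disallowed_characters("jane doe/42", LOGIN_NAME_CHARS))  # [' ', '/']
```

An empty result means the text uses only allowed characters. A non-empty one lists the offending characters in order, which is what a checking tool would want to report back to the translator.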


  Jirka Kosek      e-mail:
       Professional XML consulting and training services
  DocBook customization, custom XSLT/XSL-FO document processing
 OASIS DocBook TC member, W3C Invited Expert, ISO JTC1/SC34 member

Received on Monday, 27 August 2012 11:12:32 UTC