RE: [ACTION 496] Allowed Characters regex

From: Pablo Nieto Caride <pablo.nieto@linguaserve.com>
Date: Mon, 29 Apr 2013 09:43:11 +0200
To: "'Felix Sasaki'" <fsasaki@w3.org>, "'Yves Savourel'" <ysavourel@enlaso.com>
Cc: <public-multilingualweb-lt@w3.org>
Message-ID: <0f6401ce44ad$332f4550$998dcff0$@linguaserve.com>
Hi Felix, all,

 

Hi Yves, Pablo, all,

Yves, thanks a lot for updating the section; it looks good. Just one thing: I put the XML regex into Jirka's schema and validated the example
[^&lt;>:&quot;\\/|\?*]]
and it didn't validate. But maybe the problem is the example itself. Should it be
[^&lt;>:&quot;\\/|\?*]
?



[PNC]: Yes, the first regex is wrong: the penultimate square bracket should be escaped, as in [^&lt;>:&quot;\\/|\?*\]]. The second one is valid.
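
To illustrate why the unescaped bracket matters, here is a minimal Python sketch. It assumes the attribute value has already been XML-unescaped (`&lt;` to `<`, `&quot;` to `"`) and that Python's `re` module treats these particular character classes the way an XML Schema engine would. The helper name `matches_single_char` is made up for illustration.

```python
import re

# Attribute values after XML unescaping (&lt; -> <, &quot; -> ").
FIRST = r'[^<>:"\\/|\?*]]'      # stray "]" after the class
SECOND = r'[^<>:"\\/|\?*]'      # plain negated class
ESCAPED = r'[^<>:"\\/|\?*\]]'   # "]" escaped, so it joins the excluded set


def matches_single_char(pattern: str, text: str) -> bool:
    """Hypothetical helper: True if `pattern` matches `text` in full."""
    return re.fullmatch(pattern, text) is not None
```

With these definitions, `matches_single_char(FIRST, "a")` is false because the pattern now demands a trailing `]` after the character, whereas the second and the escaped forms each still describe exactly one character. The escaped form additionally excludes `]` itself.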


@Pablo, Yves, Jirka, all: would it be OK to add the two updated regexes to the spec directory, with a non-normative note like this below the EBNF:

"Users may want to use a regular expression to make sure that they follow the above definition. Sample regular expressions to verify the regular expression in allowed characters are provided: for XML and for Java."

With links to a text file containing the regex? 

 

[PNC]: Sounds like a good idea.

 

Cheers,

Pablo.


Best,

Felix

Am 26.04.13 15:35, schrieb Yves Savourel:

Thanks Pablo.
-ys
 
-----Original Message-----
From: Pablo Nieto Caride [mailto:pablo.nieto@linguaserve.com] 
Sent: Friday, April 26, 2013 3:37 AM
To: Yves Savourel; public-multilingualweb-lt@w3.org
Subject: RE: [ACTION 496] Allowed Characters regex
 
Hi Yves, all,
 
The updated ABNF is correct; sorry for not realizing that MultiCharEsc wasn't compatible with some engines.
 
The text and the examples seem good to me.
 
The updated regex, in case you need it, is:
XML:
((\\[nrt\\|.?*+(){}\&#x2D;\&#x5B;\&#x5D;\&#x5E;]))|(\[(([^\&#x2D;\&#x5B;\&#x5D;]|(\\[nrt\\|.?*+(){}\&#x2D;\&#x5B;\&#x5D;\&#x5E;]))-([^\&#x2D;\&#x5B;\&#x5D;]|(\\[nrt\\|.?*+(){}\&#x2D;\&#x5B;\&#x5D;\&#x5E;]))|[^\&#x5B;\&#x5D;]|((\\[nrt\\|.?*+(){}\&#x2D;\&#x5B;\&#x5D;\&#x5E;])))+\])|(\.)
 
Java:
((\\\\[nrt\\\\|.?*+(){}\\u002D\\\u005B\\u005D\\u005E]))|(\\[(([^\\u002D\\u005B\\u005D]|(\\\\[nrt\\\\|.?*+(){}\\u002D\\\u005B\\u005D\\u005E]))-([^\\u002D\\u005B\\u005D]|(\\\\[nrt\\\\|.?*+(){}\\u002D\\\u005B\\u005D\\u005E]))|[^\\u005B\\u005D]|((\\\\[nrt\\\\|.?*+(){}\\u002D\\\u005B\\u005D\\u005E])))+\\])|(\\.)
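
For anyone who wants to try the check outside a schema validator, below is a rough Python sketch of the XML pattern above. It is a hand transliteration, not the posted text: the character references are resolved manually (`&#x2D;` is `-`, `&#x5B;` is `[`, `&#x5D;` is `]`, `&#x5E;` is `^`), the repeated sub-expressions are factored into named pieces, and `re.fullmatch` stands in for the implicit anchoring of XML Schema patterns. The names `ESC`, `END`, `PATTERN` and `is_valid_allowed_characters` are assumptions made for this example, and Python's `re` is assumed to agree with an XML Schema engine on this pattern.

```python
import re

# Escape branch: a backslash followed by one of
# n r t \ | . ? * + ( ) { } - [ ] ^
ESC = r'\\[nrt\\|.?*+(){}\-\[\]\^]'

# One endpoint of a range: any character except - [ ], or an escape.
END = rf'(?:[^\-\[\]]|{ESC})'

# Three top-level alternatives, as in the posted pattern:
#   1. a single escape
#   2. a bracketed class made of ranges, single characters, or escapes
#   3. a literal "." in the value (the wildcard)
PATTERN = re.compile(
    rf'(?:{ESC})|\[(?:{END}-{END}|[^\[\]]|{ESC})+\]|\.'
)


def is_valid_allowed_characters(value: str) -> bool:
    """Hypothetical helper: True if the whole value fits the pattern."""
    return PATTERN.fullmatch(value) is not None
```

Under these assumptions the helper accepts the unescaped forms of `[^<>:"\\/|\?*]` and `[^<>:"\\/|\?*\]]` and rejects `[^<>:"\\/|\?*]]`, which is consistent with the validation result Felix reported.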
 
Cheers,
Pablo.
-----------------------------------------------------------------------------------------------
 
Hi everyone,
 
I've updated the draft specification with the new regular expression definition, as well as the examples:
http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#allowedchars
 
I removed the allowance for the \d and \D constructs.
(I can edit things again if needed)
 
Thanks,
-yves
 
 
 
 
 

 
Received on Monday, 29 April 2013 07:43:44 UTC
