RE: [Action-484] Create an ABNF based on http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Apr/0047.html

From: Yves Savourel <ysavourel@enlaso.com>
Date: Tue, 23 Apr 2013 10:05:32 -0600
To: "'Pablo Nieto Caride'" <pablo.nieto@linguaserve.com>, "'Felix Sasaki'" <fsasaki@w3.org>, "'Jirka Kosek'" <jirka@kosek.cz>
CC: <public-multilingualweb-lt@w3.org>
Message-ID: <00cc01ce403c$6338b800$29aa2800$@com>
Thanks Pablo.
It seems to be working (it compiles and basic tests pass).
This is much appreciated.
-yves


-----Original Message-----
From: Pablo Nieto Caride [mailto:pablo.nieto@linguaserve.com] 
Sent: Tuesday, April 23, 2013 9:20 AM
To: Yves Savourel; 'Felix Sasaki'; 'Jirka Kosek'
Cc: public-multilingualweb-lt@w3.org
Subject: RE: [Action-484] Create an ABNF based on http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Apr/0047.html

Hi Yves,

This should work:
((\\\\[nrt\\\\|.?*+(){}\\u002D\\\u005B\\u005D\\u005E])|(\\\\[dD]))|(\\[(([^\\u002D\\u005B\\u005D]|(\\\\[nrt\\\\|.?*+(){}\\u002D\\\u005B\\u005D\\u005E]))-([^\\u002D\\u005B\\u005D]|(\\\\[nrt\\\\|.?*+(){}\\u002D\\\u005B\\u005D\\u005E]))|[^\\u005B\\u005D]|((\\\\[nrt\\\\|.?*+(){}\\u002D\\\u005B\\u005D\\u005E])|(\\\\[dD])))+\\])|(\\.)
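The doubled backslashes in the pattern above come from Java string-literal escaping: each backslash of the regex text is written twice in source code. The following is only a minimal sketch of that convention, not part of Pablo's regex; the class name JavaEscapingSketch and its constants are made up for illustration.

```java
import java.util.regex.Pattern;

class JavaEscapingSketch {
    // Regex text: \\[dD]  (a literal backslash followed by d or D).
    // Each regex backslash is doubled in the Java string literal.
    static final Pattern MULTI_CHAR_ESC = Pattern.compile("\\\\[dD]");

    // Regex text: backslash, u, 002D. java.util.regex reads this as the
    // hyphen character (U+002D).
    static final Pattern HYPHEN = Pattern.compile("\\u002D");

    // Regex text: \[  (a literal opening square bracket).
    static final Pattern OPEN_BRACKET = Pattern.compile("\\[");

    public static void main(String[] args) {
        // The two-character input "\d" is itself written as "\\d" in source.
        System.out.println(MULTI_CHAR_ESC.matcher("\\d").matches()); // true
        System.out.println(HYPHEN.matcher("-").matches());           // true
        System.out.println(OPEN_BRACKET.matcher("[").matches());     // true
    }
}
```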

Cheers,
Pablo.
------------------------------------------------------------------------------
yes, Java.

-----Original Message-----
From: Pablo Nieto Caride [mailto:pablo.nieto@linguaserve.com]
Sent: Tuesday, April 23, 2013 5:46 AM
To: Yves Savourel; 'Felix Sasaki'; 'Jirka Kosek'
Cc: public-multilingualweb-lt@w3.org
Subject: RE: [Action-484] Create an ABNF based on http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Apr/0047.html

Hi Yves,

Yes, if the * is not inside a character class ([*]), it should be escaped as \*. I don't have a parser, but I can easily convert the regex into another language. Which language are you having trouble with? Java?
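As a rough illustration of the point about escaping, the sketch below checks which forms java.util.regex accepts. It assumes Java only; the class DanglingStarSketch and the helper compiles() are hypothetical names and do not come from this thread.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class DanglingStarSketch {
    /** Returns true if the given regex text compiles under java.util.regex. */
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(compiles("*"));          // false: quantifier with nothing to repeat
        System.out.println(compiles("\\*"));        // true: escaped, matches a literal asterisk
        System.out.println(compiles("[*]"));        // true: literal asterisk inside a class
        System.out.println(compiles("[.?*+(){}]")); // true: all literal inside a class
    }
}
```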

Cheers,
Pablo.
--------------------------------------------------------------------------------------------

-----Original Message-----
From: Yves Savourel [mailto:ysavourel@enlaso.com]
Sent: Tuesday, April 23, 2013 13:13
To: 'Pablo Nieto Caride'; 'Felix Sasaki'; 'Jirka Kosek'
CC: public-multilingualweb-lt@w3.org
Subject: RE: [Action-484] Create an ABNF based on http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Apr/0047.html

Thanks Pablo,

Do you have an easy way to generate the one for Java/C#/etc.?
I'm getting 'dangling *' errors for ".?*+(" (or maybe * just needs to be escaped?)

-yves

-----Original Message-----
From: Pablo Nieto Caride [mailto:pablo.nieto@linguaserve.com]
Sent: Tuesday, April 23, 2013 4:10 AM
To: 'Felix Sasaki'; 'Jirka Kosek'
Cc: public-multilingualweb-lt@w3.org
Subject: RE: [Action-484] Create an ABNF based on http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Apr/0047.html

Hi Felix, Jirka, all,

Since nobody has objected to the ABNF, I have updated the regex based on it. Here it is:
((\\[nrt\\|.?*+(){}\&#x2D;\&#x5B;\&#x5D;\&#x5E;])|(\\[dD]))|(\[(([^\&#x2D;\&#x5B;\&#x5D;]|(\\[nrt\\|.?*+(){}\&#x2D;\&#x5B;\&#x5D;\&#x5E;]))-([^\&#x2D;\&#x5B;\&#x5D;]|(\\[nrt\\|.?*+(){}\&#x2D;\&#x5B;\&#x5D;\&#x5E;]))|[^\&#x5B;\&#x5D;]|((\\[nrt\\|.?*+(){}\&#x2D;\&#x5B;\&#x5D;\&#x5E;])|(\\[dD])))+\])|(\.)

And the ABNF again:
[1] charClass ::= charClassEsc | charClassExpr | WildcardEsc

[2] charClassEsc ::= SingleCharEsc | MultiCharEsc

[3] SingleCharEsc ::= '\' [nrt\|.?*+(){}#x2D#x5B#x5D#x5E]

[4] MultiCharEsc ::= '\' [dD]

[5] charClassExpr ::= '[' charGroup ']'

[6] charGroup ::= posCharGroup | negCharGroup

[7] posCharGroup ::= ( charRange | charClassEsc )+

[8] charRange ::= seRange | XmlCharIncDash

[9] seRange ::= charOrEsc '-' charOrEsc

[10] charOrEsc ::= XmlChar | SingleCharEsc

[11] XmlChar ::= [^\#x2D#x5B#x5D]

[12] XmlCharIncDash ::= [^\#x5B#x5D]

[13] negCharGroup ::= '^' posCharGroup

[14] WildcardEsc ::= '.'
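The productions above can be transcribed one by one into a Java pattern, which makes it easier to see what the grammar accepts and rejects. This is only a sketch under my own assumptions, not Pablo's regex: the class CharClassGrammarSketch, its constants, and isCharClass() are hypothetical names, and the transcription relies on java.util.regex syntax.

```java
import java.util.regex.Pattern;

class CharClassGrammarSketch {
    // [3] SingleCharEsc: a backslash, then one of n r t \ | . ? * + ( ) { } - [ ] ^
    static final String SINGLE_CHAR_ESC = "\\\\[nrt\\\\|.?*+(){}\\-\\[\\]\\^]";
    // [4] MultiCharEsc: a backslash, then d or D
    static final String MULTI_CHAR_ESC = "\\\\[dD]";
    // [2] charClassEsc
    static final String CHAR_CLASS_ESC =
            "(?:" + SINGLE_CHAR_ESC + "|" + MULTI_CHAR_ESC + ")";
    // [11] XmlChar: anything except backslash, hyphen, [ and ]
    static final String XML_CHAR = "[^\\\\\\-\\[\\]]";
    // [12] XmlCharIncDash: anything except backslash, [ and ]
    static final String XML_CHAR_INC_DASH = "[^\\\\\\[\\]]";
    // [10] charOrEsc
    static final String CHAR_OR_ESC = "(?:" + XML_CHAR + "|" + SINGLE_CHAR_ESC + ")";
    // [9] seRange
    static final String SE_RANGE = CHAR_OR_ESC + "-" + CHAR_OR_ESC;
    // [8] charRange
    static final String CHAR_RANGE = "(?:" + SE_RANGE + "|" + XML_CHAR_INC_DASH + ")";
    // [7] posCharGroup
    static final String POS_CHAR_GROUP = "(?:" + CHAR_RANGE + "|" + CHAR_CLASS_ESC + ")+";
    // [13] negCharGroup
    static final String NEG_CHAR_GROUP = "\\^" + POS_CHAR_GROUP;
    // [6] charGroup
    static final String CHAR_GROUP = "(?:" + NEG_CHAR_GROUP + "|" + POS_CHAR_GROUP + ")";
    // [5] charClassExpr
    static final String CHAR_CLASS_EXPR = "\\[" + CHAR_GROUP + "\\]";
    // [14] WildcardEsc
    static final String WILDCARD_ESC = "\\.";
    // [1] charClass
    static final Pattern CHAR_CLASS =
            Pattern.compile(CHAR_CLASS_ESC + "|" + CHAR_CLASS_EXPR + "|" + WILDCARD_ESC);

    /** Returns true if the whole input is a charClass according to the sketch. */
    static boolean isCharClass(String s) {
        return CHAR_CLASS.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isCharClass("[a-z]"));         // true: a single seRange
        System.out.println(isCharClass("[^0-9]"));        // true: negCharGroup
        System.out.println(isCharClass("\\d"));           // true: MultiCharEsc
        System.out.println(isCharClass("[a-z-[aeiou]]")); // false: no class subtraction
    }
}
```

Because XmlChar and XmlCharIncDash both exclude an unescaped '[', a nested class such as the subtraction form in the last call cannot appear inside a group, which matches the decision to drop charClassSub.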

Finally, I included all seventeen Unicode planes (0000-10FFFF), as in the ABNF and in Shaun's first version. I hope this resolves the issue.

Cheers,
Pablo.
_____________________________________________________________________________________

-----Original Message-----
From: Pablo Nieto Caride [mailto:pablo.nieto@linguaserve.com]
Sent: Wednesday, April 17, 2013 10:47
To: 'Felix Sasaki'; 'Jirka Kosek'
CC: public-multilingualweb-lt@w3.org
Subject: RE: [Action-484] Create an ABNF based on http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Apr/0047.html

Hi Felix, all,

-----Original Message-----
From: Felix Sasaki [mailto:fsasaki@w3.org]
Sent: Wednesday, April 17, 2013 10:15
To: Jirka Kosek
CC: Pablo Nieto Caride; public-multilingualweb-lt@w3.org
Subject: Re: [Action-484] Create an ABNF based on http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Apr/0047.html

Hi Jirka, Pablo, all,

On 17.04.13 09:05, Jirka Kosek wrote:
> On 16.4.2013 18:29, Pablo Nieto Caride wrote:
>
>> The rules of the ABNF are:
>>
>> [14] charClassSub ::= ( posCharGroup | negCharGroup ) '-' charClassExpr
> I think that we don't want charClassSub at all. The argument was that many
> RE engines don't support subtraction of classes.
>
>> Now if memory serves we need a RELAX NG schema to validate the grammar, don't we? Or are we going to use the regex finally?
> No, we just need this for the specification. For the schema we can rewrite
> this into a regex if there is strong demand for it.

For the record, I won't demand this - though it would be nice.

So if there is an updated regex *excluding* the subtraction, can you see if people agree on this during today's call? I would then do the edit for allowed characters and we could close the issue-67 next week.
[PNC]: I can do a final rework to the regex based on the new ABNF since I started it, and I'm more familiar with it, so as to close the issue next week once and for all :)

Best,

Felix

Cheers,
Pablo.
>
> 				Jirka
>
Received on Tuesday, 23 April 2013 16:06:10 UTC

This archive was generated by hypermail 2.3.1 : Sunday, 9 June 2013 00:25:10 UTC