
RE: [Action-484] Create an ABNF based on http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Apr/0047.html

From: Pablo Nieto Caride <pablo.nieto@linguaserve.com>
Date: Tue, 23 Apr 2013 16:04:45 +0200
To: "'Felix Sasaki'" <fsasaki@w3.org>, "'Yves Savourel'" <ysavourel@enlaso.com>
Cc: "'Jirka Kosek'" <jirka@kosek.cz>, <public-multilingualweb-lt@w3.org>
Message-ID: <0a8801ce402b$82a2ecd0$87e8c670$@linguaserve.com>

+1. Nevertheless, I'll send Yves the regex rewritten in Java.

Cheers,
Pablo.
----------------------------------------------------------------------------

Hi Yves, Pablo, all,

just for the record, I think the regex is not urgent at this point, since we won't put it normatively in the spec, if we follow Jirka's argumentation at http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Apr/0102.html
So I would propose closing the issue during tomorrow's call with an action to put the ABNF in the spec. We can even keep the reference to XML Schema, since the ABNF is a proper subset.

Best,

Felix

On 23.04.13 13:51, Yves Savourel wrote:
> yes, Java.
>
> -----Original Message-----
> From: Pablo Nieto Caride [mailto:pablo.nieto@linguaserve.com]
> Sent: Tuesday, April 23, 2013 5:46 AM
> To: Yves Savourel; 'Felix Sasaki'; 'Jirka Kosek'
> Cc: public-multilingualweb-lt@w3.org
> Subject: RE: [Action-484] Create an ABNF based on
> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Apr/0047.html
>
> Hi Yves,
>
> Yes, if the * is not inside a character class ([*]), it should be escaped as \*. I don't have a parser, but I can easily convert the regex into another language. In which language are you having trouble? Java?
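
The escaping rule above can be checked quickly. The following is a minimal sketch using Python's `re` module, which is an assumption made for illustration: the thread itself concerns Java, where the same mistake is reported as a dangling meta character. The helper name `compiles` is hypothetical. A bare `*` has nothing to repeat and is rejected, while `\*` and `[*]` both denote a literal asterisk.

```python
import re

def compiles(pattern: str) -> bool:
    """Return True if `pattern` is accepted by Python's re module."""
    try:
        re.compile(pattern)
        return True
    except re.error:
        return False

# A bare quantifier with nothing before it is rejected.
bare_ok = compiles("*")

# Escaped, or placed inside a character class, the asterisk is a literal.
escaped_ok = compiles(r"\*")
in_class_ok = compiles("[.?*+(]")
```

With this sketch, `bare_ok` is false and the other two flags are true, which matches the advice to escape `*` only when it appears outside a character class.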
>
> Cheers,
> Pablo.
> ----------------------------------------------------------------------
> ----------------------
>
> -----Original Message-----
> From: Yves Savourel [mailto:ysavourel@enlaso.com]
> Sent: Tuesday, April 23, 2013 13:13
> To: 'Pablo Nieto Caride'; 'Felix Sasaki'; 'Jirka Kosek'
> Cc: public-multilingualweb-lt@w3.org
> Subject: RE: [Action-484] Create an ABNF based on
> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Apr/0047.html
>
> Thanks Pablo,
>
> Do you have an easy way to generate the version for Java/C#/etc.?
> I'm getting 'dangling *' errors for ".?*+(" (or maybe * just needs
> to be escaped?)
>
> -yves
>
> -----Original Message-----
> From: Pablo Nieto Caride [mailto:pablo.nieto@linguaserve.com]
> Sent: Tuesday, April 23, 2013 4:10 AM
> To: 'Felix Sasaki'; 'Jirka Kosek'
> Cc: public-multilingualweb-lt@w3.org
> Subject: RE: [Action-484] Create an ABNF based on
> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Apr/0047.html
>
> Hi Felix, Jirka, all,
>
> Since nobody objected to the ABNF, I have updated the regex based on it. Here it is:
> ((\\[nrt\\|.?*+(){}\&#x2D;\&#x5B;\&#x5D;\&#x5E;])|(\\[dD]))|(\[(([^\&#x2D;\&#x5B;\&#x5D;]|(\\[nrt\\|.?*+(){}\&#x2D;\&#x5B;\&#x5D;\&#x5E;]))-([^\&#x2D;\&#x5B;\&#x5D;]|(\\[nrt\\|.?*+(){}\&#x2D;\&#x5B;\&#x5D;\&#x5E;]))|[^\&#x5B;\&#x5D;]|((\\[nrt\\|.?*+(){}\&#x2D;\&#x5B;\&#x5D;\&#x5E;])|(\\[dD])))+\])|(\.)
>
> And the ABNF again:
> [1] charClass ::= charClassEsc | charClassExpr | WildcardEsc
>
> [2] charClassEsc ::= SingleCharEsc | MultiCharEsc
>
> [3] SingleCharEsc ::= '\' [nrt\|.?*+(){}#x2D#x5B#x5D#x5E]
>
> [4] MultiCharEsc ::= '\' [dD]
>
> [5] charClassExpr ::= '[' charGroup ']'
>
> [6] charGroup ::= posCharGroup | negCharGroup
>
> [7] posCharGroup ::= ( charRange | charClassEsc )+
>
> [8] charRange ::= seRange | XmlCharIncDash
>
> [9] seRange ::= charOrEsc '-' charOrEsc
>
> [10] charOrEsc ::= XmlChar | SingleCharEsc
>
> [11] XmlChar ::= [^\#x2D#x5B#x5D]
>
> [12] XmlCharIncDash ::= [^\#x5B#x5D]
>
> [13] negCharGroup ::= '^' posCharGroup
>
> [14] WildcardEsc ::= '.'
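
The productions above can be transcribed into a regular expression one rule at a time. The following is a rough sketch in Python's `re` syntax, not the regex posted in this thread and not a normative translation. The constant names and the helper `is_char_class` are hypothetical. It also assumes that the leading `\` in `XmlChar` and `XmlCharIncDash` excludes the backslash itself, as in XML Schema.

```python
import re

# Each fragment mirrors one production of the ABNF quoted above.
SINGLE_CHAR_ESC = r"\\[nrt\\|.?*+(){}\-\[\]^]"             # [3]
MULTI_CHAR_ESC = r"\\[dD]"                                  # [4]
CHAR_CLASS_ESC = f"(?:{SINGLE_CHAR_ESC}|{MULTI_CHAR_ESC})"  # [2]
XML_CHAR = r"[^\\\-\[\]]"                                   # [11] assumed to exclude "\"
XML_CHAR_INC_DASH = r"[^\\\[\]]"                            # [12] assumed to exclude "\"
CHAR_OR_ESC = f"(?:{XML_CHAR}|{SINGLE_CHAR_ESC})"           # [10]
SE_RANGE = f"{CHAR_OR_ESC}-{CHAR_OR_ESC}"                   # [9]
CHAR_RANGE = f"(?:{SE_RANGE}|{XML_CHAR_INC_DASH})"          # [8]
POS_CHAR_GROUP = f"(?:{CHAR_RANGE}|{CHAR_CLASS_ESC})+"      # [7]
CHAR_GROUP = rf"\^?{POS_CHAR_GROUP}"                        # [6] and [13]
CHAR_CLASS_EXPR = rf"\[{CHAR_GROUP}\]"                      # [5]
CHAR_CLASS = re.compile(                                    # [1] and [14]
    rf"(?:{CHAR_CLASS_ESC}|{CHAR_CLASS_EXPR}|\.)"
)

def is_char_class(text: str) -> bool:
    """Return True if `text` as a whole derives from charClass."""
    return CHAR_CLASS.fullmatch(text) is not None
```

Under these assumptions, `is_char_class("[a-z]")` and `is_char_class(r"\d")` hold, while a class subtraction such as `[a-z-[aeiou]]` is rejected because an unescaped `[` cannot appear inside a group.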
>
> Finally, I included the full Unicode range (0000-10FFFF, i.e. all seventeen planes), as in the ABNF and in Shaun's first version. I hope this resolves the issue.
>
> Cheers,
> Pablo.
> ______________________________________________________________________
> _______________
>
> -----Original Message-----
> From: Pablo Nieto Caride [mailto:pablo.nieto@linguaserve.com]
> Sent: Wednesday, April 17, 2013 10:47
> To: 'Felix Sasaki'; 'Jirka Kosek'
> Cc: public-multilingualweb-lt@w3.org
> Subject: RE: [Action-484] Create an ABNF based on
> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Apr/0047.html
>
> Hi Felix, all,
>
> -----Original Message-----
> From: Felix Sasaki [mailto:fsasaki@w3.org]
> Sent: Wednesday, April 17, 2013 10:15
> To: Jirka Kosek
> Cc: Pablo Nieto Caride; public-multilingualweb-lt@w3.org
> Subject: Re: [Action-484] Create an ABNF based on
> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Apr/0047.html
>
> Hi Jirka, Pablo, all,
>
> On 17.04.13 09:05, Jirka Kosek wrote:
>> On 16.4.2013 18:29, Pablo Nieto Caride wrote:
>>
>>> The rules of the ABNF are:
>>>
>>> [14] charClassSub ::= ( posCharGroup | negCharGroup ) '-' charClassExpr
>> I think that we don't want charClassSub at all. The argument was that
>> many RE engines don't support subtraction of classes.
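
Python's `re` module is one engine without XML Schema style class subtraction, so it can illustrate the point. The following sketch (the names `CONSONANT` and `is_consonant` are hypothetical) shows a common workaround, a negative lookahead in front of the base class, standing in for what XML Schema would write as `[a-z-[aeiou]]`.

```python
import re

# XML Schema would write "lowercase letters minus vowels" as [a-z-[aeiou]].
# Python's re has no such syntax, so a negative lookahead stands in for it.
CONSONANT = re.compile(r"(?![aeiou])[a-z]")

def is_consonant(ch: str) -> bool:
    """Return True if `ch` is one lowercase ASCII letter that is not a vowel."""
    return CONSONANT.fullmatch(ch) is not None
```

Dropping `charClassSub` from the grammar avoids the need for this kind of engine-specific rewriting.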
>>
>>> Now, if memory serves, we need a RELAX NG schema to validate the grammar, don't we? Or are we going to use the regex after all?
>> No, we just need this for the specification. For the schema, we can rewrite
>> this as a regex if there is strong demand for it.
> For the record, I won't demand this - though it would be nice.
>
> So if there is an updated regex *excluding* the subtraction, can you see whether people agree on it during today's call? I would then make the edit for allowed characters, and we could close issue-67 next week.
> [PNC]: I can do a final rework of the regex based on the new ABNF,
> since I started it and am more familiar with it, so that we can close the
> issue next week once and for all :)
>
> Best,
>
> Felix
>
> Cheers,
> Pablo.
>> 				Jirka
>>
>
Received on Tuesday, 23 April 2013 14:05:12 UTC
