- From: Yves Savourel <ysavourel@enlaso.com>
- Date: Mon, 8 Apr 2013 11:33:31 -0600
- To: "'Pablo Nieto Caride'" <pablo.nieto@linguaserve.com>, "'Felix Sasaki'" <fsasaki@w3.org>, "'Jirka Kosek'" <jirka@kosek.cz>
- CC: <public-multilingualweb-lt@w3.org>
- Message-ID: <004a01ce347f$30e5caa0$92b15fe0$@com>
Hi Felix, Pablo, Jirka,

The ABNF description is probably something we really have to have in the specification: it’s human readable and formal.

Having a corresponding regex in the schema to check the values would be a big plus. But I don’t think that not having it working yet should stop us from updating the specification.

-yves

From: Pablo Nieto Caride [mailto:pablo.nieto@linguaserve.com]
Sent: Monday, April 08, 2013 11:21 AM
To: 'Felix Sasaki'; 'Jirka Kosek'
Cc: public-multilingualweb-lt@w3.org
Subject: RE: [Issue-67] [Action-385] Work on regex for validating regex subset proposal

Hi Felix, Jirka, all,

As I said, I think that the ABNF approach is not bad, but I also think that having a list of allowed items and the regex in the schema is fine too. I don’t know what the implementers of the data category think about this.

Thanks Jirka, the new library works.

Cheers,
Pablo.

-------------------------------------------------------

On 08.04.13 18:28, Jirka Kosek wrote:

On 8.4.2013 18:15, Felix Sasaki wrote:

Trying to move this forward: Would this ABNF make sense to you
http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2013Apr/0027.html
("BMP+escapes" still needs to be defined)

I'm not sure whether this ABNF does what it should do. For example, this grammar allows ^ almost anywhere, but I think that in most RE engines ^ should directly follow [ if it's meant as a negation.

Agree - you could resolve that by removing neg from

    char = [neg] BMP+escapes

and changing

    allowedCharacters = start 1*range end ["+"]

to

    allowedCharacters = start [neg] 1*range end ["+"]

Maybe starting with the grammar in the W3C XML Schema spec and forbidding some rules would be easier.

Currently in the spec
http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#allowedchars-definition
we reference the XML Schema grammar
http://www.w3.org/TR/xmlschema-2/#charcter-classes
but not a specific production in the grammar. Which one would you choose, e.g.
http://www.w3.org/TR/xmlschema-2/#nt-charClassExpr
?

I'm fine with the "XML Schema disallowing" approach. But ending up with a means to validate the regex, and not leaving that to the regex engine, seems crucial as part of resolving the issue. From previous discussions it seems that pointing people to XML Schema with some additional information (e.g. "assume that this is not allowed") won't help - implementers will just use their (non XML Schema) engine.

P.S.: different topic - I had the same issues as Pablo with the validation with the testsuite: I had to use my local copy of jing, the one in github didn't work.

It works for me. Anyway, I synced versions of Jing, so you can give it another try.

Thanks, will do.

Best,
Felix
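For illustration only, the revised rule from the quoted thread (negation allowed once, directly after the opening bracket, rather than in front of every char) could be sketched as a validating regex in Python. This is not the schema regex under discussion. Several pieces are assumptions: "BMP+escapes" is still undefined in the thread, so `CHAR` below is a hypothetical stand-in; `start` and `end` are assumed to be "[" and "]"; `range` is assumed to be `char ["-" char]`; and the function name is made up.

```python
import re

# ASSUMPTION: stand-in for the still-undefined "BMP+escapes" production:
# any single character except \ ] [ ^ - , or a backslash followed by one character.
CHAR = r"(?:[^\\\]\[\^\-]|\\.)"

# ASSUMPTION: range = char ["-" char]
RANGE = rf"{CHAR}(?:-{CHAR})?"

# allowedCharacters = start [neg] 1*range end ["+"]
# with start = "[", neg = "^", end = "]" (assumed literals).
ALLOWED_CHARACTERS = re.compile(rf"\[\^?(?:{RANGE})+\]\+?", re.DOTALL)


def is_valid_allowed_characters(value: str) -> bool:
    """Return True if the whole value matches the sketched grammar."""
    return ALLOWED_CHARACTERS.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["[a-z]", "[^a-z]+", "[a^z]", "[]", r"[a\^z]"]:
        print(candidate, is_valid_allowed_characters(candidate))
```

With neg moved out of char, an unescaped ^ anywhere other than directly after [ is rejected, which matches the behaviour Jirka describes for most RE engines.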
Received on Monday, 8 April 2013 17:34:09 UTC