- From: Jirka Kosek <jirka@kosek.cz>
- Date: Tue, 28 Aug 2012 00:21:47 +0200
- To: Yves Savourel <ysavourel@enlaso.com>
- CC: public-multilingualweb-lt@w3.org
- Message-ID: <503BF2FB.8060002@kosek.cz>
On 27.8.2012 22:30, Yves Savourel wrote:

> But, as you show, such character class subtraction doesn't have a common syntax across different engines, or it's not supported in others (like JavaScript as far as I know). That's why I think we want to avoid picking all the features of one engine. For example we wouldn't allow \d, \w, etc. either because it's not interoperable.

We should think about users first and about implementations second. ITS is used in XML documents. XML documents are described by XML schemas. Global rules in ITS are expressed as XPath. These are good reasons to use the regexp syntax defined by XML Schema (and used in XPath 2.0 as well), because XML-savvy people are already familiar with it.

I don't think we should define our own subset. We should just reuse what people know and what's already standardized in the XML world:

http://www.w3.org/TR/xmlschema-2/#nt-charClass

Of course implementations are also important. But as there are open-source implementations of XML Schema regexps for all major platforms (for example Saxon for Java/.NET and libxml2 for C/C++), I don't see any problem here. You will simply reuse existing code instead of relying on the platform's default regexp engine.

Jirka

--
------------------------------------------------------------------
Jirka Kosek      e-mail: jirka@kosek.cz      http://xmlguru.cz
------------------------------------------------------------------
Professional XML consulting and training services
DocBook customization, custom XSLT/XSL-FO document processing
------------------------------------------------------------------
OASIS DocBook TC member, W3C Invited Expert, ISO JTC1/SC34 member
------------------------------------------------------------------
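To illustrate the interoperability point discussed in this message: XML Schema writes character class subtraction as [a-z-[aeiou]], while Python's built-in re module has no subtraction operator. The following is only a minimal sketch, assuming Python's standard re module, of how the same set could be emulated with a negative lookahead. The names CONSONANTS and is_consonants_only are hypothetical and chosen for illustration; this is not an XML Schema regexp implementation.

```python
import re

# XML Schema regexp (for comparison only): [a-z-[aeiou]]+
# Python's re has no class subtraction, so a negative lookahead
# rejects the vowels before each character of [a-z] is consumed.
CONSONANTS = re.compile(r"(?:(?![aeiou])[a-z])+")


def is_consonants_only(text: str) -> bool:
    """Return True if text is one or more lowercase ASCII consonants."""
    return CONSONANTS.fullmatch(text) is not None
```

A platform-specific rewrite like this is exactly what differs from engine to engine, which is why reusing an existing XML Schema regexp implementation may be the simpler route.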
Received on Monday, 27 August 2012 22:22:16 UTC