- From: Jirka Kosek <jirka@kosek.cz>
- Date: Mon, 27 Aug 2012 14:15:53 +0200
- To: Felix Sasaki <fsasaki@w3.org>
- CC: Yves Savourel <ysavourel@enlaso.com>, public-multilingualweb-lt@w3.org
- Message-ID: <503B64F9.4020707@kosek.cz>
On 27.8.2012 13:56, Felix Sasaki wrote:

>> My question then is: how do you work with such character and XML regex?
>> If you can't then that's one more reason to avoid using XML regex.
>>
>
> I would propose to avoid the regex completely then, since it seems that
> then the proposal from Jirka at
> http://lists.w3.org/Archives/Public/public-multilingualweb-lt/2012Aug/0280.html
> wouldn't be a solution too.

If you list the allowed characters (instead of the forbidden ones), you
don't need a way to enter C0 and C1 characters, as those can't be
expressed in XML anyway.

> We had concerns about the regex before, and Michael said this data category
> would fulfil his needs without the regex. So let's go forward with that.
> Otherwise we will create regex that don't work with the content we want
> them to work on.

If we need to restrict the characters that are allowed in translations,
regexes are the minimum. For some languages, even more complex solutions
that work on top of regexes might be needed, for example the CREPDL
schema language:

http://lists.dsdl.org/dsdl-discuss/2009-04/att-0005/part7FDIS.pdf

We really shouldn't reinvent the wheel; we should use existing
standardized stuff. A comma-separated list of forbidden characters is
something that is not used elsewhere.

Jirka

--
------------------------------------------------------------------
  Jirka Kosek      e-mail: jirka@kosek.cz      http://xmlguru.cz
------------------------------------------------------------------
       Professional XML consulting and training services
  DocBook customization, custom XSLT/XSL-FO document processing
------------------------------------------------------------------
 OASIS DocBook TC member, W3C Invited Expert, ISO JTC1/SC34 member
------------------------------------------------------------------
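To illustrate the allow-list idea discussed above, here is a minimal sketch in Python. The character class in ALLOWED_CHARS and the helper name is_allowed are hypothetical and chosen only for this example; they are not taken from any ITS draft. The point is that a pattern listing the allowed characters rejects a control character without the pattern ever having to contain that character.

```python
import re

# Hypothetical allow-list (an assumption for this sketch): basic Latin
# letters, digits, space and a few punctuation marks. The trailing "-"
# inside the class is a literal hyphen.
ALLOWED_CHARS = r"[a-zA-Z0-9 .,!?-]"


def is_allowed(text: str, allowed_chars: str = ALLOWED_CHARS) -> bool:
    """Return True if every character of text matches the allow-list class."""
    # fullmatch requires the whole string to consist of allowed characters.
    return re.fullmatch(f"(?:{allowed_chars})*", text) is not None


# A plain string passes; a string containing the C0 control character
# U+0001 fails, although U+0001 is never written in the pattern.
print(is_allowed("Hello, world!"))
print(is_allowed("Hello\x01world"))
```

A deny-list would instead have to spell out each forbidden character, which is where entering control characters in XML becomes a problem.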
Received on Monday, 27 August 2012 12:16:25 UTC