Re: regular expression in XML Schema

From: Hans Teijgeler <hans.teijgeler@quicknet.nl>
Date: Thu, 25 Sep 2003 17:18:35 +0200
To: xmlschema-dev@w3.org
Cc: "weitz, edi" <edi@agharta.de>, "paap, onno" <onno.paap@ezzysurf.com>
Message-id: <3F73074B.273CE4B@quicknet.nl>
Dear Experts,

This is the continuing saga of the Regular Expressions. Last time I thought I
had the answer, but alas!

Thanks to the good help of Edi Weitz I got a bit further, and we arrived at the
following RegEx for an identifier of the type Name:


([a-zA-Z][a-zA-Z0-9]*__)*[a-zA-Z0-9\.\-]+(&#x00B7;[a-zA-Z0-9\.\-]+)?
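
To try the pattern outside a schema editor, here is a minimal Python sketch. It
assumes that XML Schema patterns always have to match the whole value (hence
fullmatch) and that the character reference &#x00B7; can be written as \u00B7
in Python. The helper name is_valid_name and the sample identifiers are made up
for illustration only.

```python
import re

# Python transcription of the pattern above. The XML character reference
# &#x00B7; (MIDDLE DOT) is written here as the escape \u00B7.
NAME_PATTERN = re.compile(
    r"([a-zA-Z][a-zA-Z0-9]*__)*[a-zA-Z0-9\.\-]+(\u00B7[a-zA-Z0-9\.\-]+)?"
)


def is_valid_name(value: str) -> bool:
    """Return True if the whole string matches the pattern.

    fullmatch is used because an XML Schema pattern facet is implicitly
    anchored at both ends of the value.
    """
    return NAME_PATTERN.fullmatch(value) is not None


# Hypothetical sample identifiers, not taken from any real system.
samples = [
    "abc__def.1",            # prefix + main part
    "abc__def\u00B7rev-2",   # prefix + main part + optional suffix
    "abc\u00B7",             # middle dot with nothing after it
    "abc__",                 # prefix with no main part
]

for sample in samples:
    print(repr(sample), is_valid_name(sample))
```

The last two samples are rejected: a trailing middle dot needs at least one
character after it, and the underscores are only allowed as part of a prefix.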

Everything works, the suffix at the end is now optional. BUT I still have some
problems/questions:

  1. I still need some document in which the whole subject of the Regular
     Expressions in XML Schema is explained. I read through the concept book of
     Eric van der Vlist
     (http://books.xmlschemata.org/relaxng/RngBookWxsRegExp.html ) but that book
     assumes that I know much more than I do. I need something that starts at
     zero, for dummies, with MANY examples. Any suggestions?
  2. What is a "CombiningChar" and what is an "Extender"? The XML spec lists
     both as allowable parts of NameChar, but nowhere can I find what each
     really IS and what it is used for. You guys/gals must have read
     something that I haven't, so apparently you know it (if not, why didn't you
     ask or complain?)
  3. I want to separate the first part of the identifier
     ([a-zA-Z][a-zA-Z0-9-]*__)*[a-zA-Z0-9.-]+  from the second (optional) part
     ([a-zA-Z0-9.-]+)? by means of a character that normally isn't used in
     system identifiers. So I chose the "middle dot" (#x00B7). I have three
     questions:
       1. Is the way it has now been introduced in the above RegEx correct?
       2. If I make an XML document based on an XML Schema (e.g. in Spy), how
          can I fill in such a middle dot as part of a Name? I have tried
          everything I could think of, but with no success.
       3. To what extent does the font play a role? I found a middle dot in the
          Windows Character Map under Trebuchet MS (called U+00B7 Middle Dot),
          but Spy didn't accept that.
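
For sub-questions 2 and 3, one experiment that does not depend on Spy is
sketched below in Python. It assumes the middle dot can be entered in an
instance document as the character reference &#xB7;, and it uses a made-up
element name (Name) and made-up values. It only shows what a standard XML
parser hands back; it says nothing about how Spy itself behaves.

```python
import unicodedata
import xml.etree.ElementTree as ET

# Hypothetical instance fragments: one uses a character reference,
# the other contains the middle dot character itself.
via_reference = ET.fromstring("<Name>abc__def&#xB7;rev2</Name>").text
typed_literally = ET.fromstring("<Name>abc__def\u00B7rev2</Name>").text

# After parsing, both forms yield the same string.
print(via_reference == typed_literally)

# The separator is identified by its code point, not by the font
# in which it happened to be displayed or picked.
separator = via_reference[len("abc__def")]
print(hex(ord(separator)), unicodedata.name(separator))
```

If the parser reports the same code point for both forms, then the font used
in the Character Map should not be what matters, only whether the character
U+00B7 actually ends up in the file.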

Please enlighten me!

Regards,
Hans


Received on Thursday, 25 September 2003 11:14:51 GMT