Re: regular expression in XML Schema

From: <fe.sola@infomed.sld.cu>
Date: Thu, 25 Sep 2003 23:04:03 -0400
Message-ID: <1064545443.3f73aca3b1506@webmail.sld.cu>
To: Hans Teijgeler <hans.teijgeler@quicknet.nl>
Cc: "xml-schema, mailing list" <xmlschema-dev@w3.org>, "weitz, edi" <edi@agharta.de>, "paap, onno" <onno.paap@ezzysurf.com>

Hello Hans,
> All I want is to allow any program to easily detect where the identifier stops and
> the suffix starts.
> For example you might see an identifier like
> FLUOR__HAA__P3712-05__ME00__40293u0dME14
> that stops at 40283u0d. Then we get the suffix ME14, and in between some kind of
> weird character that almost certainly is never used in an identifier (alternatives
> are welcome!)
OK, I get the idea. I guess I oversimplified your requirements.

> The question was whether the requirement of a middle dot in the expression
> ([a-zA-Z][a-zA-Z0-9]*__)*[a-zA-Z0-9\.\-]+(&#x00B7;[a-zA-Z0-9\.\-]+)?
> was properly expressed

In The Regex Coach this expression matches the suffix ME14, so if you want your program
to find all suffixes, it could work.
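
If it helps, here is a rough way to try the same expression outside The Regex Coach.
This is only a sketch in Python, not an XML Schema processor: the function name
split_suffix and the sample values are mine, and I use fullmatch because schema
patterns have to match the whole value.

```python
import re

# The expression from the thread, with &#x00B7; written as the escape \u00b7.
PATTERN = re.compile(
    r"([a-zA-Z][a-zA-Z0-9]*__)*[a-zA-Z0-9\.\-]+(\u00b7[a-zA-Z0-9\.\-]+)?"
)

MIDDLE_DOT = "\u00b7"


def split_suffix(value):
    """Return (identifier, suffix), or None if the value does not match.

    Hypothetical helper: the suffix is the empty string when no middle dot
    is present.
    """
    if PATTERN.fullmatch(value) is None:
        return None
    identifier, _, suffix = value.partition(MIDDLE_DOT)
    return identifier, suffix


# Made-up sample values, not taken from a real data set.
print(split_suffix("FLUOR__HAA__40293\u00b7ME14"))
print(split_suffix("FLUOR__HAA__40293"))
print(split_suffix("not an identifier"))
```

One thing I noticed while trying this: with the expression exactly as quoted, a
segment like P3712-05__ is rejected when the whole value must match, because the
hyphen is not allowed in a part that ends in "__". I don't know whether that is
intended.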

> That's not the point, the difficulty is in the fact that it is not a simple dot
> (period), but a Unicode #x00B7 middle dot (or any other allowable Unicode character,
> for that matter)
>    * It works with plain text.
> The point is: If Unicode characters are allowable, how then do you enter them in a
> fill-in-the-blanks XML document? (See my reply to Jeni Tennison)
I'm going to check that post again, and maybe this is a dumb idea, but if you are going
to fill in the XML file by hand, then use the Alt+# combination on the keyboard. If your
program will generate the character, then use a function like ChrW(&H00B7) (I think
that's VBScript) and concatenate it to the selected string.
Are you sure that the middle dot is in the UTF-8 encoding? It might be in UTF-16, and
maybe that's why Spy's processor can't recognize it. This might be a fatal error,
because the XML processor encounters an entity with an encoding that it is unable to
process.
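
For what it's worth, here is a small sketch of the "generate the character and
concatenate it" idea, plus a quick check of the encoding question. It is Python
rather than VBScript, and the helper with_suffix and the element name <id> are made
up for illustration. As far as I can tell, U+00B7 can be written in both UTF-8 and
UTF-16, so the encoding declared for the file may be the thing to look at.

```python
import xml.etree.ElementTree as ET

# Build the character from its code point; same character as &#x00B7; in XML.
MIDDLE_DOT = chr(0x00B7)


def with_suffix(identifier, suffix):
    """Hypothetical helper: join identifier and suffix with a middle dot."""
    return identifier + MIDDLE_DOT + suffix


value = with_suffix("40293", "ME14")

# The middle dot can be encoded either way; only the bytes differ.
utf8_bytes = MIDDLE_DOT.encode("utf-8")       # two bytes: 0xC2 0xB7
utf16_bytes = MIDDLE_DOT.encode("utf-16-be")  # two bytes: 0x00 0xB7

# A numeric character reference avoids typing the character at all.
doc = ET.fromstring("<id>40293&#x00B7;ME14</id>")
print(doc.text == value)
```

The character reference form has the advantage that the file stays plain ASCII, so
it should not matter which encoding the editor saves it in.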
Anyway, hth

Received on Thursday, 25 September 2003 23:15:44 UTC