Re: unconsistent union of patterns

Hi Michael,

> 1) It could be quite time consuming to define all the valid email
> extensions: org, com, edu, gov, net ....
> Is there a way to base a pattern on another one?
> For example comType is a subset of EmailType, how could I define this?

You should make comType be derived from EmailType. Its pattern then
only needs to distinguish it from the other values that are accepted
by EmailType, so you could simply do:

<xs:simpleType name="comType">
  <xs:restriction base="EmailType">
    <xs:pattern value=".+\.com" />
  </xs:restriction>
</xs:simpleType>

This says that strings of type comType have to match the pattern
defined by EmailType *and* the pattern defined here -- they have to
end in .com.
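
If the real worry is having to list every extension, one option is a
single derived type whose pattern uses alternation. This is only a
sketch: the name commonTldType is made up for illustration, and the
extensions are just the ones from your message:

<xs:simpleType name="commonTldType">
  <!-- hypothetical name; extend the alternation as needed -->
  <xs:restriction base="EmailType">
    <xs:pattern value=".+\.(com|org|edu|gov|net)" />
  </xs:restriction>
</xs:simpleType>

As with comType, a value has to satisfy both the EmailType pattern and
this one.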

> 2) let's talk about comType and notabcomType (not sure about the
> writing of notabcomType) but let's assume it carries the
> interpretation of : a@b.com is not accepted, please correct my
> syntax,

The problem is that you're using ^ in the regular expression. I know
that in lots of regular expression languages, ^ means the beginning of
the string, but patterns in XML Schema *always* match the *whole*
string, so the anchors ^ and $ aren't needed; outside a character
class they are just treated as literal characters. I don't think that
you need to escape the @ either -- it's not a significant character,
so you should just use:

  <xsd:simpleType name="notabcomType">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="a@b\.com"/>
    </xsd:restriction>
  </xsd:simpleType>

> What is the syntax (if any) of comType union orgType but
> notabcomType, The one I am struggling with is the "but".

There's no way to create a simple type that is a union of several
things but not something else, nor is there an anti-pattern facet
to say that the type shouldn't match a particular pattern. So what you
have to do is create a pattern that matches everything aside from the
thing that you don't want it to be... I guess that your 'a@b.com' is a
sample rather than the actual thing you want to match, so I won't try
to put together a pattern for that, but I think it will be rather
complicated...
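
The union half on its own is straightforward. As a sketch, assuming
orgType is derived from EmailType in the same way as comType
(comOrOrgType is a name I've made up for illustration):

  <xsd:simpleType name="orgType">
    <xsd:restriction base="EmailType">
      <xsd:pattern value=".+\.org"/>
    </xsd:restriction>
  </xsd:simpleType>

  <!-- hypothetical name: accepts anything valid as comType or orgType -->
  <xsd:simpleType name="comOrOrgType">
    <xsd:union memberTypes="comType orgType"/>
  </xsd:simpleType>

A value is valid against the union if it is valid against either
member type; it's only the "but not" part that can't be expressed
this way.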

> If you write the way I did in the example modulo the syntax issue,
> the system has no way of finding that the semantic interpretation of
> notabcomType is inconsistent with the semantic interpretation
> of comType. Am I right?

Yes, that's right. Patterns are checked against instance values, not
against each other for consistency.

> 3) Do you see a better way to do what I want to do?

Possibly it would be better to constrain the basic form of the email
address as you are doing, but to check that it isn't a specific
address at a different layer in the validation, for example through
Schematron, where it's very easy to test these kinds of ad-hoc
constraints:

  <sch:report test=". = 'a@b.com'">
    The email address should not be 'a@b.com'.
  </sch:report>
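
To show where that report would sit, here is a minimal sketch of a
surrounding schema. I'm assuming the address is held in an element
called email (a hypothetical name) and that the sch prefix is bound to
the ISO Schematron namespace; older Schematron 1.5 schemas use a
different namespace URI, so adjust to whatever your processor expects:

  <sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron">
    <sch:pattern id="email-checks">
      <!-- 'email' is an assumed element name -->
      <sch:rule context="email">
        <sch:report test=". = 'a@b.com'">
          The email address should not be 'a@b.com'.
        </sch:report>
      </sch:rule>
    </sch:pattern>
  </sch:schema>

A report fires when its test is true, which is why the test is the
condition you want to forbid rather than the one you want to hold.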

Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/

Received on Saturday, 4 May 2002 07:58:20 UTC