- From: Michael Kay <mike@saxonica.com>
- Date: Thu, 14 Jun 2007 16:53:00 +0100
- To: "'Paul Warren'" <pdw@decisionsoft.com>, <bud@syndafeed.com>
- Cc: <xmlschema-dev@w3.org>, <tools@decisionsoft.com>
There's a long history here, only a small part of which is captured at:

http://www.w3.org/Bugs/Public/show_bug.cgi?id=1889

In fact the current spec makes three statements within a few lines of each other, none of which agrees with the others, and there are no clues as to which one takes precedence:

(1) [17] charRange ::= seRange | XmlCharIncDash

which says that "-" is always a valid character-range

(2) The [, ], - and \ characters are not valid character ranges;

which says that "-" can't be a character range

(3) The - character is a valid character range only at the beginning or end of a ·positive character group·.

which says it's sometimes valid and sometimes isn't (and says it in a very odd way, because how do you know whether you're at the end of a positive character group, especially where subtraction is involved?)

In my current implementation in Saxon I decided to allow "-" anywhere within a character range, interpreting it as representing itself except in a context where it can be interpreted as a range operator [a-z] or a subtraction operator [\p{Lu}-[AEIOU]].

Users would be well-advised to steer clear of this and escape the "-" everywhere.

Michael Kay
http://www.saxonica.com/

> -----Original Message-----
> From: xmlschema-dev-request@w3.org
> [mailto:xmlschema-dev-request@w3.org] On Behalf Of Paul Warren
> Sent: 14 June 2007 15:43
> To: bud@syndafeed.com
> Cc: xmlschema-dev@w3.org; tools@decisionsoft.com
> Subject: Re: validator disparity
>
>
> Hi Bud,
>
> I should point out that our schema validation service is
> simply a web frontend to the Xerces-J Schema Validator (this
> used to be made clear on the web page, but it seems that
> notice has gone AWOL - I'll get that fixed).
>
> A quick look at the schema spec suggests that Xerces is wrong on this
> point:
>
> "The - character is a valid character range only at the
> beginning or end of a ·positive character group·."
>
> regards,
>
> Paul
>
>
> On 14 Jun 2007, at 15:26, Bud Hovell wrote:
>
> > Hi, folks ...
> >
> > I've run across a bit of a puzzle and thought I'd at least report it
> > for examination by others technically qualified.
> >
> > In the course of validating a test output file found at
> > http://www.amexpat.com/primeloc.xml, I discovered both it and the schema
> > validate without complaint on the W3C validator at
> > http://www.w3.org/2001/03/webdata/xsv, but the schema does not validate at
> > http://tools.decisionsoft.com/schemaValidate/, which offers the
> > following complaint:
> >
> > ================== OUTPUT =====================
> > XML Schema Validator
> >
> > Well Formed: VALID
> > Schema Validation: INVALID
> >
> > The following errors were found:
> > TYPE        LOC      MESSAGE
> > Validation  128, 38  InvalidRegex: Pattern value '[-0-9]*' is not a valid
> >                      regular expression. The reported error was: ''-' is an
> >                      invalid character range. Write '\-'.'.
> > Validation  134, 38  InvalidRegex: Pattern value '[-0-9+ ()]*' is not a
> >                      valid regular expression. The reported error was: ''-'
> >                      is an invalid character range. Write '\-'.'.
> >
> > ================ END OUTPUT ===============
> >
> > ... evidently because the non-range-denoting "-" character is shown in
> > the first position of the pattern match in brackets rather than last.
> > I'm not acquainted with the specific rules for schema validation, but
> > seem to recall that most regex matching rules DO require a literal
> > naked dash to be mentioned last. In this case, the parser evidently
> > wants to see it backslashed so it is understood to denote a literal
> > rather than a range.
> >
> > This is the text of the two relevant blocks in the 2007-05-21 schema
> > file (attached in full) which I received from the provider and have
> > input for testing at DecisionSoft:
> >
> > <xsd:simpleType name="integerOrNull_Type">
> >   <xsd:restriction base="xsd:string">
> >     <xsd:pattern value="[-0-9]*"/>
> >   </xsd:restriction>
> > </xsd:simpleType>
> > <!-- Telephone can contains numbers, spaces, brackets, +'s and -'s /-->
> > <xsd:simpleType name="telephoneNumber_Type">
> >   <xsd:restriction base="xsd:string">
> >     <xsd:pattern value="[-0-9+ ()]*"/>
> >   </xsd:restriction>
> > </xsd:simpleType>
> >
> > ... which shows no evidence of a backslash to protect the literal
> > dash.
> >
> > These two parsers offer conflicting results given identical input.
> > While I'm agnostic as to which may be judged correct, they should at
> > least agree even if both are in error. :)
> >
> > I'm jointly addressing this to the W3C team and the folks over at
> > DecisionSoft in hope this disparity may be resolved.
> >
> > Best regards,
> > --
> > Bud Hovell
> > bud@syndafeed.com
> > http://www.syndafeed.com
> >
> > <?xml version="1.0" encoding="utf-8"?>
> > <!-- edited with XMLSpy v2006 rel.
> > 3 sp1 (http://www.altova.com) by Andy Dawkins (Primelocation) -->
> > <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
> >   <xsd:annotation>
> >     <xsd:documentation xml:lang="en">
> >       PrimeLocation.com FastcropX1 data schema - Last Update 2007-05-21
> >     </xsd:documentation>
> >   </xsd:annotation>
> >   <xsd:element name="root" type="root_Type"/>
> >   <xsd:complexType name="root_Type">
> >     <xsd:sequence>
> >       <xsd:element name="agentGroup" type="agentGroup_Type" minOccurs="0" maxOccurs="unbounded"/>
> >     </xsd:sequence>
> >   </xsd:complexType>
> >   <xsd:complexType name="agentGroup_Type">
> >     <xsd:sequence>
> >       <xsd:element name="mode" type="agentGroupMode_Type" default="FULL"/>
> >       <xsd:element name="exportDate" type="xsd:dateTime" minOccurs="0"/>
> >       <!-- Not madatory but useful for debugging -->
> >       <xsd:element name="agentBranch" type="agentBranch_Type" minOccurs="0" maxOccurs="unbounded"/>
> >     </xsd:sequence>
> >     <xsd:attribute name="code" type="xsd:string" use="required"/>
> >   </xsd:complexType>
> >   <xsd:complexType name="agentBranch_Type">
> >     <xsd:sequence>
> >       <xsd:element name="property" type="property_Type" minOccurs="0" maxOccurs="unbounded"/>
> >     </xsd:sequence>
> >     <xsd:attribute name="code" type="xsd:string" use="required"/>
> >   </xsd:complexType>
> >   <xsd:complexType name="property_Type">
> >     <xsd:choice>
> >       <xsd:sequence>
> >         <!-- Property Address Details /-->
> >         <xsd:element name="fullPostCode" type="xsd:string"/>
> >         <xsd:element name="countryCode" type="countryCode_Type" default="GB" minOccurs="0"/>
> >         <xsd:element name="name" type="xsd:string"/>
> >         <xsd:element name="address" type="xsd:string"/>
> >         <xsd:element name="regionCode" type="xsd:string" minOccurs="0"/>
> >         <!-- Property Description /-->
> >         <xsd:element name="summary" type="xsd:string" minOccurs="0"/>
> >         <xsd:element name="details" type="xsd:string" minOccurs="0"/>
> >         <!-- Property Price Information /-->
> >         <xsd:element name="pricePrefix" type="pricePrefix_Type"/>
> >         <xsd:element name="price" type="integerRange_Type"/>
> >         <xsd:element name="priceCurrency" type="priceCurrency_Type" default="GBP" minOccurs="0"/>
> >         <!-- Property sale specifics /-->
> >         <xsd:element name="sellingState" type="sellingState_Type"/>
> >         <xsd:element name="propertyType" type="propertyType_Type"/>
> >         <xsd:element name="newHome" type="xsd:string" minOccurs="0"/>
> >         <xsd:element name="saleOrRent" type="saleOrRent_Type"/>
> >         <xsd:element name="sharedCommission" type="xsd:string" minOccurs="0"/>
> >         <!-- Rental Information /-->
> >         <xsd:element name="groundRent" type="xsd:decimal" minOccurs="0"/>
> >         <!-- Value in GBP per annum /-->
> >         <xsd:element name="serviceCharge" type="xsd:decimal" minOccurs="0"/>
> >         <!-- Value in GBP per annum /-->
> >         <xsd:element name="furnished" type="xsd:boolean" minOccurs="0"/>
> >         <xsd:element name="rentalLength" type="xsd:int" minOccurs="0"/>
> >         <!-- Tenure Information /-->
> >         <xsd:element name="tenure" type="tenure_Type" default="" minOccurs="0"/>
> >         <xsd:element name="leaseholdYearsRemaining" type="integerOrNull_Type" minOccurs="0"/>
> >         <!-- Property Room Information /-->
> >         <xsd:element name="bedrooms" type="integerRange_Type"/>
> >         <xsd:element name="bathrooms" type="integerRange_Type"/>
> >         <xsd:element name="receptionRooms" type="integerRange_Type"/>
> >         <!-- Property Images, Supported types: JPG, PNG, GIF /-->
> >         <xsd:element name="mainImage" type="asset_Type" minOccurs="0"/>
> >         <!-- The file name of the image /-->
> >         <xsd:element name="additionalImage1" type="asset_Type" minOccurs="0"/>
> >         <xsd:element name="additionalImage2" type="asset_Type" minOccurs="0"/>
> >         <xsd:element name="additionalImage3" type="asset_Type" minOccurs="0"/>
> >         <xsd:element name="additionalImage4" type="asset_Type" minOccurs="0"/>
> >         <!-- Floorplans, Up to four images ( JPG, PNG, GIF ) OR a single PDF /-->
> >         <xsd:element name="floorPlan1" type="asset_Type" minOccurs="0"/>
> >         <!-- The file name of the image /-->
> >         <xsd:element name="floorPlan2" type="asset_Type" minOccurs="0"/>
> >         <xsd:element name="floorPlan3" type="asset_Type" minOccurs="0"/>
> >         <xsd:element name="floorPlan4" type="asset_Type" minOccurs="0"/>
> >         <!-- Brochure, A single PDF /-->
> >         <xsd:element name="brochure" type="asset_Type" minOccurs="0"/>
> >         <!-- The file name of the pdf /-->
> >         <!-- Virtual Tour -->
> >         <xsd:element name="vTourURL" type="xsd:string" minOccurs="0"/>
> >         <!-- URL to a virtual Tour -->
> >         <!-- Virtual Tour -->
> >         <xsd:element name="vTour2URL" type="xsd:string" minOccurs="0"/>
> >         <!-- URL to a virtual Tour -->
> >         <!-- HIP Document -->
> >         <xsd:element name="HIPDocument" type="asset_Type" minOccurs="0"/>
> >         <!-- Filename or URL to an HIP Document -->
> >         <!-- EPC Document -->
> >         <xsd:element name="EPCDocument" type="asset_Type" minOccurs="0"/>
> >         <!-- Filename or URL to an EPC Document -->
> >         <!-- Energy Efficiency Ratings -->
> >         <xsd:element name="EERImage" type="asset_Type" minOccurs="0"/>
> >         <xsd:element name="EERCurrent" type="xsd:integer" minOccurs="0"/>
> >         <xsd:element name="EERPotential" type="xsd:integer" minOccurs="0"/>
> >         <!-- Environment Impact Ratings -->
> >         <xsd:element name="EIRImage" type="asset_Type" minOccurs="0"/>
> >         <xsd:element name="EIRCurrent" type="xsd:integer" minOccurs="0"/>
> >         <xsd:element name="EIRPotential" type="xsd:integer" minOccurs="0"/>
> >         <!-- Optional Contact Information. If provided will be used
> >              instead of contact information of the agent branch -->
> >         <xsd:element name="contactName" type="xsd:string" minOccurs="0"/>
> >         <xsd:element name="contactTelephone" type="telephoneNumber_Type" minOccurs="0"/>
> >         <xsd:element name="contactEmail" type="xsd:string" minOccurs="0"/>
> >         <!-- Additional Record Information /-->
> >         <xsd:element name="createdDate" type="xsd:dateTime" minOccurs="0"/>
> >         <xsd:element name="modifiedDate" type="xsd:dateTime" minOccurs="0"/>
> >         <xsd:element name="additionalKeywords" type="xsd:string" minOccurs="0"/>
> >         <xsd:element name="notes" type="xsd:string" minOccurs="0"/>
> >       </xsd:sequence>
> >       <xsd:sequence>
> >         <xsd:element name="delete" type="xsd:string" default="1" minOccurs="0"/>
> >       </xsd:sequence>
> >     </xsd:choice>
> >     <xsd:attribute name="propertyID" type="xsd:string" use="required"/>
> >   </xsd:complexType>
> >   <xsd:complexType name="asset_Type">
> >     <xsd:simpleContent>
> >       <xsd:extension base="xsd:string">
> >         <xsd:attribute name="modifiedDate" type="xsd:dateTime" use="optional"/>
> >       </xsd:extension>
> >     </xsd:simpleContent>
> >   </xsd:complexType>
> >   <!-- countryCode is always 2 alpha characters /-->
> >   <xsd:simpleType name="countryCode_Type">
> >     <xsd:restriction base="xsd:string">
> >       <xsd:pattern value="[A-Za-z]{2}"/>
> >     </xsd:restriction>
> >   </xsd:simpleType>
> >   <!-- priceCurrency is always 3 alpha characters /-->
> >   <xsd:simpleType name="priceCurrency_Type">
> >     <xsd:restriction base="xsd:string">
> >       <xsd:pattern value="[A-Za-z]{3}"/>
> >     </xsd:restriction>
> >   </xsd:simpleType>
> >   <!-- price,bedrooms,bathrooms, etc
> >        can be a string representation of an integer
> >        or an integer range of two integers seperated by ' TO ' or ' - ' /-->
> >   <xsd:simpleType name="integerRange_Type">
> >     <xsd:restriction base="xsd:string">
> >       <xsd:pattern value="([0-9]* ?(TO|-) ?[0-9]*|[0-9]*)"/>
> >     </xsd:restriction>
> >   </xsd:simpleType>
> >   <xsd:simpleType name="integerOrNull_Type">
> >     <xsd:restriction base="xsd:string">
> >       <xsd:pattern value="[-0-9]*"/>
> >     </xsd:restriction>
> >   </xsd:simpleType>
> >   <!-- Telephone can contains numbers, spaces, brackets, +'s and -'s /-->
> >   <xsd:simpleType name="telephoneNumber_Type">
> >     <xsd:restriction base="xsd:string">
> >       <xsd:pattern value="[-0-9+ ()]*"/>
> >     </xsd:restriction>
> >   </xsd:simpleType>
> >   <!-- agentGroupMode has a set list of possible values /-->
> >   <xsd:simpleType name="agentGroupMode_Type">
> >     <xsd:restriction base="xsd:string">
> >       <xsd:enumeration value="FULL"/>
> >       <xsd:enumeration value="INCR"/>
> >       <!-- Full /-->
> >       <!-- Incremental /-->
> >     </xsd:restriction>
> >   </xsd:simpleType>
> >   <!-- pricePrefix has a set list of possible values /-->
> >   <xsd:simpleType name="pricePrefix_Type">
> >     <xsd:restriction base="xsd:string">
> >       <xsd:enumeration value="F"/>
> >       <xsd:enumeration value="I"/>
> >       <xsd:enumeration value="O"/>
> >       <xsd:enumeration value="A"/>
> >       <xsd:enumeration value="S"/>
> >       <xsd:enumeration value="R"/>
> >       <xsd:enumeration value="B"/>
> >       <xsd:enumeration value="G"/>
> >       <xsd:enumeration value="P"/>
> >       <xsd:enumeration value="W"/>
> >       <xsd:enumeration value="M"/>
> >       <xsd:enumeration value="N"/>
> >       <!-- Asking price of /-->
> >       <!-- Offers in the region of /-->
> >       <!-- Offers in excess of /-->
> >       <!-- Auction guild price of /-->
> >       <!-- Subject to contract /-->
> >       <!-- Price range of /-->
> >       <!-- Prices from /-->
> >       <!-- Guide price /-->
> >       <!-- Price on Application /-->
> >       <!-- Weekly rental of /-->
> >       <!-- Monthly rental of /-->
> >       <!-- Annual rental of /-->
> >     </xsd:restriction>
> >   </xsd:simpleType>
> >   <!-- sellingState has a set list of possible values /-->
> >   <xsd:simpleType name="sellingState_Type">
> >     <xsd:restriction base="xsd:string">
> >       <xsd:enumeration value="V"/>
> >       <xsd:enumeration value="U"/>
> >       <xsd:enumeration value="H"/>
> >       <xsd:enumeration value="N"/>
> >       <xsd:enumeration value="S"/>
> >       <xsd:enumeration value="L"/>
> >       <!-- Viewing /-->
> >       <!-- Under offer /-->
> >       <!-- Hidden /-->
> >       <!-- New Instruction /-->
> >       <!-- Sold /-->
> >       <!-- Let /-->
> >     </xsd:restriction>
> >   </xsd:simpleType>
> >   <!-- propertyType has a set list of possible values /-->
> >   <xsd:simpleType name="propertyType_Type">
> >     <xsd:restriction base="xsd:string">
> >       <xsd:enumeration value="H"/>
> >       <xsd:enumeration value="F"/>
> >       <xsd:enumeration value="A"/>
> >       <!-- House /-->
> >       <!-- Flat /-->
> >       <!-- Agricultural /-->
> >     </xsd:restriction>
> >   </xsd:simpleType>
> >   <!-- saleOrRent has a set list of possible values /-->
> >   <xsd:simpleType name="saleOrRent_Type">
> >     <xsd:restriction base="xsd:string">
> >       <xsd:enumeration value="S"/>
> >       <xsd:enumeration value="R"/>
> >       <!-- Sale /-->
> >       <!-- Rent /-->
> >     </xsd:restriction>
> >   </xsd:simpleType>
> >   <!-- tenure has a set list of possible values /-->
> >   <xsd:simpleType name="tenure_Type">
> >     <xsd:restriction base="xsd:string">
> >       <xsd:enumeration value="F"/>
> >       <xsd:enumeration value="S"/>
> >       <xsd:enumeration value="L"/>
> >       <xsd:enumeration value="X"/>
> >       <xsd:enumeration value=""/>
> >       <!-- Freehold /-->
> >       <!-- Share of freehold /-->
> >       <!-- Leasehold /-->
> >       <!-- Not Specified /-->
> >       <!-- Not Specified /-->
> >     </xsd:restriction>
> >   </xsd:simpleType>
> > </xsd:schema>
>
> --
> CTO, DecisionSoft Limited
> +44 1865 203192 / +44 7968 408138
>
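
To make the advice at the top of this message concrete, here is a minimal sketch of the two patterns from the quoted schema with the hyphen escaped, which is the form the Xerces error message asks for ("Write '\-'"). The enclosing xsd:schema element is added only so that the fragment stands alone. Nothing here settles which of the three spec statements takes precedence; the escaped form simply avoids depending on that answer.

```xml
<?xml version="1.0" encoding="utf-8"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- Same type names as in the quoted schema; only the pattern values change. -->
  <xsd:simpleType name="integerOrNull_Type">
    <xsd:restriction base="xsd:string">
      <!-- "\-" is a literal hyphen, so its position in the group no longer matters -->
      <xsd:pattern value="[\-0-9]*"/>
    </xsd:restriction>
  </xsd:simpleType>
  <!-- Telephone: digits, spaces, brackets, plus signs and hyphens -->
  <xsd:simpleType name="telephoneNumber_Type">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="[\-0-9+ ()]*"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
```

The escaped patterns are intended to accept the same strings as the originals (digits and hyphens in the first case; digits, hyphens, plus signs, spaces and parentheses in the second), so instance documents should not need to change.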
Received on Thursday, 14 June 2007 15:53:32 UTC